Sunday, October 28, 2007

DNA Testing and Genealogy - A Dubious Match Made in Cyberspace??

Oh, dear. I read this article almost with a sense of disbelief but people can be quite revealing when it comes to their personal passions! After all of the "boogey man" stories about how nefarious government and corporate agencies (such as insurance companies) would use DNA databases to oppress and discriminate against members of society, people are willingly jumping on the DNA bandwagon in droves because of their passion for genealogy.

"As the internet became an every day tool for millions of people, it changed the way family historians do research. The availability of on-line, easily accessible genealogy and historical information has fueled the phenomenal growth of Genealogy as a hobby... Everywhere we look we see genealogy reported as the fastest growing hobby in the country. Now the internet is the first stop for beginning family historians and is used extensively by experienced researchers." - The USGenWeb Project.

New York Times excerpt: "Men can get a lot more out of DNA testing because they inherit both an X and a Y chromosome, enabling them to identify their paternal haplogroup and easily trace the history of paternal surnames. But women can only identify their maternal haplogroup, unless they use a sample from a close male relative like a brother or father. (Hey, Thanksgiving is coming up!)

Other sites — such as and — offer similar DNA tests that are a reliable way to reveal broad facts like, say, whether you have Native American or Asian ancestors. But scientists warn that it is best to shy away from sites that promise to pinpoint a specific region or country of origin, because in most cases the tests can’t uncover such specific details.

I chose because it already has a user base of 15 million, more than 3 million of whom have posted their searchable family trees at the site. So I’m counting on the network effect.

The more’s users who have their DNA tested, the more results there will be to compare to mine. Genetic matches will be posted on my results page — and then I will be able to e-mail like-minded historians to ask for more help solving the family-tree puzzle — so with luck I won’t have to be a detective alone for long.

Or, as Megan Smolenyak, a spokeswoman for, explained it: “It’s basically a matchmaking game. You get a pile of numbers. I get a pile of numbers. And if they match, those people can become research buddies.”

For now, the database is small, comprising as of last week only 6,500 results from previous tests. So I didn’t really expect to find a long-lost cousin.

At this point, identifying one’s haplogroup, and thus your ancestors’ gradual migration, is the main benefit of the tests (which at cost from $144 to $199)." - Marie Antoinette, Is That You? By MICHELLE SLATALLA.

I see Oxford Ancestors charges from $180 for a single maternal or paternal line analysis to $370 for ONE MatriLine™ analysis, ONE MatriMap™ print, ONE Y-Clan™ analysis and ONE PatriMap™ print. You can also order a smörgåsbord of other options, including a "Tribes of Britain" analysis for an extra $25. Hmmm....this looks like it could generate some serious revenue!

National Geographic's "Genographic Project" offers participants a kit for $99.95 that includes:

• Buccal swab kit
• Multimedia DVD
• Exclusive National Geographic Genographic Map
• "Quick Start" card
• Genographic Project Brochure
• Self-addressed envelope
• Confidential Genographic Project ID Number (GPID)

However, National Geographic says their results are not genealogical in nature:

"We run ONE test per participation kit. We will test either your mitochondrial DNA, which is passed down each generation from mother to child and reveals your direct maternal ancestry; or your Y chromosome (males only), which is passed down from father to son and reveals your direct paternal ancestry. You choose which test you would like administered.

What to Expect
Your results will reveal your deep ancestry along a single line of direct descent (paternal or maternal) and show the migration paths they followed thousands of years ago. Your results will also place you on a particular branch of the human family tree. Some anthropological stories are more detailed than others, depending upon the lineage you belong to. For example, if you are of African descent, your results will show the initial movements of your ancestors on the African continent, but will not reflect most of the migrations that have occurred within the past 10,000 years. Your individual results may confirm your expectations of what you believe your deep ancestry to be, or you may be surprised to learn a new story about your genetic background.

You will not receive a percentage breakdown of your genetic background by ethnicity, race, or geographic origin. Nor will you receive confirmation of an association with a particular tribe or ethnic group.

Furthermore, this is not a genealogy study. You will not learn about your great-grandparents or other recent relatives, and your DNA trail will not necessarily lead to your present-day location. Rather, your results will reveal the anthropological story of your direct maternal or paternal ancestors—where they lived and how they migrated around the world many thousands of years ago." - National Geographic Genographic Project.

However, they mention that as research progresses you may be able to obtain more detail:

"Remember, your initial results are just the beginning. They are based on current science and may become more detailed and refined as the ongoing field research yields new information. Be sure to visit this Web site often to follow along as we post new findings and automatically update your results."

National Geographic assigns an anonymous ID number to your record in the database to protect your privacy:

"To ensure the privacy of participants, we have built an anonymous analysis process. Your Participation Kit will be mailed with a randomly-generated, non-sequential Genographic Participant ID number (GPID). Although we will have mailed a Participation Kit to your address, we do not know the random code included in the Kit. When you send in your DNA sample with your consent form, they will only be identified by your GPID. Therefore, your cheek cells will be analyzed completely anonymously."

Since most of us working with database technologies know that somewhere there is usually a crosswalk file to match anonymous IDs with real user identification, I wondered whether this is true of the Genographic kit. Normally, if subsequent login access to a file is supported, this is the case. But the Genographic Project has addressed this issue:

"The kit contains a password for access to the Genographic Project participant web page. YOU MUST RETAIN THIS PASSWORD IN ORDER TO ACCESS YOUR GENETIC MIGRATORY PROFILE. To protect your privacy, National Geographic does not associate any personally identifiable information about you with this randomly assigned password, and if you lose this password we cannot recover it for you or provide you with any other means of accessing the results of your participation."
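For readers who don't work with databases, the difference between the two designs can be shown in a few lines of Python. This is only a minimal sketch under my own assumptions: the function names, the ID format, and the in-memory tables are all hypothetical, and it is not a description of how National Geographic actually builds its system. The first function keeps a crosswalk entry linking the ID to a person. The second hands out a random, non-sequential ID and records nothing about who received it.

```python
import secrets

# Hypothetical crosswalk table: anonymous ID -> real-world identity.
crosswalk = {}

# Hypothetical results table, keyed by the anonymous ID only.
results = {}


def new_gpid(nbytes=8):
    """Return a random, non-sequential ID as 16 hex characters (assumed format)."""
    return secrets.token_hex(nbytes).upper()


def issue_kit_with_crosswalk(name):
    """Design A: the ID can later be traced back to the person."""
    gpid = new_gpid()
    crosswalk[gpid] = name  # this link is what makes re-identification possible
    return gpid


def issue_kit_anonymous(name):
    """Design B: the name is used only to address the envelope.

    Nothing is stored that ties the ID to the recipient, so a lost ID
    cannot be recovered and cannot be looked up by name.
    """
    return new_gpid()


def store_result(gpid, haplogroup):
    """Record a test result under the anonymous ID alone."""
    results[gpid] = haplogroup


def lookup_result(gpid):
    """Return the stored result for an ID, or None if there is none."""
    return results.get(gpid)
```

Under Design B, anyone holding the ID can read the result with `lookup_result`, but there is no table to search by name, which is why a lost ID or password cannot be recovered.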

So, the only other issue would be if National Geographic could be compelled to track this information by some clandestine order of the NSA or other security agency given "special powers" by our current nosey regime.

However, is not following any type of human subject protocol:

" will not share your testing results with other organizations without your consent. In addition, as with all user submitted content gives you control over your privacy settings that determine whether your information is public or anonymous."

In other words, they have a database connecting DNA results with individual user identification.

Oxford Ancestors' tracking system is similar:

"Oxford Ancestors will not use your DNA for any other purpose than for the services you have requested. Your results will be disclosed only to you, unless you specifically instruct us otherwise, and your DNA will be destroyed after your results have been despatched."

Family Tree DNA assigns a kit ID number but is obviously maintaining a crosswalk file between ID numbers and personal contact information:

"Family Tree DNA follows stringent policies for protecting your privacy according to state legislation guidelines. We control the Surname Database Library and genetic testing scores. Both the University of Arizona testing lab and our Genomics Research Center follow strict guidelines to ensure your privacy is maintained. Only limited information is shared with the testing facility.

Family Tree DNA accepts the responsibility to keep your specific data private, at the same time, making enough general information public to allow us to build a Surname Database library to be used for genealogical purposes."

They also offer participation in special projects that begin to share your data with others:

"Family Tree DNA also provides the option to participate in a group project in order to try to learn more by working with others who may share similar ancestry. If you choose to participate in a project the group administrator will be able to view your results and contact information so that he or she may best help members of the project learn about their ancestry. So that members can share information more easily a public website displaying member results is often created. The free website that Family Tree DNA provides to projects allows results to be listed by kit number, computer generated number, oldest known ancestor, or surname. It does not list personally identifying information. You may join or leave projects at any time after your results are posted at no charge. You can view a list of projects here.

Your unique test kit number will accompany your collection tube to the testing lab. The computer-generated number and your surname is the only information about you that the testing facility will see. Once your test has been completed the results of the Y-DNA or mtDNA test will be entered in a secure database. A comparison between your specific genetic results and those of others in the database will then be performed.

If a genetic match is found between you and another person in the database and you have each signed the release form you will be informed via email.

If a genetic match is found between you and another individual who enters the library at some time in the future, both will be given the information that a potential match is in the database provided that BOTH of you have signed the release form. Only if both parties agree will contact information concerning the separate parties be made available to the other party. In this way, all persons in the database will have the right to decide if they want to contact their genetic match(es).

Privacy and confidentiality will be strictly maintained." - Family Tree DNA.
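The matching rule described above can be sketched in Python. This is a rough illustration under my own assumptions, not Family Tree DNA's actual code: the `Participant` fields, the marker values, and the exact-match threshold are all hypothetical, and I have collapsed the notification and contact-sharing steps into a single check on the release form.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Participant:
    kit_number: str
    markers: Tuple[int, ...]  # hypothetical Y-DNA or mtDNA marker values
    email: str
    signed_release: bool


def is_match(a, b, max_mismatches=0):
    """Compare two 'piles of numbers' position by position."""
    if len(a.markers) != len(b.markers):
        return False
    mismatches = sum(x != y for x, y in zip(a.markers, b.markers))
    return mismatches <= max_mismatches


def contacts_to_share(a, b) -> Dict[str, str]:
    """Return {kit_number: the other party's email}, or {} if nothing may be shared.

    Contact details are released only when the results match AND both
    parties have signed the release form.
    """
    if not is_match(a, b):
        return {}
    if not (a.signed_release and b.signed_release):
        return {}
    return {a.kit_number: b.email, b.kit_number: a.email}
```

Note that the contact information has to sit in the same database as the kit numbers for this to work at all, which is exactly the crosswalk I was worried about.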

From a security standpoint, two things jump out at me. The first is the reference to state legislation. We all know the federal government can trample on state law at will, and has done so. The second thing is the "release form". Obviously, signing it begins to obviate the other security assurances.

I am beginning to sound like one of my more paranoid colleagues, but I'm concerned that people will treat the provision of this very sensitive information as lightly as they treat signing up for an account on many other Web 2.0 social networking sites. If we didn't currently live under such an oppressive regime, one that has demonstrated its willingness to ignore personal rights to privacy, I probably would be at least a little less nervous. But in the existing environment, where both government and corporate espionage against our own people is sanctioned as "necessary for Homeland Security", I fear the worst use of this information will be made.

"Way back" in 1997, a movie was released entitled "Gattaca". Unfortunately, it passed through local theaters all too briefly. Perhaps it was released just a little before its time.

"Gattaca Corp. is an aerospace firm in the future. This future society analyzes your DNA at birth and, based upon your projected life expectancy and disease likelihood, determines where you will be assigned in the social order. Ethan Hawke's character, Vincent, conceived naturally without the accepted clinical "quality control", was born with a 95% chance of developing a heart condition - at least according to his DNA sequence - which has relegated him to the society's trash bin. These individuals are assigned menial tasks such as janitorial work that do not require the society's output of resources for education and training. His dream, however, is to explore space.

In desperation, Vincent dives into the world of a genetic black market, assuming the identity of Jerome (played by Jude Law), a physically spectacular athlete who has had the misfortune to be crippled in an accident. By using samples of Jerome's hair, skin, blood and urine, Vincent is accepted by the Gattaca Corporation and selected for a manned mission to Saturn. As Vincent trains for his lifelong dream assignment, he must constantly pass gene tests each day. He must avoid detection by meticulous hygiene to avoid leaving any of the thousands of cells our bodies shed each day behind at his worksite that could be picked up by corporate security sweeps. Then, one day, an errant eyelash escapes him, and it is found when a mission director is killed and police sweep the scene for evidence. Of course they are sure the miscreant who left the eyelash is the guilty party and to make matters worse, the investigative team is led by Vincent's genetically superior brother. As extra security measures are implemented in the search for the "impostor", Vincent's position becomes more and more precarious. I won't provide a spoiler. I would urge you to rent or buy the DVD to learn what happens to Vincent and his "perfect" world.

Thursday, October 04, 2007

Berkeley Offers Full Lectures on YouTube

UC Berkeley must have worked out an "iTunes-U" type of agreement with Google and is now featuring its lectures on YouTube. The university must have been granted special provider status, though, to get around the current 10-minute video length limit. Anyone can upload longer videos to Google Video, but YouTube really has the name awareness and high traffic volume needed to ensure widespread exposure. What is particularly nice about having the lectures on YouTube as opposed to iTunes-U is that you can link directly to the videos from other web resources or embed them inside webpages. With iTunes, you can provide only a link to Apple's introductory iTunes-U site, where visitors click on a link to launch the iTunes Store to browse and find the videos.

"YouTube is now an important teaching tool at UC Berkeley.

The school announced on Wednesday that it has begun posting entire course lectures on the Web's No.1 video-sharing site.

Berkeley officials claimed in a statement that the university is the first to make full course lectures available on YouTube. The school said that over 300 hours of videotaped courses will be available at

Berkeley said it will continue to expand the offering. The topics of study found on YouTube included chemistry, physics, biology and even a lecture on search-engine technology given in 2005 by Google cofounder Sergey Brin.

"UC Berkeley on YouTube will provide a public window into university life, academics, events and athletics, which will build on our rich tradition of open educational content for the larger community," said Christina Maslach, UC Berkeley's vice provost for undergraduate education in a statement."