Just six years ago, two draft versions of the human genome were published, an achievement widely hailed as one of the most audacious scientific undertakings in history. Both of these versions are composite sequences derived from the haploid genomes, the single set of 23 chromosomes packaged into the sperm or egg of each parent, of (mostly) anonymous donors. But now, one of the principals behind the private human genome initiative has taken the next logical, albeit risky, step: sequencing his own genome. J. Craig Venter, whose technical innovations at Celera helped complete the draft sequences far ahead of schedule, published his entire genome in collaboration with 30 colleagues in this issue of PLoS Biology.
By placing his genome in the public domain, Venter runs the risk of divulging intimate personal details, including any current and future genetic markers for disease, a risk that extends to his family. He has done so, in part, to stimulate efforts to develop cheaper sequencing technology and usher in a new era of individualized genomic medicine. Venter has been working with the X Prize Foundation, which has promised US$10 million to the first person who can sequence a genome for US$1,000. Having the complete sequence of an individual human being will also allow scientists to ask different questions about the nature and origin of human genetic variation.
James Watson, the original director of the Human Genome Project and now chancellor of Cold Spring Harbor Laboratory, has also allowed his genome to be sequenced. He received a DVD documenting his personal sequence in a ceremony at Baylor College of Medicine in May 2007. (The report on his genome had not been published at press time.) With these sequences, scientists have a powerful tool for exploring the genetic contribution to human biology and disease risk. For example, the International HapMap Project maintains a catalog of common genetic variants (single-base variations called SNPs) among different populations, which has helped scientists identify gene variants (or alleles) associated with increased risk of diabetes and other complex diseases. (Neighboring SNPs that are inherited together are compiled into haplotypes, hence the name HapMap.)
But such studies can't detect rare disease-related alleles or those that reflect individual idiosyncrasies. What's more, it is the complex interactions between different alleles (contributed by each parent), their regulatory elements, and a person's environment that determine an individual's physical characteristics (also known as phenotype) and disease risk. Sequencing both sets of chromosomes (the diploid genome) from an individual who is willing to disclose relevant details about his or her personal life offers the opportunity to correlate genomic variations with a specific phenotype.
Toward this end, Venter donated his blood for DNA extraction and answered questions about his family and medical history, personality, and physical traits. To assemble Venter's genome (which they called HuRef), the researchers modified the random shotgun sequencing approach that Venter used to produce the draft human genome. Briefly, the shotgun sequencing approach randomly shreds genetic material into millions of fragments, each of which is sequenced to yield a "read"; a computer then reassembles the reads (based on sequence similarity) by matching up overlapping reads and merging them into longer sequences. By refining the software algorithms of the computer assembler (to respect the distinct paternal allelic contributions) and increasing the number of times they repeated the sequencing (to enhance data accuracy), the researchers decreased the number of gaps in the assembly to produce a high-quality draft diploid genome sequence. Assembling the sequences in the proper order and location along the chromosomes was guided in part by comparing the HuRef sequence with the composite human sequence assemblies.
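The overlap-and-merge idea behind shotgun assembly can be illustrated with a toy sketch. This is not the assembler used for HuRef: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all hypothetical, and the sketch assumes error-free reads from a single strand with no repeats, whereas a real assembler must cope with sequencing errors, repetitive DNA, and two parental copies of each chromosome.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    A toy greedy strategy: it stops when no remaining pair overlaps by
    at least `min_len` bases and returns whatever contigs are left.
    """
    # Drop exact duplicates (keeping order) and reads contained in others.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != o and r in o for o in contigs)]

    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# Three made-up overlapping reads from a 14-base sequence.
reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

When reads do not overlap, the sketch returns several separate contigs; the breaks between them correspond to the gaps in an assembly that additional sequencing passes help to close.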
To characterize the