Personal Genomics References
International Human Genome Sequencing Consortium (2004). "Finishing the euchromatic sequence of the human genome." Nature 431(7011): 931-45.
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
Anderson, S., A. T. Bankier, et al. (1981). "Sequence and organization of the human mitochondrial genome." Nature 290(5806): 457-65.
The complete sequence of the 16,569-base pair human mitochondrial genome is presented. The genes for the 12S and 16S rRNAs, 22 tRNAs, cytochrome c oxidase subunits I, II and III, ATPase subunit 6, cytochrome b and eight other predicted protein coding genes have been located. The sequence shows extreme economy in that the genes have none or only a few noncoding bases between them, and in many cases the termination codons are not coded in the DNA but are created post-transcriptionally by polyadenylation of the mRNAs.
Chan, E. Y. (2005). "Advances in sequencing technology." Mutat Res 573(1-2): 13-40.
Faster sequencing methods will undoubtedly lead to faster single nucleotide polymorphism (SNP) discovery. The Sanger method has served as the cornerstone for genome sequence production since 1977, close to almost 30 years of tremendous utility [Sanger, F., Nicklen, S., Coulson, A.R, DNA sequencing with chain-terminating inhibitors, Proc. Natl. Acad. Sci. U.S.A. 74 (1977) 5463-5467]. With the completion of the human genome sequence [Venter, J.C. et al., The sequence of the human genome, Science 291 (2001) 1304-1351; Lander, E.S. et al., Initial sequencing and analysis of the human genome, Nature 409 (2001) 860-921], there is now a focus on developing new sequencing methodologies that will enable "personal genomics", or the routine study of our individual genomes. Technologies that will lead us to this lofty goal are those that can provide improvements in three areas: read length, throughput, and cost. As progress is made in this field, large sections of genomes and then whole genomes of individuals will become increasingly more facile to sequence. SNP discovery efforts will be enhanced lock-step with these improvements. Here, the breadth of new sequencing approaches will be summarized including their status and prospects for enabling personal genomics.
Church, G. M. (2005). "The personal genome project." Mol Syst Biol 1: 2005.0030.
Gupta, P. K. (2008). "Single-molecule DNA sequencing technologies for future genomics research." Trends Biotechnol 26(11): 602-11.
During the current genomics revolution, the genomes of a large number of living organisms have been fully sequenced. However, with the advent of new sequencing technologies, genomics research is now at the threshold of a second revolution. Several second-generation sequencing platforms became available in 2007, but a further revolution in DNA resequencing technologies is being witnessed in 2008, with the launch of the first single-molecule DNA sequencer (Helicos Biosciences), which has already been used to resequence the genome of the M13 virus. This review discusses several single-molecule sequencing technologies that are expected to become available during the next few years and explains how they might impact on genomics research.
Levy, S., G. Sutton, et al. (2007). "The diploid genome sequence of an individual human." PLoS Biol 5(10): e254.
Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels)(1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor, however they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
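As a reading aid, the 22% non-SNP figure in the abstract above can be reproduced from the event counts it lists. The sketch below is only an illustrative check, not part of the cited work: the dictionary keys and the function name `non_snp_fraction` are hypothetical, and it assumes that the segmental duplications and copy number variation regions (for which the abstract gives no counts) are left out of the tally.

```python
# Event counts as listed in the Levy et al. (2007) abstract.
# Assumption: segmental duplications and copy number variation regions
# are excluded, because the abstract gives no counts for them.
counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}


def non_snp_fraction(variant_counts):
    """Return the share of variant events that are not SNPs."""
    total = sum(variant_counts.values())
    non_snp = total - variant_counts["snp"]
    return non_snp / total


total_events = sum(counts.values())
fraction = non_snp_fraction(counts)
print(f"{total_events:,} events, {fraction:.1%} non-SNP")
```

Under these assumptions the listed counts sum to slightly more than 4.1 million events, consistent with the "more than 4.1 million DNA variants" stated in the abstract.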
Macrogen. Available from http://www.macrogen.co.kr/eng/macrogen/state.jsp.
Mardis, E. R. (2008). "The impact of next-generation sequencing technology on genetics." Trends Genet 24(3): 133-41.
If one accepts that the fundamental pursuit of genetics is to determine the genotypes that explain phenotypes, the meteoric increase of DNA sequence information applied toward that pursuit has nowhere to go but up. The recent introduction of instruments capable of producing millions of DNA sequence reads in a single run is rapidly changing the landscape of genetics, providing the ability to answer questions with heretofore unimaginable speed. These technologies will provide an inexpensive, genome-wide sequence readout as an endpoint to applications ranging from chromatin immunoprecipitation, mutation mapping and polymorphism discovery to noncoding RNA discovery. Here I survey next-generation sequencing technologies and consider how they can provide a more complete picture of how the genome shapes the organism.
Metzker, M. L. (2005). "Emerging technologies in DNA sequencing." Genome Res 15(12): 1767-76.
Demand for DNA sequence information has never been greater, yet current Sanger technology is too costly, time consuming, and labor intensive to meet this ongoing demand. Applications span numerous research interests, including sequence variation studies, comparative genomics and evolution, forensics, and diagnostic and applied therapeutics. Several emerging technologies show promise of delivering next-generation solutions for fast and affordable genome sequencing. In this review article, the DNA polymerase-dependent strategies of Sanger sequencing, single nucleotide addition, and cyclic reversible termination are discussed to highlight recent advances and potential challenges these technologies face in their development for ultrafast DNA sequencing.
Porreca, G. J., J. Shendure, et al. (2006). "Polony DNA sequencing." Curr Protoc Mol Biol Chapter 7: Unit 7.8.
Polony DNA sequencing provides an inexpensive, accurate, high-throughput way to resequence genomes of interest by comparison to a reference genome. Mate-paired in vitro shotgun genomic libraries are produced and clonally amplified on microbeads by emulsion PCR. These serve as templates for sequencing by fluorescent nonamer ligation reactions on a microscope slide. Each sequencing run results in millions of 26-bp reads that can be aligned to the reference genome, allowing the identification of differences between sequences.
Ring, H. Z., P. Y. Kwok, et al. (2006). "Human Variome Project: an international collaboration to catalogue human genetic variation." Pharmacogenomics 7(7): 969-72.
Sanger, F., G. M. Air, et al. (1977). "Nucleotide sequence of bacteriophage phi X174 DNA." Nature 265(5596): 687-95.
A DNA sequence for the genome of bacteriophage phi X174 of approximately 5,375 nucleotides has been determined using the rapid and simple 'plus and minus' method. The sequence identifies many of the features responsible for the production of the proteins of the nine known genes of the organism, including initiation and termination sites for the proteins and RNAs. Two pairs of genes are coded by the same region of DNA using different reading frames.
Shendure, J., R. D. Mitra, et al. (2004). "Advanced sequencing technologies: methods and goals." Nat Rev Genet 5(5): 335-44.
Wheeler, D. A., M. Srinivasan, et al. (2008). "The complete genome of an individual by massively parallel DNA sequencing." Nature 452(7189): 872-6.
The association of genetic variation with disease and drug response, and improvements in nucleic acid technologies, have given great optimism for the impact of 'genomic medicine'. However, the formidable size of the diploid human genome, approximately 6 gigabases, has prevented the routine application of sequencing methods to deciphering complete individual human genomes. To realize the full potential of genomics for human health, this limitation must be overcome. Here we report the DNA sequence of a diploid genome of a single individual, James D. Watson, sequenced to 7.4-fold redundancy in two months using massively parallel sequencing in picolitre-size reaction vessels. This sequence was completed in two months at approximately one-hundredth of the cost of traditional capillary electrophoresis methods. Comparison of the sequence to the reference genome led to the identification of 3.3 million single nucleotide polymorphisms, of which 10,654 cause amino-acid substitution within the coding sequence. In addition, we accurately identified small-scale (2-40,000 base pair (bp)) insertion and deletion polymorphism as well as copy number variation resulting in the large-scale gain and loss of chromosomal segments ranging from 26,000 to 1.5 million base pairs. Overall, these results agree well with recent results of sequencing of a single individual by traditional methods. However, in addition to being faster and significantly less expensive, this sequencing technology avoids the arbitrary loss of genomic sequences inherent in random shotgun sequencing by bacterial cloning because it amplifies DNA in a cell-free system. As a result, we further demonstrate the acquisition of novel human sequence, including novel genes not previously identified by traditional genomic sequencing. This is the first genome sequenced by next-generation technologies. Therefore it is a pilot for the future challenges of 'personalized genome sequencing'.