Ancient DNA of extinct species from the Pleistocene and Holocene has provided valuable evolutionary insights. However, these are largely restricted to mammals and high latitudes because DNA preservation in warm climates is typically poor. In the tropics and subtropics, non-avian reptiles constitute a significant part of the fauna, yet little is known about the genetics of the many extinct reptiles from tropical islands. We have reconstructed the near-complete mitochondrial genome of an extinct giant tortoise from the Bahamas (Chelonoidis alburyorum) using an approximately 1000-year-old humerus from a water-filled sinkhole (blue hole) on Great Abaco Island. Phylogenetic and molecular clock analyses place this extinct species as closely related to Galapagos (C. niger complex) and Chaco tortoises (C. chilensis), and provide evidence for repeated overseas dispersal in this tortoise group. The ancestors of extant Chelonoidis species arrived in South America from Africa only after the opening of the Atlantic Ocean and dispersed from there to the Caribbean and the Galapagos Islands. Our results also suggest that the anoxic, thermally buffered environment of blue holes may enhance DNA preservation, thus opening a window onto the evolution and population history of extinct tropical species, which would likely still exist without human impact.
Technological innovations such as next generation sequencing and DNA hybridisation enrichment have resulted in multi-fold increases in both the quantity of ancient DNA sequence data and the time depth for DNA retrieval. To date, over 30 ancient genomes have been sequenced, moving from 0.7x coverage (mammoth) in 2008 to more than 50x coverage (Neanderthal) in 2014. Studies of rapid evolutionary changes, such as the evolution and spread of pathogens and the genetic responses of hosts, or the genetics of domestication and climatic adaptation, are developing swiftly and the importance of palaeogenomics for investigating evolutionary processes during the last million years is likely to increase considerably. However, these new datasets require new methods of data processing and analysis, as well as conceptual changes in interpreting the results. In this review we highlight important areas of future technical and conceptual progress and discuss research topics in the rapidly growing field of palaeogenomics.
The future of ancient DNA
(2015)
High-throughput sequence data retrieved from ancient or other degraded samples has led to unprecedented insights into the evolutionary history of many species, but the analysis of such sequences also poses specific computational challenges. The most commonly used approach involves mapping sequence reads to a reference genome. However, this process becomes increasingly challenging with an elevated genetic distance between target and reference or with the presence of contaminant sequences with high sequence similarity to the target species. The evaluation and testing of mapping efficiency and stringency are thus paramount for the reliable identification and analysis of ancient sequences. In this paper, we present ‘TAPAS’ (Testing of Alignment Parameters for Ancient Samples), a computational tool that enables the systematic testing of mapping tools for ancient data by simulating sequence data reflecting the properties of an ancient dataset and performing test runs using the mapping software and parameter settings of interest. We showcase TAPAS by using it to assess and improve the mapping strategy for a degraded sample from a banded linsang (Prionodon linsang), for which no closely related reference is currently available. This enables a 1.8-fold increase in the number of mapped reads without sacrificing mapping specificity. The increase in mapped reads effectively reduces the need for additional sequencing, thus making more economical use of time, resources, and sample material.
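The general idea of testing mapping stringency on simulated ancient reads can be illustrated with a minimal sketch. It is not TAPAS itself, whose interface is not described in this abstract: every function name, the damage model (5' C-to-T substitutions with a decaying probability), and the exhaustive toy mapper standing in for real mapping software are assumptions made for illustration only.

```python
import random

def mutate(seq, rate, rng):
    """Substitute each base with probability `rate` (models target/reference divergence)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

def deaminate(read, rng, p_end=0.3, decay=0.5):
    """Apply C->T damage whose probability decays with distance from the 5' end."""
    out = list(read)
    for i, base in enumerate(out):
        if base == "C" and rng.random() < p_end * (decay ** i):
            out[i] = "T"
    return "".join(out)

def simulate_reads(genome, n, length, rng):
    """Draw n short fragments, returning (true_position, damaged_read) pairs."""
    reads = []
    for _ in range(n):
        pos = rng.randrange(len(genome) - length + 1)
        reads.append((pos, deaminate(genome[pos:pos + length], rng)))
    return reads

def map_read(read, reference, max_mismatches):
    """Toy exhaustive mapper: first best-hit position within the limit, else None."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = 0
        for a, b in zip(read, reference[pos:pos + len(read)]):
            if a != b:
                mm += 1
                if mm >= best_mm:  # cannot beat the current best hit
                    break
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos

def evaluate(reads, reference, max_mismatches):
    """Return (mapped, correctly_placed) counts for one stringency setting."""
    mapped = correct = 0
    for true_pos, read in reads:
        hit = map_read(read, reference, max_mismatches)
        if hit is not None:
            mapped += 1
            correct += (hit == true_pos)
    return mapped, correct

def demo():
    rng = random.Random(42)
    reference = "".join(rng.choice("ACGT") for _ in range(1000))
    target = mutate(reference, 0.05, rng)  # substitutions only, so coordinates are shared
    reads = simulate_reads(target, 30, 30, rng)
    for limit in (0, 2, 4, 8):
        mapped, correct = evaluate(reads, reference, limit)
        print(f"max_mismatches={limit}: mapped={mapped}, correct={correct}")

if __name__ == "__main__":
    demo()
```

Because the true origin of every simulated read is known, the gap between mapped and correctly placed reads gives a rough handle on specificity, while the mapped count tracks sensitivity as the mismatch limit is relaxed.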
Targeted capture coupled with high-throughput sequencing can be used to gain information about nuclear sequence variation at hundreds to thousands of loci. Divergent reference capture makes use of molecular data of one species to enrich target loci in other (related) species. This is particularly valuable for nonmodel organisms, for which often no a priori knowledge exists regarding these loci. Here, we have used targeted capture to obtain data for 809 nuclear coding DNA sequences (CDS) in a nonmodel organism, the Eurasian lynx Lynx lynx, using baits designed with the help of the published genome of a related model organism (the domestic cat Felis catus). Using this approach, we were able to survey intraspecific variation at hundreds of nuclear loci in L. lynx across the species’ European range. A large set of biallelic candidate SNPs was then evaluated using a high-throughput SNP genotyping platform (Fluidigm), which we then reduced to a final 96 SNP-panel based on assay performance and reliability; validation was carried out with 100 additional Eurasian lynx samples not included in the SNP discovery phase. The 96 SNP-panel developed from CDS performed very successfully in the identification of individuals and in population genetic structure inference (including the assignment of individuals to their source population). In keeping with recent studies, our results show that genic SNPs can be valuable for genetic monitoring of wildlife species.
Although many large mammal species went extinct at the end of the Pleistocene epoch, their DNA may persist due to past episodes of interspecies admixture. However, direct empirical evidence of the persistence of ancient alleles remains scarce. Here, we present multifold coverage genomic data from four Late Pleistocene cave bears (Ursus spelaeus complex) and show that cave bears hybridized with brown bears (Ursus arctos) during the Pleistocene. We develop an approach to assess both the directionality and relative timing of gene flow. We find that segments of cave bear DNA still persist in the genomes of living brown bears, with cave bears contributing 0.9 to 2.4% of the genomes of all brown bears investigated. Our results show that even though extinction is typically considered as absolute, following admixture, fragments of the gene pool of extinct species can survive for tens of thousands of years in the genomes of extant recipient species.
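A common first test for admixture of this kind is the D statistic (ABBA-BABA test). The sketch below is a generic illustration, not the approach for directionality and timing developed in the study, which the abstract does not specify. The function name and the assignment of taxa (for example, two brown bears as P1 and P2, a cave bear as P3, plus an outgroup) are assumptions.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (abba, baba, D) for four aligned haploid sequences.

    D = (ABBA - BABA) / (ABBA + BABA); the outgroup defines the ancestral
    allele. D > 0 indicates excess allele sharing between P2 and P3,
    D < 0 between P1 and P3.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        site = {a, b, c, o}
        # Keep only biallelic sites made of unambiguous bases.
        if len(site) != 2 or not site <= set("ACGT"):
            continue
        if a == o and b == c:    # P1 ancestral, P2 and P3 derived
            abba += 1
        elif b == o and a == c:  # P2 ancestral, P1 and P3 derived
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return abba, baba, (abba - baba) / (abba + baba)
```

In practice the significance of D is assessed with a block jackknife across the genome, which this sketch omits.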
Historically, the giant panda was widely distributed from northern China to southwestern Asia [1]. As a result of range contraction and fragmentation, extant individuals are currently restricted to fragmented mountain ranges on the eastern margin of the Qinghai-Tibet plateau, where they are distributed among three major population clusters [2]. However, little is known about the genetic consequences of this dramatic range contraction. For example, were regions where giant pandas previously existed occupied by ancestors of present-day populations, or were these regions occupied by genetically distinct populations that are now extinct? If so, is there any contribution of these extinct populations to the genomes of giant pandas living today? To investigate these questions, we sequenced the nuclear genome of an approximately 5,000-year-old giant panda from Jiangdongshan, Tengchong County in Yunnan Province, China. We find that this individual represents a genetically distinct population that diverged prior to the diversification of modern giant panda populations. We find evidence of differential admixture with this ancient population among modern individuals originating from different populations as well as within the same population. We also find evidence for directional gene flow, which transferred alleles from the ancient population into the modern giant panda lineages. A variable proportion of the genomes of extant individuals is therefore likely derived from the ancient population represented by our sequenced individual. Although extant giant panda populations retain reasonable genetic diversity, our results suggest that this represents only part of the genetic diversity this species harbored prior to its recent range contractions.
The prevalence of contaminant microbial DNA in ancient bone samples represents the principal limiting factor for palaeogenomic studies, as it may comprise more than 99% of DNA molecules obtained. Efforts to exclude or reduce this contaminant fraction have been numerous but also variable in their success. Here, we present a simple but highly effective method to increase the relative proportion of endogenous molecules obtained from ancient bones. Using computed tomography (CT) scanning, we identify the densest region of a bone as optimal for sampling. This approach accurately identifies the densest internal regions of petrous bones, which are known to be a source of high-purity ancient DNA. For ancient long bones, CT scans reveal a high-density outermost layer, which has been routinely removed and discarded prior to DNA extraction. For almost all long bones investigated, we find that targeted sampling of this outermost layer provides an increase in endogenous DNA content over that obtained from softer, trabecular bone. This targeted sampling can produce as much as a 50-fold increase in the proportion of endogenous DNA, providing a directly proportional reduction in sequencing costs for shotgun sequencing experiments. The observed increases in endogenous DNA proportion are not associated with any reduction in absolute endogenous molecule recovery. Although sampling the outermost layer can result in higher levels of human contamination, some bones were found to have more contamination associated with the internal bone structures. Our method is highly consistent, reproducible and applicable across a wide range of bone types, ages and species. We predict that this discovery will greatly extend the potential to study ancient populations and species in the genomics era.
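The proportionality between endogenous content and shotgun sequencing effort follows from simple arithmetic: the number of reads that must be sequenced equals the number of endogenous reads wanted divided by the endogenous fraction. The sketch below works through this; the function name and the example fractions (1% rising 50-fold to 50%) are hypothetical values chosen for illustration, not figures from the study.

```python
def reads_required(target_endogenous_reads, endogenous_fraction):
    """Total shotgun reads needed to obtain the target number of endogenous reads."""
    if not 0 < endogenous_fraction <= 1:
        raise ValueError("endogenous_fraction must be in (0, 1]")
    return target_endogenous_reads / endogenous_fraction

if __name__ == "__main__":
    target = 1_000_000             # hypothetical number of endogenous reads wanted
    before = 0.01                  # hypothetical endogenous fraction, trabecular bone
    after = before * 50            # the same bone after a 50-fold enrichment
    effort_before = reads_required(target, before)
    effort_after = reads_required(target, after)
    print(f"reads needed before: {effort_before:.0f}")
    print(f"reads needed after:  {effort_after:.0f}")
    print(f"fold reduction:      {effort_before / effort_after:.1f}")
```

The fold reduction in sequencing effort equals the fold increase in endogenous fraction, regardless of the target read count.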