The prevalence of contaminant microbial DNA in ancient bone samples represents the principal limiting factor for palaeogenomic studies, as it may comprise more than 99% of DNA molecules obtained. Efforts to exclude or reduce this contaminant fraction have been numerous but also variable in their success. Here, we present a simple but highly effective method to increase the relative proportion of endogenous molecules obtained from ancient bones. Using computed tomography (CT) scanning, we identify the densest region of a bone as optimal for sampling. This approach accurately identifies the densest internal regions of petrous bones, which are known to be a source of high-purity ancient DNA. For ancient long bones, CT scans reveal a high-density outermost layer, which has been routinely removed and discarded prior to DNA extraction. For almost all long bones investigated, we find that targeted sampling of this outermost layer provides an increase in endogenous DNA content over that obtained from softer, trabecular bone. This targeted sampling can produce as much as a 50-fold increase in the proportion of endogenous DNA, providing a directly proportional reduction in sequencing costs for shotgun sequencing experiments. The observed increases in endogenous DNA proportion are not associated with any reduction in absolute endogenous molecule recovery. Although sampling the outermost layer can result in higher levels of human contamination, some bones were found to have more contamination associated with the internal bone structures. Our method is highly consistent, reproducible and applicable across a wide range of bone types, ages and species. We predict that this discovery will greatly extend the potential to study ancient populations and species in the genomics era.
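The proportional relationship between endogenous DNA content and shotgun sequencing effort can be illustrated with a small calculation. This is only a sketch: the function name, the target of one million endogenous reads, and the 1% and 50% endogenous fractions are illustrative assumptions, not values reported in the abstract.

```python
def reads_required(target_endogenous_reads: int, endogenous_fraction: float) -> float:
    """Total shotgun reads needed to obtain a target number of endogenous reads.

    Assumes reads are drawn at random from the library, so the expected
    endogenous yield is simply total_reads * endogenous_fraction.
    """
    if not 0 < endogenous_fraction <= 1:
        raise ValueError("endogenous_fraction must be in (0, 1]")
    return target_endogenous_reads / endogenous_fraction


# Hypothetical example: 1% endogenous DNA before targeted sampling,
# 50% after a 50-fold increase in the endogenous proportion.
baseline = reads_required(1_000_000, 0.01)
improved = reads_required(1_000_000, 0.50)
fold_saving = baseline / improved  # sequencing effort falls by the same factor
```

Under these assumptions the sequencing effort, and hence the cost at a fixed price per read, falls by the same factor as the endogenous proportion rises.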
Although many large mammal species went extinct at the end of the Pleistocene epoch, their DNA may persist due to past episodes of interspecies admixture. However, direct empirical evidence of the persistence of ancient alleles remains scarce. Here, we present multifold coverage genomic data from four Late Pleistocene cave bears (Ursus spelaeus complex) and show that cave bears hybridized with brown bears (Ursus arctos) during the Pleistocene. We develop an approach to assess both the directionality and relative timing of gene flow. We find that segments of cave bear DNA still persist in the genomes of living brown bears, with cave bears contributing 0.9 to 2.4% of the genomes of all brown bears investigated. Our results show that even though extinction is typically considered as absolute, following admixture, fragments of the gene pool of extinct species can survive for tens of thousands of years in the genomes of extant recipient species.
Consensify
(2020)
A standard practice in palaeogenome analysis is the conversion of mapped short read data into pseudohaploid sequences, frequently by selecting a single high-quality nucleotide at random from the stack of mapped reads. This controls for biases due to differential sequencing coverage, but it does not control for differential rates and types of sequencing error, which are frequently large and variable in datasets obtained from ancient samples. These errors have the potential to distort phylogenetic and population clustering analyses, and to mislead tests of admixture using D statistics. We introduce Consensify, a method for generating pseudohaploid sequences, which controls for biases resulting from differential sequencing coverage while greatly reducing error rates. The error correction is derived directly from the data itself, without the requirement for additional genomic resources or simplifying assumptions such as contemporaneous sampling. For phylogenetic and population clustering analysis, we find that Consensify is less affected by artefacts than methods based on single read sampling. For D statistics, Consensify is more resistant to false positives and appears to be less affected by biases resulting from different laboratory protocols than other frequently used methods. Although Consensify is developed with palaeogenomic data in mind, it is applicable for any low to medium coverage short read datasets. We predict that Consensify will be a useful tool for future studies of palaeogenomes.
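The difference between single read sampling and a consensus-based call can be sketched as follows. The first function follows the standard practice described in the abstract. The second is a hypothetical illustration of how drawing a fixed number of reads and requiring agreement could reduce errors while staying independent of coverage; the abstract does not specify the Consensify algorithm, so the parameters `n_sample` and `min_agree` and both function names are assumptions.

```python
import random
from collections import Counter


def sample_single_base(bases, rng):
    """Standard practice: pick the base of one randomly chosen read ('N' if no reads)."""
    return rng.choice(bases) if bases else "N"


def sample_consensus_base(bases, rng, n_sample=3, min_agree=2):
    """Hypothetical consensus call (an assumption, not the published method).

    Draws a fixed number of reads regardless of depth, so high-coverage
    samples gain no advantage, and keeps the base only if enough of the
    drawn reads agree. With min_agree > n_sample / 2 the winner is unique.
    """
    if len(bases) < n_sample:
        return "N"  # too few reads to sample from
    drawn = rng.sample(bases, n_sample)
    base, count = Counter(drawn).most_common(1)[0]
    return base if count >= min_agree else "N"


def pseudohaploid(pileup, caller, seed=0):
    """Apply a per-site caller to a pileup given as a list of base lists."""
    rng = random.Random(seed)
    return "".join(caller(site, rng) for site in pileup)
```

For example, `pseudohaploid([["A", "A", "A"], ["C", "T"], []], sample_consensus_base)` calls the first site and masks the other two, whereas single read sampling would emit a base at the second site even though the two reads disagree.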
Plio-Pleistocene phylogeography of the Southeast Asian Blue Panchax killifish, Aplocheilus panchax
(2017)
Ancient DNA studies have revolutionized the study of extinct species and populations, providing insights on phylogeny, phylogeography, admixture and demographic history. However, inferences on behaviour and sociality have been far less frequent. Here, we investigate the complete mitochondrial genomes of extinct Late Pleistocene cave bears and middle Holocene brown bears that each inhabited multiple geographically proximate caves in northern Spain. In cave bears, we find that, although most caves were occupied simultaneously, each cave almost exclusively contains a unique lineage of closely related haplotypes. This remarkable pattern suggests extreme fidelity to their birth site in cave bears, best described as homing behaviour, and that cave bears formed stable maternal social groups at least for hibernation. In contrast, brown bears do not show any strong association of mitochondrial lineage and cave, suggesting that these two closely related species differed in aspects of their behaviour and sociality. This difference is likely to have contributed to cave bear extinction, which occurred at a time in which competition for caves between bears and humans was likely intense and the ability to rapidly colonize new hibernation sites would have been crucial for the survival of a species so dependent on caves for hibernation as cave bears. Our study demonstrates the potential of ancient DNA to uncover patterns of behaviour and sociality in ancient species and populations, even those that went extinct many tens of thousands of years ago.
Domestic cattle were brought to Spain by early settlers and agricultural societies. Due to missing Neolithic sites in the Spanish region of Galicia, very little is known about this process in this region. We sampled 18 cattle subfossils from different ages and different mountain caves in Galicia, of which 11 were subjected to sequencing of the mitochondrial genome and phylogenetic analysis, to provide insight into the introduction of cattle to this region. We detected high similarity between samples from different time periods and were able to compare the time frame of the first domesticated cattle in Galicia to data from the connecting region of Cantabria to show a plausible connection between the Neolithization of these two regions. Our data show a close relationship between the early domesticated cattle of Galicia and modern cow breeds and give a general insight into cattle phylogeny. We conclude that settlers migrated to this region of Spain from Europe and introduced common European breeds to Galicia.
Technological innovations such as next generation sequencing and DNA hybridisation enrichment have resulted in multi-fold increases in both the quantity of ancient DNA sequence data and the time depth for DNA retrieval. To date, over 30 ancient genomes have been sequenced, moving from 0.7x coverage (mammoth) in 2008 to more than 50x coverage (Neanderthal) in 2014. Studies of rapid evolutionary changes, such as the evolution and spread of pathogens and the genetic responses of hosts, or the genetics of domestication and climatic adaptation, are developing swiftly and the importance of palaeogenomics for investigating evolutionary processes during the last million years is likely to increase considerably. However, these new datasets require new methods of data processing and analysis, as well as conceptual changes in interpreting the results. In this review we highlight important areas of future technical and conceptual progress and discuss research topics in the rapidly growing field of palaeogenomics.
Ancient DNA of extinct species from the Pleistocene and Holocene has provided valuable evolutionary insights. However, these are largely restricted to mammals and high latitudes because DNA preservation in warm climates is typically poor. In the tropics and subtropics, non-avian reptiles constitute a significant part of the fauna and little is known about the genetics of the many extinct reptiles from tropical islands. We have reconstructed the near-complete mitochondrial genome of an extinct giant tortoise from the Bahamas (Chelonoidis alburyorum) using an approximately 1000-year-old humerus from a water-filled sinkhole (blue hole) on Great Abaco Island. Phylogenetic and molecular clock analyses place this extinct species as closely related to Galapagos (C. niger complex) and Chaco tortoises (C. chilensis), and provide evidence for repeated overseas dispersal in this tortoise group. The ancestors of extant Chelonoidis species arrived in South America from Africa only after the opening of the Atlantic Ocean and dispersed from there to the Caribbean and the Galapagos Islands. Our results also suggest that the anoxic, thermally buffered environment of blue holes may enhance DNA preservation, and thus are opening a window for better understanding evolution and population history of extinct tropical species, which would likely still exist without human impact.
Genetic analyses of Australasian organisms have resulted in the identification of extensive cryptic diversity across the continent. The venomous elapid snakes are among the best-studied organismal groups in this region, but many knowledge gaps persist: for instance, despite their iconic status, the species-level diversity among Australo-Papuan blacksnakes (Pseudechis) has remained poorly understood due to the existence of a group of cryptic species within the P. australis species complex, collectively termed "pygmy mulga snakes". Using two mitochondrial and three nuclear loci we assess species boundaries within the genus using Bayesian species delimitation methods and reconstruct their phylogenetic history using multispecies coalescent approaches. Our analyses support the recognition of 10 species, including all of the currently described pygmy mulga snakes and one undescribed species from the Northern Territory of Australia. Phylogenetic relationships within the genus are broadly consistent with previous work, with the recognition of three major groups, the viviparous red-bellied black snake P. porphyriacus forming the sister species to two clades consisting of ovoviviparous species.
Utilising a reconstructed ancestral mitochondrial genome of a clade to design hybridisation capture baits can provide the opportunity for recovering mitochondrial sequences from all its descendent and even sister lineages. This approach is useful for taxa with no extant close relatives, as is often the case for rare or extinct species, and is a viable approach for the analysis of historical museum specimens. Asiatic linsangs (genus Prionodon) exemplify this situation, being rare Southeast Asian carnivores for which little molecular data is available. Using ancestral capture we recover partial mitochondrial genome sequences for seven banded linsangs (P. linsang) from historical specimens, representing the first intraspecific genetic dataset for this species. We additionally assemble a high quality mitogenome for the banded linsang using shotgun sequencing for time-calibrated phylogenetic analysis. This reveals a deep divergence between the two Asiatic linsang species (P. linsang, P. pardicolor), with an estimated divergence of ~12 million years (Ma). Although our sample size precludes any robust interpretation of the population structure of the banded linsang, we recover two distinct matrilines with an estimated tMRCA of ~1 Ma. Our results can be used as a basis for further investigation of the Asiatic linsangs, and further demonstrate the utility of ancestral capture for studying divergent taxa without close relatives.
Saber-toothed cats (Machairodontinae) are among the most widely recognized representatives of the now largely extinct Pleistocene megafauna. However, many aspects of their ecology, evolution, and extinction remain uncertain. Although ancient-DNA studies have led to huge advances in our knowledge of these aspects of many other megafauna species (e.g., mammoths and cave bears), relatively few ancient-DNA studies have focused on saber-toothed cats [1-3], and they have been restricted to short fragments of mitochondrial DNA. Here we investigate the evolutionary history of two lineages of saber-toothed cats (Smilodon and Homotherium) in relation to living carnivores and find that the Machairodontinae form a well-supported clade that is distinct from all living felids. We present partial mitochondrial genomes from one S. populator sample and three Homotherium sp. samples, including the only Late Pleistocene Homotherium sample from Eurasia [4]. We confirm the identification of the unique Late Pleistocene European fossil through ancient-DNA analyses, thus strengthening the evidence that Homotherium occurred in Europe over 200,000 years later than previously believed. This in turn forces a re-evaluation of its demography and extinction dynamics. Within the Machairodontinae, we find a deep divergence between Smilodon and Homotherium (~18 million years) but limited diversity between the American and European Homotherium specimens. The genetic data support the hypothesis that all Late Pleistocene (or post-Villafrancian) Homotherium should be considered a single species, H. latidens, which was previously proposed based on morphological data [5, 6].
Historically, the giant panda was widely distributed from northern China to southwestern Asia [1]. As a result of range contraction and fragmentation, extant individuals are currently restricted to fragmented mountain ranges on the eastern margin of the Qinghai-Tibet plateau, where they are distributed among three major population clusters [2]. However, little is known about the genetic consequences of this dramatic range contraction. For example, were regions where giant pandas previously existed occupied by ancestors of present-day populations, or were these regions occupied by genetically distinct populations that are now extinct? If so, is there any contribution of these extinct populations to the genomes of giant pandas living today? To investigate these questions, we sequenced the nuclear genome of a ~5,000-year-old giant panda from Jiangdongshan, Tengchong County in Yunnan Province, China. We find that this individual represents a genetically distinct population that diverged prior to the diversification of modern giant panda populations. We find evidence of differential admixture with this ancient population among modern individuals originating from different populations as well as within the same population. We also find evidence for directional gene flow, which transferred alleles from the ancient population into the modern giant panda lineages. A variable proportion of the genomes of extant individuals is therefore likely derived from the ancient population represented by our sequenced individual. Although extant giant panda populations retain reasonable genetic diversity, our results suggest that this represents only part of the genetic diversity this species harbored prior to its recent range contractions.
High-throughput sequence data retrieved from ancient or other degraded samples has led to unprecedented insights into the evolutionary history of many species, but the analysis of such sequences also poses specific computational challenges. The most commonly used approach involves mapping sequence reads to a reference genome. However, this process becomes increasingly challenging with an elevated genetic distance between target and reference or with the presence of contaminant sequences with high sequence similarity to the target species. The evaluation and testing of mapping efficiency and stringency are thus paramount for the reliable identification and analysis of ancient sequences. In this paper, we present ‘TAPAS’ (Testing of Alignment Parameters for Ancient Samples), a computational tool that enables the systematic testing of mapping tools for ancient data by simulating sequence data reflecting the properties of an ancient dataset and performing test runs using the mapping software and parameter settings of interest. We showcase TAPAS by using it to assess and improve the mapping strategy for a degraded sample from a banded linsang (Prionodon linsang), for which no closely related reference is currently available. This enables a 1.8-fold increase in the number of mapped reads without sacrificing mapping specificity. The increase in mapped reads effectively reduces the need for additional sequencing, thus making more economical use of time, resources, and sample material.
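The general idea of testing a mapping strategy on simulated data with known origins can be sketched as below. This is not the TAPAS implementation: the function names, the fragment length range, the divergence and deamination rates, and the simplified damage model (C to T at the 5' end, G to A at the 3' end) are all illustrative assumptions. The actual mapping step is left out; `score_mapping` only shows how a mapper's output could be compared against the simulated truth.

```python
import random


def simulate_ancient_reads(reference, n_reads, rng, min_len=30, max_len=80,
                           divergence=0.02, deam_rate=0.3):
    """Return (true_start, sequence) pairs of short, diverged, damaged fragments."""
    if len(reference) < min_len:
        raise ValueError("reference shorter than min_len")
    reads = []
    for _ in range(n_reads):
        length = rng.randint(min_len, min(max_len, len(reference)))
        start = rng.randrange(len(reference) - length + 1)
        seq = list(reference[start:start + length])
        # Substitutions mimicking genetic distance between target and reference.
        for i, base in enumerate(seq):
            if rng.random() < divergence:
                seq[i] = rng.choice([b for b in "ACGT" if b != base])
        # Simplified terminal deamination damage typical of ancient DNA.
        if seq[0] == "C" and rng.random() < deam_rate:
            seq[0] = "T"
        if seq[-1] == "G" and rng.random() < deam_rate:
            seq[-1] = "A"
        reads.append((start, "".join(seq)))
    return reads


def score_mapping(true_starts, mapped_starts):
    """Compare mapper output to truth.

    true_starts: list of true start positions, indexed by read number.
    mapped_starts: dict of read number -> reported start (unmapped reads absent).
    Returns (sensitivity, precision).
    """
    correct = sum(1 for i, pos in mapped_starts.items() if true_starts[i] == pos)
    sensitivity = correct / len(true_starts) if true_starts else 0.0
    precision = correct / len(mapped_starts) if mapped_starts else 0.0
    return sensitivity, precision
```

In use, the simulated reads would be written out, mapped with each parameter setting of interest, and the settings compared by the resulting sensitivity and precision.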