
Consensify

  • A standard practice in palaeogenome analysis is the conversion of mapped short read data into pseudohaploid sequences, frequently by selecting a single high-quality nucleotide at random from the stack of mapped reads. This controls for biases due to differential sequencing coverage, but it does not control for differential rates and types of sequencing error, which are frequently large and variable in datasets obtained from ancient samples. These errors have the potential to distort phylogenetic and population clustering analyses, and to mislead tests of admixture using D statistics. We introduce Consensify, a method for generating pseudohaploid sequences that controls for biases resulting from differential sequencing coverage while greatly reducing error rates. The error correction is derived directly from the data itself, without the requirement for additional genomic resources or simplifying assumptions such as contemporaneous sampling. For phylogenetic and population clustering analysis, we find that Consensify is less affected by artefacts than methods based on single read sampling. For D statistics, Consensify is more resistant to false positives and appears to be less affected by biases resulting from different laboratory protocols than other frequently used methods. Although Consensify was developed with palaeogenomic data in mind, it is applicable to any low to medium coverage short read dataset. We predict that Consensify will be a useful tool for future studies of palaeogenomes.
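The contrast the abstract draws between single read sampling and a consensus-based call can be sketched in a few lines of Python. The abstract does not spell out the rule Consensify applies, so the majority rule below (draw three bases, keep a base only if at least two agree) is an illustrative assumption, and the names `sample_single_read`, `sample_majority` and `pseudohaploid` are hypothetical. Each read stack is modelled as a plain list of bases that have already passed quality filtering.

```python
import random
from collections import Counter


def sample_single_read(stack, rng):
    """Single read sampling: pick one base at random; 'N' if no reads cover the site."""
    return rng.choice(stack) if stack else "N"


def sample_majority(stack, rng, n_sample=3, min_agree=2):
    """Assumed majority rule (not taken from the abstract): draw n_sample bases
    without replacement and keep the most common one only if at least
    min_agree of the draws support it; otherwise return 'N'."""
    if len(stack) < n_sample:
        return "N"
    drawn = rng.sample(stack, n_sample)
    base, count = Counter(drawn).most_common(1)[0]
    return base if count >= min_agree else "N"


def pseudohaploid(stacks, caller, seed=0):
    """Build a pseudohaploid sequence by applying `caller` to every read stack."""
    rng = random.Random(seed)
    return "".join(caller(stack, rng) for stack in stacks)


# Four sites: clean, low coverage, uncovered, and one with a single erroneous read.
stacks = [list("AAAA"), list("CC"), [], list("GGGT")]
single = pseudohaploid(stacks, sample_single_read)
majority = pseudohaploid(stacks, sample_majority)
```

Because both callers use a fixed number of draws per site regardless of depth, neither favours high-coverage samples. Under the assumed majority rule, a lone erroneous read such as the `T` in the last stack can never be called, whereas single read sampling would emit it a quarter of the time. The price is that sites with fewer than three reads become `N`.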

Download full-text files

  • pmnr1033.pdf (English)
    (2199 KB)

    SHA-512:c7d264f09bdc6f6ff590fbbd4593df3303c4c01cd6ac95ab759ecb697f6304fcc4bd659e69792599fff4a73242ee13c4476444cf85f5a62df01d032c9ea95d22
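
A downloaded copy of the PDF can be checked against the SHA-512 value listed above. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the local file path are assumptions made for illustration.

```python
import hashlib

# Checksum as listed on this record for pmnr1033.pdf.
EXPECTED_SHA512 = (
    "c7d264f09bdc6f6ff590fbbd4593df3303c4c01cd6ac95ab759ecb697f6304fc"
    "c4bd659e69792599fff4a73242ee13c4476444cf85f5a62df01d032c9ea95d22"
)


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_SHA512):
    """True if the file at `path` (a hypothetical local download) has the listed digest."""
    return sha512_of_file(path) == expected.lower()
```

For example, `matches_listed_checksum("pmnr1033.pdf")` would return `True` only for a byte-identical copy of the deposited file.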

Metadata
Authors: Axel Barlow, Stefanie Hartmann, Javier Gonzalez, Michael Hofreiter, Johanna L. A. Paijmans
URN: urn:nbn:de:kobv:517-opus4-472521
DOI: https://doi.org/10.25932/publishup-47252
ISSN: 1866-8372
Title of parent work (German): Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
Subtitle (English): a method for generating pseudohaploid genome sequences from palaeogenomic datasets with reduced error rates
Series (volume number): Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe (1033)
Publication type: Postprint
Language: English
Date of first publication: 14.12.2020
Year of publication: 2020
Publishing institution: Universität Potsdam
Release date: 14.12.2020
Free keyword / tag: D statistics; ancient DNA; bioinformatics; error reduction; palaeogenomics; sequencing error
Issue: 1033
Number of pages: 24
Source: Genes 11 (2020) 1, 50 DOI:10.3390/genes11010050
Organizational units: Mathematisch-Naturwissenschaftliche Fakultät / Institut für Biochemie und Biologie
DDC classification: 5 Natural sciences and mathematics / 57 Life sciences; biology / 570 Life sciences; biology
Peer review: Refereed
Publication route: Open Access / Green Open Access
License: CC-BY - Attribution 4.0 International