• Treffer 2 von 40
Zurück zur Trefferliste

A hierarchical model for incomplete alignments in phylogenetic inference

  • Motivation: Full-length DNA and protein sequences that span the entire length of a gene are ideally used for multiple sequence alignments (MSAs) and the subsequent inference of their relationships. Frequently, however, MSAs contain a substantial amount of missing data. For example, expressed sequence tags (ESTs), which are partial sequences of expressed genes, are the predominant source of sequence data for many organisms. The patterns of missing data typical for EST-derived alignments greatly compromise the accuracy of estimated phylogenies. Results: We present a statistical method for inferring phylogenetic trees from EST-based incomplete MSA data. We propose a class of hierarchical models for modeling pairwise distances between the sequences, and develop a fully Bayesian approach for estimation of the model parameters. Once the distance matrix is estimated, the phylogenetic tree may be constructed by applying neighbor-joining (or any other algorithm of choice). We also show that maximizing the marginal likelihood from the BayesianMotivation: Full-length DNA and protein sequences that span the entire length of a gene are ideally used for multiple sequence alignments (MSAs) and the subsequent inference of their relationships. Frequently, however, MSAs contain a substantial amount of missing data. For example, expressed sequence tags (ESTs), which are partial sequences of expressed genes, are the predominant source of sequence data for many organisms. The patterns of missing data typical for EST-derived alignments greatly compromise the accuracy of estimated phylogenies. Results: We present a statistical method for inferring phylogenetic trees from EST-based incomplete MSA data. We propose a class of hierarchical models for modeling pairwise distances between the sequences, and develop a fully Bayesian approach for estimation of the model parameters. Once the distance matrix is estimated, the phylogenetic tree may be constructed by applying neighbor-joining (or any other algorithm of choice). We also show that maximizing the marginal likelihood from the Bayesian approach yields similar results to a pro. le likelihood estimation. The proposed methods are illustrated using simulated protein families, for which the true phylogeny is known, and one real protein family.zeige mehrzeige weniger

Metadaten exportieren

Weitere Dienste

Suche bei Google Scholar Statistik - Anzahl der Zugriffe auf das Dokument
Metadaten
Verfasserangaben:Fuxia Cheng, Stefanie HartmannORCiDGND, Mayetri Gupta, Joseph G. Ibrahim, Todd J. Vision
URL:http://bioinformatics.oxfordjournals.org/
DOI:https://doi.org/10.1093/bioinformatics/btp015
ISSN:1367-4803
Publikationstyp:Wissenschaftlicher Artikel
Sprache:Englisch
Jahr der Erstveröffentlichung:2009
Erscheinungsjahr:2009
Datum der Freischaltung:25.03.2017
Quelle:Bioinformatics. - ISSN 1367-4803. - 25 (2009), 5, S. 592 - 598
Organisationseinheiten:Mathematisch-Naturwissenschaftliche Fakultät / Institut für Biochemie und Biologie
Peer Review:Referiert
Verstanden ✔
Diese Webseite verwendet technisch erforderliche Session-Cookies. Durch die weitere Nutzung der Webseite stimmen Sie diesem zu. Unsere Datenschutzerklärung finden Sie hier.