Annotated genome sequences of the carnivorous plant Roridula gorgonias and a non-carnivorous relative, Clethra arborea (2020)

Hartmann, Stefanie ; Preick, Michaela ; Abelt, Silke ; Scheffel, André ; Hofreiter, Michael

Objective: Plant carnivory is distributed across the tree of life and has evolved at least six times independently, but sequenced and annotated nuclear genomes of carnivorous plants are currently lacking. We have sequenced and structurally annotated the nuclear genome of the carnivorous Roridula gorgonias and that of a non-carnivorous relative, Madeira’s lily-of-the-valley-tree, Clethra arborea, both within the Ericales. These data add an important resource to study the evolutionary genetics of plant carnivory across angiosperm lineages and also for functional and systematic aspects of plants within the Ericales. Results: Our assemblies have total lengths of 284 Mbp (R. gorgonias) and 511 Mbp (C. arborea) and show high BUSCO scores of 84.2% and 89.5%, respectively. We used their predicted genes together with publicly available data from other Ericales’ genomes and transcriptomes to assemble a phylogenomic data set for the inference of a species tree. However, groups of orthologs showed a marked absence of species represented by a transcriptome. We discuss possible reasons and caution against combining predicted genes from genome- and transcriptome-based assemblies.

A hierarchical model for incomplete alignments in phylogenetic inference (2009)

Cheng, Fuxia ; Hartmann, Stefanie ; Gupta, Mayetri ; Ibrahim, Joseph G. ; Vision, Todd J.

Motivation: Full-length DNA and protein sequences that span the entire length of a gene are ideally used for multiple sequence alignments (MSAs) and the subsequent inference of their relationships. Frequently, however, MSAs contain a substantial amount of missing data. For example, expressed sequence tags (ESTs), which are partial sequences of expressed genes, are the predominant source of sequence data for many organisms. The patterns of missing data typical for EST-derived alignments greatly compromise the accuracy of estimated phylogenies. Results: We present a statistical method for inferring phylogenetic trees from EST-based incomplete MSA data. We propose a class of hierarchical models for modeling pairwise distances between the sequences, and develop a fully Bayesian approach for estimation of the model parameters. Once the distance matrix is estimated, the phylogenetic tree may be constructed by applying neighbor-joining (or any other algorithm of choice). We also show that maximizing the marginal likelihood from the Bayesian approach yields similar results to a profile likelihood estimation. The proposed methods are illustrated using simulated protein families, for which the true phylogeny is known, and one real protein family.
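
To illustrate why missing data complicate distance-based inference, the following minimal sketch computes a naive pairwise distance (the proportion of differing sites) using only alignment columns where both sequences are present. It is not the hierarchical Bayesian model described in the abstract; the function names and the choice of "-" and "?" as missing-data symbols are assumptions made for this example.

```python
from typing import List, Optional

# Assumed symbols for gaps / missing data in the alignment.
MISSING = {"-", "?"}


def p_distance(a: str, b: str) -> Optional[float]:
    """Proportion of differing sites over columns present in both sequences.

    Returns None when the two aligned sequences share no columns,
    which is common for short, non-overlapping fragments such as ESTs.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    shared = [(x, y) for x, y in zip(a, b)
              if x not in MISSING and y not in MISSING]
    if not shared:
        return None
    return sum(x != y for x, y in shared) / len(shared)


def distance_matrix(seqs: List[str]) -> List[List[Optional[float]]]:
    """Symmetric matrix of naive pairwise distances (None = undefined)."""
    n = len(seqs)
    matrix: List[List[Optional[float]]] = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = p_distance(seqs[i], seqs[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix
```

Entries that come back as None mark sequence pairs with no overlapping columns. A plain distance method has nothing to work with for those pairs, which is the kind of gap a model-based estimate of the distance matrix is meant to fill before a tree-building step such as neighbor-joining.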

Functional insights from the GC-poor genomes of two aphid parasitoids, Aphidius ervi and Lysiphlebus fabarum (2020)

Background: Parasitoid wasps have fascinating life cycles and play an important role in trophic networks, yet little is known about their genome content and function. Parasitoids that infect aphids are an important group with the potential for biological control. Their success depends on adapting to develop inside aphids and overcoming both host aphid defenses and their protective endosymbionts. Results: We present the de novo genome assemblies, detailed annotation, and comparative analysis of two closely related parasitoid wasps that target pest aphids: Aphidius ervi and Lysiphlebus fabarum (Hymenoptera: Braconidae: Aphidiinae). The genomes are small (139 and 141 Mbp) and the most AT-rich reported thus far for any arthropod (GC content: 25.8 and 23.8%). This nucleotide bias is accompanied by skewed codon usage and is stronger in genes with adult-biased expression. AT-richness may be the consequence of reduced genome size, a near absence of DNA methylation, and energy efficiency. We identify missing desaturase genes, whose absence may underlie mimicry in the cuticular hydrocarbon profile of L. fabarum. We highlight key gene groups including those underlying venom composition, chemosensory perception, and sex determination, as well as potential losses in immune pathway genes. Conclusions: These findings are of fundamental interest for insect evolution and biological control applications. They provide a strong foundation for further functional studies into coevolution between parasitoids and their hosts. Both genomes are available at https://bipaa.genouest.org.

On the phylogenetic position of Myzostomida : can 77 genes get it wrong? (2009)

Bleidorn, Christoph ; Podsiadlowski, Lars ; Zhong, Min ; Eeckhaut, Igor ; Hartmann, Stefanie ; Halanych, Kenneth M. ; Tiedemann, Ralph

Background: Phylogenomic analyses recently became popular to address questions about deep metazoan phylogeny. Ribosomal proteins (RP) dominate many of these analyses or are, in some cases, the only genes included. Despite initial hopes, phylogenomic analyses including tens to hundreds of genes still fail to robustly place many bilaterian taxa. Results: Using the phylogenetic position of myzostomids as an example, we show that phylogenies derived from RP genes and mitochondrial genes produce incongruent results. Whereas the former support a position within a clade of platyzoan taxa, mitochondrial data recovers an annelid affinity, which is strongly supported by the gene order data and is congruent with morphology. Using hypothesis testing, our RP data significantly rejects the annelid affinity, whereas a platyzoan relationship is significantly rejected by the mitochondrial data. Conclusion: We conclude (i) that reliance on a set of markers belonging to a single class of macromolecular complexes might bias the analysis, and (ii) that concatenation of all available data might introduce conflicting signal into phylogenetic analyses. We therefore strongly recommend testing for data incongruence in phylogenomic analyses. Furthermore, judging all available data, we consider the annelid affinity hypothesis more plausible than a possible platyzoan affinity for myzostomids, and suspect long branch attraction is influencing the RP data. However, this hypothesis needs further confirmation by future analyses.

Novel Genes, Ancient Genes, and Gene Co-Option Contributed to the Genetic Basis of the Radula, a Molluscan Innovation (2018)

Hilgers, Leon ; Hartmann, Stefanie ; Hofreiter, Michael ; von Rintelen, Thomas

The radula is the central foraging organ and apomorphy of the Mollusca. However, in contrast to other innovations, including the mollusk shell, genetic underpinnings of radula formation remain virtually unknown. Here, we present the first radula formative tissue transcriptome using the viviparous freshwater snail Tylomelania sarasinorum and compare it to foot tissue and the shell-building mantle of the same species. We combine differential expression, functional enrichment, and phylostratigraphic analyses to identify both specific and shared genetic underpinnings of the three tissues as well as their dominant functions and evolutionary origins. Gene expression of radula formative tissue is very distinct, but nevertheless more similar to mantle than to foot. Generally, the genetic bases of both radula and shell formation were shaped by novel orchestration of preexisting genes and continuous evolution of novel genes. A significantly increased proportion of radula-specific genes originated since the origin of stem-mollusks, indicating that novel genes were especially important for radula evolution. Genes with radula-specific expression in our study are frequently also expressed during the formation of other lophotrochozoan hard structures, like chaetae (hes1, arx), spicules (gbx), and shells of mollusks (gbx, heph) and brachiopods (heph), suggesting gene co-option for hard structure formation. Finally, a Lophotrochozoa-specific chitin synthase with a myosin motor domain (CS-MD), which is expressed during mollusk and brachiopod shell formation, had radula-specific expression in our study. CS-MD potentially facilitated the construction of complex chitinous structures and points at the potential of molecular novelties to promote the evolution of different morphological innovations.

A genomic comparison of putative pathogenicity-related gene families in five members of the Ophiostomatales with different lifestyles (2017)

Lah, Ljerka ; Löber, Ulrike ; Hsiang, Tom ; Hartmann, Stefanie

Ophiostomatoid fungi are vectored by their bark-beetle associates and colonize different host tree species. To survive and proliferate in the host, they have evolved mechanisms for detoxification and elimination of host defence compounds, efficient nutrient sequestration, and, in pathogenic species, virulence towards plants. Here, we assembled a draft genome of the spruce pathogen Ophiostoma bicolor. For our comparative and phylogenetic analyses, we mined the genomes of closely related species (Ophiostoma piceae, Ophiostoma ulmi, Ophiostoma novo-ulmi, and Grosmannia clavigera). Our aim was to acquire a genomic and evolutionary perspective of gene families important in host colonization. Genome comparisons showed that both the nuclear and mitochondrial genomes in our assembly were largely complete. Our O. bicolor 25.3 Mbp draft genome had 10 018 predicted genes, 6041 proteins with gene ontology (GO) annotation, 269 carbohydrate-active enzymes (CAZymes), 559 peptidases and inhibitors, and 1373 genes likely involved in pathogen-host interactions. Phylogenetic analyses of selected protein families revealed core sets of cytochrome P450 genes, ABC transporters and backbone genes involved in secondary metabolite (SM) biosynthesis (polyketide synthases (PKS) and non-ribosomal synthases), and species-specific gene losses and duplications. Phylogenetic analyses of protein families of interest provided insight into evolutionary adaptations to host biochemistry in ophiostomatoid fungi.

Reconstructing protein-coding sequences from ancient DNA (2020)

Hofreiter, Michael ; Hartmann, Stefanie

Obtaining information about functional details of proteins of extinct species is of critical importance for a better understanding of the real-life appearance, behavior and ecology of these lost entries in the book of life. In this chapter, we discuss the possibilities to retrieve the necessary DNA sequence information from paleogenomic data obtained from fossil specimens, which can then be used to express and subsequently analyze the protein of interest. We discuss the problems specific to ancient DNA, including mis-coding lesions, short read length and incomplete paleogenome assemblies. Finally, we discuss an alternative, but currently rarely used approach, direct PCR amplification, which is especially useful for comparatively short proteins.

Ancient DNA reveals twenty million years of aquatic life in beavers (2020)

Xenikoudakis, Georgios ; Ahmed, Mayeesha ; Harris, Jacob Colt ; Wadleigh, Rachel ; Paijmans, Johanna L. A. ; Hartmann, Stefanie ; Barlow, Axel ; Lerner, Heather ; Hofreiter, Michael

Xenikoudakis et al. report a partial mitochondrial genome of the extinct giant beaver Castoroides and date the origin of aquatic behavior in beavers to approximately 20 million years ago. This time estimate coincides with the extinction of terrestrial beavers and raises the question of whether the two events had a common cause.

A mitogenomic timetree for Darwin's enigmatic South American mammal Macrauchenia patachonica (2017)

The unusual mix of morphological traits displayed by extinct South American native ungulates (SANUs) confounded both Charles Darwin, who first discovered them, and Richard Owen, who tried to resolve their relationships. Here we report an almost complete mitochondrial genome for the litoptern Macrauchenia. Our dated phylogenetic tree places Macrauchenia as sister to Perissodactyla, but close to the radiation of major lineages within Laurasiatheria. This position is consistent with a divergence estimate of ~66 Ma (95% credibility interval, 56.64-77.83 Ma) obtained for the split between Macrauchenia and other Panperissodactyla. Combined with their morphological distinctiveness, this evidence supports the positioning of Litopterna (possibly in company with other SANU groups) as a separate order within Laurasiatheria. We also show that, when using strict criteria, extinct taxa marked by deep divergence times and a lack of close living relatives may still be amenable to palaeogenomic analysis through iterative mapping against more distant relatives.

Consensify (2020)

Barlow, Axel ; Hartmann, Stefanie ; Gonzalez, Javier ; Hofreiter, Michael ; Paijmans, Johanna L. A.

A standard practice in palaeogenome analysis is the conversion of mapped short read data into pseudohaploid sequences, frequently by selecting a single high-quality nucleotide at random from the stack of mapped reads. This controls for biases due to differential sequencing coverage, but it does not control for differential rates and types of sequencing error, which are frequently large and variable in datasets obtained from ancient samples. These errors have the potential to distort phylogenetic and population clustering analyses, and to mislead tests of admixture using D statistics. We introduce Consensify, a method for generating pseudohaploid sequences, which controls for biases resulting from differential sequencing coverage while greatly reducing error rates. The error correction is derived directly from the data itself, without the requirement for additional genomic resources or simplifying assumptions such as contemporaneous sampling. For phylogenetic and population clustering analysis, we find that Consensify is less affected by artefacts than methods based on single read sampling. For D statistics, Consensify is more resistant to false positives and appears to be less affected by biases resulting from different laboratory protocols than other frequently used methods. Although Consensify is developed with palaeogenomic data in mind, it is applicable for any low to medium coverage short read datasets. We predict that Consensify will be a useful tool for future studies of palaeogenomes.
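
The single read sampling described above can be sketched in a few lines. The example below is illustrative only: it represents each genome position as a string of the bases observed in the mapped reads (a simplified, hypothetical pileup), and contrasts random single-base sampling with a small majority rule over a fixed number of sampled bases. The majority rule and its parameters (`n_sample=3`, `min_agree=2`) are assumptions chosen for this sketch, since the abstract does not specify the rule used by Consensify.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def sample_single_base(bases: str, rng: random.Random) -> str:
    """Pick one base at random from the read stack; 'N' if no reads."""
    return rng.choice(list(bases)) if bases else "N"


def sample_majority_base(bases: str, rng: random.Random,
                         n_sample: int = 3, min_agree: int = 2) -> str:
    """Sample up to n_sample bases and keep the majority base.

    Sampling a fixed maximum number of bases keeps the call independent
    of total coverage. Returns 'N' when fewer than min_agree sampled
    bases agree (assumed rule, for illustration only).
    """
    if len(bases) < min_agree:
        return "N"
    picked = rng.sample(list(bases), min(n_sample, len(bases)))
    base, count = Counter(picked).most_common(1)[0]
    return base if count >= min_agree else "N"


def pseudohaploid(pileup: Sequence[str],
                  caller: Callable[[str, random.Random], str],
                  rng: random.Random) -> str:
    """Build a pseudohaploid sequence by calling one base per position."""
    return "".join(caller(column, rng) for column in pileup)
```

With single-base sampling, one erroneous read at a position has a direct chance of ending up in the sequence. Under the assumed majority rule, an isolated error is either outvoted or the position is left uncalled.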
