publish.UP Search

Exploiting gene families for phylogenomic analysis of myzostomid transcriptome data (2012)

Hartmann, Stefanie ; Helm, Conrad ; Nickel, Birgit ; Meyer, Matthias ; Struck, Torsten H. ; Tiedemann, Ralph ; Selbig, Joachim ; Bleidorn, Christoph

Background: In trying to understand the evolutionary relationships of organisms, the current flood of sequence data offers great opportunities, but also reveals new challenges with regard to data quality, the selection of data for subsequent analysis, and the automation of steps that were once done manually for single-gene analyses. Even though genome or transcriptome data are available for representatives of most bilaterian phyla, some enigmatic taxa still have an uncertain position in the animal tree of life. This is especially true for myzostomids, a group of symbiotic (or parasitic) protostomes that are placed either with annelids or with flatworms. Methodology: Based on similarity criteria, Illumina-based transcriptome sequences of one myzostomid were compared to protein sequences of one additional myzostomid and 29 reference metazoa and clustered into gene families. These families were then used to investigate the phylogenetic position of Myzostomida using different approaches: Alignments of 989 sequence families were concatenated, and the resulting superalignment was analyzed under a Maximum Likelihood criterion. We also used all 1,878 gene trees with at least one myzostomid sequence for a supertree approach: the individual gene trees were computed and then reconciled into a species tree using gene tree parsimony. Conclusions: Superalignments require strictly orthologous genes, and both the gene selection and the widely varying amount of data available for different taxa in our dataset may cause anomalous placements and low bootstrap support. In contrast, gene tree parsimony is designed to accommodate multilocus gene families and therefore allows a much more comprehensive data set to be analyzed. Results of this supertree approach showed a well-resolved phylogeny, in which myzostomids were part of the annelid radiation, and major bilaterian taxa were found to be monophyletic.

Identification of sperm proteins as candidate biomarkers for the analysis of reproductive isolation in Mytilus: a case study for the enkurin locus (2012)

Bartel, Manuela ; Hartmann, Stefanie ; Lehmann, Karola ; Postel, Kai ; Quesada, Humberto ; Philipp, Eva E. R. ; Heilmann, Katja ; Micheel, Burkhard ; Stuckas, Heiko

Sperm proteins of the marine sessile mussels of the Mytilus edulis species complex are models to investigate reproductive isolation and speciation. This study aimed at identifying sperm proteins and their corresponding genes. This was aided by the use of monoclonal antibodies that preferentially bind to as yet unknown sperm molecules. By identifying their target molecules, this approach identified proteins with relevance to Mytilus sperm function. This procedure identified 16 proteins, for example, enkurin, laminin, porin and heat shock proteins. The potential use of these proteins as genetic markers to study reproductive isolation is exemplified by analysing the enkurin locus. Enkurin evolution is driven by purifying selection; the locus displays high levels of intraspecific variation, and species-specific alleles group in distinct phylogenetic clusters. These findings characterize enkurin as an informative candidate biomarker for analyses of clinal variation and differential introgression in hybrid zones, for example, to understand determinants of reproductive isolation in Baltic Mytilus populations.

Analysis of phylogenetic signal in protostomial intron patterns using Mutual Information (2013)

Hill, Natascha ; Leow, Alexander ; Bleidorn, Christoph ; Groth, Detlef ; Tiedemann, Ralph ; Selbig, Joachim ; Hartmann, Stefanie

Many deep evolutionary divergences still remain unresolved, such as those among major taxa of the Lophotrochozoa. As alternative phylogenetic markers, the intron-exon structure of eukaryotic genomes and the patterns of absence and presence of spliceosomal introns appear to be promising. However, given the potential homoplasy of intron presence, the phylogenetic analysis of this data using standard evolutionary approaches has remained a challenge. Here, we used Mutual Information (MI) to estimate the phylogeny of Protostomia using gene structure data, and we compared these results with those obtained with Dollo Parsimony. Using full genome sequences from nine Metazoa, we identified 447 groups of orthologous sequences with 21,732 introns in 4,870 unique intron positions. We determined the shared absence and presence of introns in the corresponding sequence alignments and have made this data available in "IntronBase", a web-accessible and downloadable SQLite database. Our results obtained using Dollo Parsimony are obviously misled through systematic errors that arise from multiple intron loss events, but extensive filtering of data improved the quality of the estimated phylogenies. Mutual Information, in contrast, performs better with larger datasets, but at the same time it requires a complete data set, which is difficult to obtain for orthologs from a large number of taxa. Nevertheless, Mutual Information-based distances proved to be useful in analyzing this kind of data, also because the estimation of MI-based distances is independent of evolutionary models and therefore no pre-definitions of ancestral and derived character states are necessary.
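As a rough illustration of the idea in this abstract, the sketch below computes Mutual Information between two intron presence/absence profiles and turns it into a distance. The normalization used here (one minus MI divided by the joint entropy) is an assumption chosen for illustration; the abstract does not state which MI-based distance the authors used, and the function names and toy profiles are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete states."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """Mutual information (bits) between two equal-length state profiles."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def mi_distance(x, y):
    """Assumed normalization: d = 1 - I(X;Y) / H(X,Y), ranging over [0, 1]."""
    joint = entropy(list(zip(x, y)))
    if joint == 0.0:
        # Both profiles are constant, so they carry no distinguishing signal.
        return 0.0
    return 1.0 - mutual_information(x, y) / joint


# Hypothetical profiles: 1 = intron present, 0 = intron absent,
# one entry per aligned intron position.
taxon_a = [1, 0, 1, 0]
taxon_b = [1, 0, 1, 0]
taxon_c = [0, 0, 1, 1]
taxon_d = [0, 1, 0, 1]

print(mi_distance(taxon_a, taxon_b))
print(mi_distance(taxon_c, taxon_d))
```

Because the calculation only counts co-occurring states, it needs no assumption about which state is ancestral, which matches the model independence the abstract points out. It also requires a state for every position in every taxon, which reflects the stated need for a complete data set.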

The Arabidopsis Kinome (2014)

Zulawski, Monika ; Schulze, Gunnar ; Braginets, Rostyslav ; Hartmann, Stefanie ; Schulze, Waltraud X

Background: Protein kinases constitute a particularly large protein family in Arabidopsis with important functions in cellular signal transduction networks. At the same time Arabidopsis is a model plant with high frequencies of gene duplications. Here, we have conducted a systematic analysis of the Arabidopsis kinase complement, the kinome, with particular focus on gene duplication events. We matched Arabidopsis proteins to a Hidden-Markov Model of eukaryotic kinases and computed a phylogeny of 942 Arabidopsis protein kinase domains and mapped their origin by gene duplication. Results: The phylogeny showed two major clades of receptor kinases and soluble kinases, each of which was divided into functional subclades. Based on this phylogeny, association of yet uncharacterized kinases to families was possible which extended functional annotation of unknowns. Classification of gene duplications within these protein kinases revealed that representatives of cytosolic subfamilies showed a tendency to maintain segmentally duplicated genes, while some subfamilies of the receptor kinases were enriched for tandem duplicates. Although functional diversification is observed throughout most subfamilies, some instances of functional conservation among genes transposed from the same ancestor were observed. In general, a significant enrichment of essential genes was found among genes encoding for protein kinases. Conclusions: The inferred phylogeny allowed classification and annotation of yet uncharacterized kinases. The prediction and analysis of syntenic blocks and duplication events within gene families of interest can be used to link functional biology to insights from an evolutionary viewpoint. The approach undertaken here can be applied to any gene family in any organism with an annotated genome.

Comparative analysis of the gonadal transcriptomes of the all-female species Poecilia formosa and its maternal ancestor Poecilia mexicana (2014)

Schedina, Ina Maria ; Hartmann, Stefanie ; Groth, Detlef ; Schlupp, Ingo ; Tiedemann, Ralph

Background: The Amazon molly, Poecilia formosa (Teleostei: Poeciliinae), is a unisexual, all-female species. It evolved through the hybridisation of two closely related sexual species and exhibits clonal reproduction by sperm-dependent parthenogenesis (or gynogenesis), where the sperm of a parental species is only used to activate embryogenesis of the apomictic, diploid eggs but does not contribute genetic material to the offspring. Here we provide and describe the first de novo assembled transcriptome of the Amazon molly in comparison with its maternal ancestor, the Atlantic molly Poecilia mexicana. The transcriptome data were produced through sequencing of single-end libraries (100 bp) with the Illumina sequencing technique. Results: 83,504,382 reads for the Amazon molly and 81,625,840 for the Atlantic molly were assembled into 127,283 and 78,961 contigs, respectively. 63% and 57% of the contigs, respectively, could be annotated with gene ontology terms after sequence similarity comparisons. Furthermore, we were able to identify genes normally involved in reproduction, and especially in meiosis, in the transcriptome dataset of the apomictically reproducing Amazon molly as well. Conclusions: We assembled and annotated the transcriptome of a non-model organism, the Amazon molly, without a reference genome (de novo). The obtained dataset is a fundamental resource for future research in functional and expression analysis. Also, the presence of 30 meiosis-specific genes within a species where no meiosis is known to take place is remarkable and raises new questions for future research.

Isolation and characterization of eight microsatellite loci in the brook lamprey Lampetra planeri (Petromyzontiformes) using 454 sequence data (2014)

Schedina, Ina-Maria ; Pfautsch, Simone ; Hartmann, Stefanie ; Dolgener, N. ; Polgar, Anika ; Bianco, Pier Giorgio ; Tiedemann, Ralph ; Ketmaier, Valerio

Eight polymorphic microsatellite loci were developed for the brook lamprey Lampetra planeri through 454 sequencing and their usefulness was tested in 45 individuals of both L. planeri and the river lamprey Lampetra fluviatilis. The number of alleles per locus ranged between two and five; the Italian and Irish populations had a mean expected heterozygosity of 0.388 and 0.424 and a mean observed heterozygosity of 0.418 and 0.411, respectively. © 2014 The Fisheries Society of the British Isles
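For readers unfamiliar with the summary statistics quoted above, the following is a minimal sketch of how observed and expected heterozygosity are commonly computed for one microsatellite locus. It uses the plain estimator H_e = 1 - Σ p_i², which is an assumption; the abstract does not say whether the authors applied a sample-size correction. The genotypes and function names are hypothetical and are not data from the study.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of diploid individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """1 - sum of squared allele frequencies (no small-sample correction)."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    total = len(alleles)
    return 1.0 - sum((c / total) ** 2 for c in Counter(alleles).values())


def mean_over_loci(per_locus_values):
    """Average a statistic across loci, as reported for a population."""
    return sum(per_locus_values) / len(per_locus_values)


# Hypothetical genotypes at one locus: allele sizes in base pairs,
# one (allele_1, allele_2) pair per individual.
locus = [(100, 100), (100, 104), (104, 104), (100, 104)]

print(observed_heterozygosity(locus))
print(expected_heterozygosity(locus))
```

Population-level figures such as those in the abstract would then be the mean of the per-locus values across all eight loci.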

The Arabidopsis Kinome: phylogeny and evolutionary insights into functional diversification (2014)

Zulawski, Monika ; Schulze, Gunnar ; Braginets, Rostyslav ; Hartmann, Stefanie ; Schulze, Waltraud X.

Background: Protein kinases constitute a particularly large protein family in Arabidopsis with important functions in cellular signal transduction networks. At the same time Arabidopsis is a model plant with high frequencies of gene duplications. Here, we have conducted a systematic analysis of the Arabidopsis kinase complement, the kinome, with particular focus on gene duplication events. We matched Arabidopsis proteins to a Hidden-Markov Model of eukaryotic kinases and computed a phylogeny of 942 Arabidopsis protein kinase domains and mapped their origin by gene duplication. Results: The phylogeny showed two major clades of receptor kinases and soluble kinases, each of which was divided into functional subclades. Based on this phylogeny, association of yet uncharacterized kinases to families was possible which extended functional annotation of unknowns. Classification of gene duplications within these protein kinases revealed that representatives of cytosolic subfamilies showed a tendency to maintain segmentally duplicated genes, while some subfamilies of the receptor kinases were enriched for tandem duplicates. Although functional diversification is observed throughout most subfamilies, some instances of functional conservation among genes transposed from the same ancestor were observed. In general, a significant enrichment of essential genes was found among genes encoding for protein kinases. Conclusions: The inferred phylogeny allowed classification and annotation of yet uncharacterized kinases. The prediction and analysis of syntenic blocks and duplication events within gene families of interest can be used to link functional biology to insights from an evolutionary viewpoint. The approach undertaken here can be applied to any gene family in any organism with an annotated genome.

Endogenous murine leukemia retroviral variation across wild European and inbred strains of house mouse (2015)

Hartmann, Stefanie ; Hasenkamp, Natascha ; Mayer, Jens ; Michaux, Johan ; Morand, Serge ; Mazzoni, Camila J. ; Roca, Alfred L. ; Greenwood, Alex D.

Background: Endogenous murine leukemia retroviruses (MLVs) are high copy number proviral elements difficult to comprehensively characterize using standard low throughput sequencing approaches. However, high throughput approaches generate data that is challenging to process, interpret and present. Results: Next generation sequencing (NGS) data was generated for MLVs from two wild caught Mus musculus domesticus (from mainland France and Corsica) and for inbred laboratory mouse strains C3H, LP/J and SJL. Sequence reads were grouped using a novel sequence clustering approach as applied to retroviral sequences. A Markov cluster algorithm was employed, and the sequence reads were queried for matches to specific xenotropic (Xmv), polytropic (Pmv) and modified polytropic (Mpmv) viral reference sequences. Conclusions: Various MLV subtypes were more widespread than expected among the mice, which may be due to the higher coverage of NGS, or to the presence of similar sequence across many different proviral loci. The results did not correlate with variation in the major MLV receptor Xpr1, which can restrict exogenous MLVs, suggesting that endogenous MLV distribution may reflect gene flow more than past resistance to infection.
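The Markov cluster step mentioned in this abstract can be pictured with the generic sketch below. It is a textbook-style illustration of Markov clustering (alternating expansion and inflation on a column-stochastic matrix), not the authors' pipeline: the abstract does not give the software, the similarity measure, or the parameters, so the inflation value, the tolerance, the function names, and the toy similarity graph are all assumptions.

```python
def _normalize_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a non-negative similarity graph; returns sorted index lists."""
    n = len(adjacency)
    if n == 0:
        return []
    # Self-loops keep every column non-empty and damp oscillation.
    m = _normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = _matmul(m, m)  # expansion: flow spreads along paths
        inflated = _normalize_columns(  # inflation: strong flow is boosted
            [[value ** inflation for value in row] for row in expanded]
        )
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(cluster) for cluster in clusters)


# Hypothetical similarity graph: reads 0-2 resemble each other, as do reads 3-5.
two_groups = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(two_groups))
```

In a real read-clustering setting the adjacency matrix would hold pairwise sequence similarities, and a sparse implementation would be needed for anything beyond toy sizes.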

A genomic comparison of putative pathogenicity-related gene families in five members of the Ophiostomatales with different lifestyles (2017)

Lah, Ljerka ; Löber, Ulrike ; Hsiang, Tom ; Hartmann, Stefanie

Ophiostomatoid fungi are vectored by their bark-beetle associates and colonize different host tree species. To survive and proliferate in the host, they have evolved mechanisms for detoxification and elimination of host defence compounds, efficient nutrient sequestration, and, in pathogenic species, virulence towards plants. Here, we assembled a draft genome of the spruce pathogen Ophiostoma bicolor. For our comparative and phylogenetic analyses, we mined the genomes of closely related species (Ophiostoma piceae, Ophiostoma ulmi, Ophiostoma novo-ulmi, and Grosmannia clavigera). Our aim was to acquire a genomic and evolutionary perspective of gene families important in host colonization. Genome comparisons showed that both the nuclear and mitochondrial genomes in our assembly were largely complete. Our O. bicolor 25.3 Mbp draft genome had 10 018 predicted genes, 6041 proteins with gene ontology (GO) annotation, 269 carbohydrate-active enzymes (CAZymes), 559 peptidases and inhibitors, and 1373 genes likely involved in pathogen-host interactions. Phylogenetic analyses of selected protein families revealed core sets of cytochrome P450 genes, ABC transporters and backbone genes involved in secondary metabolite (SM) biosynthesis (polyketide synthases (PKS) and non-ribosomal synthases), and species-specific gene losses and duplications. Phylogenetic analyses of protein families of interest provided insight into evolutionary adaptations to host biochemistry in ophiostomatoid fungi.

43 search hits