Bioinformatics studies of biological systems across multiple levels of molecular organization
(2012)
Primary progressive multiple sclerosis (PPMS) shows a highly variable disease progression with poor prognosis and a characteristic accumulation of disabilities in patients. These hallmarks make PPMS difficult to diagnose and currently impossible to treat efficiently. This study aimed to identify plasma metabolite profiles that allow diagnosis of PPMS and its differentiation from the relapsing-remitting subtype (RRMS), primary neurodegenerative disease (Parkinson’s disease, PD), and healthy controls (HCs), and that change significantly during the disease course and could therefore serve as surrogate markers of multiple sclerosis (MS)-associated neurodegeneration over time. We applied untargeted high-resolution metabolomics to plasma samples to identify PPMS-specific signatures, validated our findings in independent sex- and age-matched PPMS and HC cohorts, and built discriminatory models by partial least squares discriminant analysis (PLS-DA). This signature was compared to sex- and age-matched RRMS patients, to patients with PD, and to HCs. Finally, we investigated these metabolites in a longitudinal cohort of PPMS patients over a 24-month period. PLS-DA yielded predictive models for classification along with a set of 20 PPMS-specific informative metabolite markers. These metabolites suggest disease-specific alterations in glycerophospholipid and linoleic acid pathways. Notably, the glycerophospholipid LysoPC(20:0) significantly decreased during the observation period. These metabolites show potential for diagnosis and disease course monitoring, and might serve as biomarkers to assess treatment efficacy in future clinical trials for neuroprotective MS therapies.
Parkinson's disease (PD) shows high heterogeneity with regard to the underlying molecular pathogenesis, which involves multiple pathways and mechanisms. Diagnosis is still challenging and rests entirely on clinical features. Thus, there is an urgent need for robust diagnostic biofluid markers. Untargeted metabolomics allows low-molecular-weight compound biomarkers to be established for a wide range of complex diseases by measuring various molecular classes in biofluids such as blood plasma, serum, and cerebrospinal fluid (CSF). Here, we applied untargeted high-resolution mass spectrometry to determine plasma and CSF metabolite profiles. We semiquantitatively determined small-molecule levels (<= 1.5 kDa) in the plasma and CSF from early PD patients (disease duration 0-4 years; n = 80 and 40, respectively) and sex- and age-matched controls (n = 76 and 38, respectively). We performed statistical analyses utilizing partial least squares and random forest analysis with a 70/30 training and testing split approach, leading to the identification of 20 promising plasma and 14 CSF metabolites. These metabolites differentiated the test set with an AUC of 0.8 (plasma) and 0.9 (CSF). Characteristics of the metabolites indicate perturbations in glycerophospholipid, sphingolipid, and amino acid metabolism in PD, which underscores the high power of metabolomic approaches. Further studies will enable the development of a potential metabolite-based biomarker panel specific for PD.
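The evaluation scheme named in this abstract (a 70/30 training and testing split, with performance reported as AUC on the test set) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `train_test_split` and `auc`, the fixed seed, and the toy scores are assumptions made for illustration, and the classifier itself (partial least squares or random forest) is left out.

```python
import random


def train_test_split(items, test_fraction=0.3, seed=0):
    """Shuffle items reproducibly and return (train, test) lists.

    test_fraction=0.3 mirrors the 70/30 split named in the abstract;
    the seed is an arbitrary assumption.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    (label 1) scores higher than a randomly chosen negative sample
    (label 0); ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up classifier scores (not data from the study).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)

train_ids, test_ids = train_test_split(range(10))
```

Read this way, an AUC of 0.8 means that a randomly chosen patient sample receives a higher score than a randomly chosen control sample in about 80% of such pairs.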
Climate models predict an increased likelihood of seasonal droughts for many areas of the world. Breeding for drought tolerance could be accelerated by marker-assisted selection. As a basis for marker identification, we studied the genetic variance, predictability of field performance and potential costs of tolerance in potato (Solanum tuberosum L.). Potato produces high calories per unit of water invested, but is drought-sensitive. In 14 independent pot or field trials, 34 potato cultivars were grown under optimal and reduced water supply to determine starch yield. In an artificial dataset, we tested several stress indices for their power to distinguish tolerant and sensitive genotypes independent of their yield potential. We identified the deviation of relative starch yield from the experimental median (DRYM) as the most efficient index. DRYM corresponded qualitatively to the partial least square model-based metric of drought stress tolerance in a stress effect model. The DRYM identified significant tolerance variation in the European potato cultivar population to allow tolerance breeding and marker identification. Tolerance results from pot trials correlated with those from field trials but predicted field performance worse than field growth parameters. Drought tolerance correlated negatively with yield under optimal conditions in the field. The distribution of yield data versus DRYM indicated that tolerance can be combined with average yield potentials, thus circumventing potential yield penalties in tolerance breeding.
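The abstract defines DRYM only in words, as the deviation of relative starch yield from the experimental median. One plausible reading is sketched below, assuming that relative starch yield is the yield under reduced water supply divided by the yield under optimal supply for the same cultivar, and that the median is taken over all cultivars in one experiment. The function name `drym` and the yield values are hypothetical.

```python
from statistics import median


def drym(stress_yield, control_yield):
    """Return a DRYM-style score per cultivar for a single experiment.

    Assumed definition: relative yield = stress / control per cultivar;
    score = relative yield minus the median relative yield of the
    experiment. Positive scores indicate above-median tolerance.
    """
    relative = {cv: stress_yield[cv] / control_yield[cv] for cv in stress_yield}
    experiment_median = median(relative.values())
    return {cv: rel - experiment_median for cv, rel in relative.items()}


# Hypothetical starch yields (arbitrary units) for three cultivars.
stress = {"A": 60.0, "B": 45.0, "C": 80.0}
control = {"A": 100.0, "B": 90.0, "C": 100.0}
scores = drym(stress, control)
```

Because each cultivar's yield under stress is first normalized by its own yield under optimal supply, an index of this form compares tolerance independently of yield potential, which is the property the abstract tests for.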
Potato (Solanum tuberosum L.) is one of the most important food crops worldwide. Current potato varieties are highly susceptible to drought stress. In view of global climate change, selection of cultivars with improved drought tolerance and high yield potential is of paramount importance. Drought tolerance breeding of potato is currently based on direct selection according to yield and phenotypic traits and requires multiple trials under drought conditions. Marker‐assisted selection (MAS) is cheaper, faster and reduces classification errors caused by noncontrolled environmental effects. We analysed 31 potato cultivars grown under optimal and reduced water supply in six independent field trials. Drought tolerance was determined as tuber starch yield. Leaf samples from young plants were screened for preselected transcript and nontargeted metabolite abundance using qRT‐PCR and GC‐MS profiling, respectively. Transcript marker candidates were selected from a published RNA‐Seq data set. A Random Forest machine learning approach extracted metabolite and transcript markers for drought tolerance prediction with low error rates of 6% and 9%, respectively. Moreover, by combining transcript and metabolite markers, the prediction error was reduced to 4.3%. Feature selection from Random Forest models allowed model minimization, yielding a minimal combination of only 20 metabolite and transcript markers that were successfully tested for their reproducibility in 16 independent agronomic field trials. We demonstrate that a minimum combination of transcript and metabolite markers sampled at early cultivation stages predicts potato yield stability under drought largely independent of seasonal and regional agronomic conditions.
RNA folding is assumed to be a hierarchical process. The secondary structure of an RNA molecule, signified by base-pairing and stacking interactions between the paired bases, is formed first. Subsequently, the RNA molecule adopts an energetically favorable three-dimensional conformation in the structural space determined mainly by the rotational degrees of freedom associated with the backbone of regions of unpaired nucleotides (loops). To what extent the backbone conformation of RNA loops also results from interactions within the local sequence context or rather follows global optimization constraints alone has not been addressed yet. Because the majority of base stacking interactions are exerted locally, a critical influence of local sequence on local structure appears plausible. Thus, local loop structure ought to be predictable, at least in part, from the local sequence context alone. To test this hypothesis, we used Random Forests on a nonredundant data set of unpaired nucleotides extracted from 97 X-ray structures from the Protein Data Bank (PDB) to predict discrete backbone angle conformations given by the discretized eta/theta-pseudo-torsional space. Predictions on balanced sets with four to six conformational classes using local sequence information yielded average accuracies of up to 55%, thus significantly better than expected by chance (17%-25%). Bases close to the central nucleotide appear to be most tightly linked to its conformation. Our results suggest that RNA loop structure does not only depend on long-range base-pairing interactions; instead, it appears that local sequence context exerts a significant influence on the formation of the local loop structure.
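The chance levels quoted in this abstract (17%-25%) follow from the balanced class design: with k equally sized classes, random guessing is correct with probability 1/k. A short sketch of that arithmetic is given below; the function name `chance_accuracy` is an assumption for illustration.

```python
def chance_accuracy(n_classes):
    """Expected accuracy of random guessing on a balanced set."""
    return 1.0 / n_classes


# Four to six conformational classes, as stated in the abstract.
baselines = {k: round(100 * chance_accuracy(k)) for k in (4, 5, 6)}
# baselines == {4: 25, 5: 20, 6: 17}, in percent
```

The reported accuracy of up to 55% should be read against the baseline for the matching number of classes, not against a two-class 50% level.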
Background: Protein phosphorylation is an important post-translational modification influencing many aspects of dynamic cellular behavior. Site-specific phosphorylation of the amino acid residues serine, threonine, and tyrosine can have profound effects on protein structure, activity, stability, and interaction with other biomolecules. Phosphorylation sites can be affected in diverse ways in members of any species; one such way is through single nucleotide polymorphisms (SNPs). The availability of large numbers of experimentally identified phosphorylation sites and of natural variation datasets in Arabidopsis thaliana prompted us to analyze the effect of non-synonymous SNPs (nsSNPs) on phosphorylation sites.
Results: From the analyses of 7,178 experimentally identified phosphorylation sites we found that: (i) Proteins with multiple phosphorylation sites occur more often than expected by chance. (ii) Phosphorylation hotspots show a preference to be located outside conserved domains. (iii) nsSNPs affected experimental phosphorylation sites as much as the corresponding non-phosphorylated amino acid residues. (iv) Losses of experimental phosphorylation sites by nsSNPs were identified in 86 A. thaliana proteins, among which receptor proteins were overrepresented.
These results were confirmed by similar analyses of predicted phosphorylation sites in A. thaliana. In addition, predicted threonine phosphorylation sites showed a significant enrichment of nsSNPs towards asparagines and a significant depletion of the synonymous substitution. Proteins in which predicted phosphorylation sites were affected by nsSNPs (loss and gain), were determined to be mainly receptor proteins, stress response proteins and proteins involved in nucleotide and protein binding. Proteins involved in metabolism, catalytic activity and biosynthesis were less affected.
Conclusions: We analyzed more than 7,100 experimentally identified phosphorylation sites in almost 4,300 protein-coding loci in silico, thus constituting the largest phosphoproteomics dataset for A. thaliana available to date. Our findings suggest a relatively high variability in the presence or absence of phosphorylation sites between different natural accessions in receptor and other proteins involved in signal transduction. Elucidating the effect of phosphorylation sites affected by nsSNPs on adaptive responses represents an exciting research goal for the future.