TY - GEN A1 - Riaño-Pachón, Diego Mauricio A1 - Kleessen, Sabrina A1 - Neigenfind, Jost A1 - Durek, Pawel A1 - Weber, Elke A1 - Engelsberger, Wolfgang R. A1 - Walther, Dirk A1 - Selbig, Joachim A1 - Schulze, Waltraud X. A1 - Kersten, Birgit T1 - Proteome-wide survey of phosphorylation patterns affected by nuclear DNA polymorphisms in Arabidopsis thaliana T2 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - Background: Protein phosphorylation is an important post-translational modification influencing many aspects of dynamic cellular behavior. Site-specific phosphorylation of amino acid residues serine, threonine, and tyrosine can have profound effects on protein structure, activity, stability, and interaction with other biomolecules. Phosphorylation sites can be affected in diverse ways in members of any species; one such way is through single nucleotide polymorphisms (SNPs). The availability of large numbers of experimentally identified phosphorylation sites, and of natural variation datasets in Arabidopsis thaliana, prompted us to analyze the effect of non-synonymous SNPs (nsSNPs) on phosphorylation sites. Results: From the analyses of 7,178 experimentally identified phosphorylation sites we found that: (i) Proteins with multiple phosphorylation sites occur more often than expected by chance. (ii) Phosphorylation hotspots show a preference to be located outside conserved domains. (iii) nsSNPs affected experimental phosphorylation sites as much as the corresponding non-phosphorylated amino acid residues. (iv) Losses of experimental phosphorylation sites by nsSNPs were identified in 86 A. thaliana proteins, among which receptor proteins were overrepresented. These results were confirmed by similar analyses of predicted phosphorylation sites in A. thaliana.
In addition, predicted threonine phosphorylation sites showed a significant enrichment of nsSNPs towards asparagines and a significant depletion of the synonymous substitution. Proteins in which predicted phosphorylation sites were affected by nsSNPs (loss and gain), were determined to be mainly receptor proteins, stress response proteins and proteins involved in nucleotide and protein binding. Proteins involved in metabolism, catalytic activity and biosynthesis were less affected. Conclusions: We analyzed more than 7,100 experimentally identified phosphorylation sites in almost 4,300 protein-coding loci in silico, thus constituting the largest phosphoproteomics dataset for A. thaliana available to date. Our findings suggest a relatively high variability in the presence or absence of phosphorylation sites between different natural accessions in receptor and other proteins involved in signal transduction. Elucidating the effect of phosphorylation sites affected by nsSNPs on adaptive responses represents an exciting research goal for the future. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1328 KW - Gene Ontology KW - Phosphorylation Site KW - phosphorylated amino acid KW - slim term KW - single nucleotide polymorphism mapping Y1 - 2010 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-431181 SN - 1866-8372 IS - 1328 ER - TY - GEN A1 - Knox-Brown, Patrick A1 - Rindfleisch, Tobias A1 - Günther, Anne A1 - Balow, Kim A1 - Bremer, Anne A1 - Walther, Dirk A1 - Miettinen, Markus S. A1 - Hincha, Dirk K. A1 - Thalhammer, Anja T1 - Similar Yet Different BT - Structural and Functional Diversity among Arabidopsis thaliana LEA_4 Proteins T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - The importance of intrinsically disordered late embryogenesis abundant (LEA) proteins in the tolerance to abiotic stresses involving cellular dehydration is undisputed. 
While structural transitions of LEA proteins in response to changes in water availability are commonly observed and several molecular functions have been suggested, a systematic, comprehensive and comparative study of possible underlying sequence-structure-function relationships is still lacking. We performed molecular dynamics (MD) simulations as well as spectroscopic and light scattering experiments to characterize six members of two distinct, lowly homologous clades of LEA_4 family proteins from Arabidopsis thaliana. We compared structural and functional characteristics to elucidate to what degree structure and function are encoded in LEA protein sequences and complemented these findings with physicochemical properties identified in a systematic bioinformatics study of the entire Arabidopsis thaliana LEA_4 family. Our results demonstrate that although the six experimentally characterized LEA_4 proteins have similar structural and functional characteristics, differences concerning their folding propensity and membrane stabilization capacity during a freeze/thaw cycle are obvious. These differences cannot be easily attributed to sequence conservation, simple physicochemical characteristics or the abundance of sequence motifs. Moreover, the folding propensity does not appear to be correlated with membrane stabilization capacity. Therefore, the refinement of LEA_4 structural and functional properties is likely encoded in specific patterns of their physicochemical characteristics. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 901 KW - IDP KW - LEA protein KW - abiotic stress KW - dehydration KW - conformational rearrangement KW - membrane stabilization KW - sequence-structure-function relationship Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-469419 SN - 1866-8372 IS - 901 ER - TY - GEN A1 - Köhl, Karin I. 
A1 - Basler, Georg A1 - Lüdemann, Alexander A1 - Selbig, Joachim A1 - Walther, Dirk T1 - A plant resource and experiment management system based on the Golm Plant Database as a basic tool for omics research T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - Background: For omics experiments, detailed characterisation of experimental material with respect to its genetic features, its cultivation history and its treatment history is a requirement for analyses by bioinformatics tools and for publication needs. Furthermore, meta-analysis of several experiments in systems-biology-based approaches makes it necessary to store this information in a standardised manner, preferably in relational databases. In the Golm Plant Database System, we devised a data management system based on a classical Laboratory Information Management System combined with web-based user interfaces for data entry and retrieval to collect this information in an academic environment. Results: The database system contains modules representing the genetic features of the germplasm, the experimental conditions and the sampling details. In the germplasm module, genetically identical lines of biological material are generated by defined workflows, starting with the import workflow, followed by further workflows like genetic modification (transformation), vegetative or sexual reproduction. The latter workflows link lines and thus create pedigrees. For experiments, plant objects are generated from plant lines and united in so-called cultures, to which the cultivation conditions are linked. Materials and methods for each cultivation step are stored in a separate ACCESS database of the plant cultivation unit. For all cultures and thus every plant object, each cultivation site and the culture's arrival time at a site are logged by a barcode-scanner-based system. Thus, for each plant object, all site-related parameters, e.g.
automatically logged climate data, are available. These life history data and genetic information for the plant objects are linked to analytical results by the sampling module, which links sample components to plant object identifiers. This workflow uses controlled vocabulary for organs and treatments. Unique names generated by the system and barcode labels facilitate identification and management of the material. Web pages are provided as user interfaces to facilitate maintaining the system in an environment with many desktop computers and a rapidly changing user community. Web based search tools are the basis for joint use of the material by all researchers of the institute. Conclusion: The Golm Plant Database system, which is based on a relational database, collects the genetic and environmental information on plant material during its production or experimental use at the Max-Planck-Institute of Molecular Plant Physiology. It thus provides information according to the MIAME standard for the component 'Sample' in a highly standardised format. The Plant Database system thus facilitates collaborative work and allows efficient queries in data analysis for systems biology research. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 830 KW - microarray data KW - arabidopsis KW - information Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-427595 IS - 830 ER - TY - GEN A1 - Stoessel, Daniel A1 - Stellmann, Jan-Patrick A1 - Willing, Anne A1 - Behrens, Birte A1 - Rosenkranz, Sina C. A1 - Hodecker, Sibylle C. A1 - Stürner, Klarissa H. A1 - Reinhardt, Stefanie A1 - Fleischer, Sabine A1 - Deuschle, Christian A1 - Maetzler, Walter A1 - Berg, Daniela A1 - Heesen, Christoph A1 - Walther, Dirk A1 - Schauer, Nicolas A1 - Friese, Manuel A. 
A1 - Pless, Ole T1 - Metabolomic profiles for primary progressive multiple sclerosis stratification and disease course monitoring T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - Primary progressive multiple sclerosis (PPMS) shows a highly variable disease progression with poor prognosis and a characteristic accumulation of disabilities in patients. These hallmarks of PPMS make it difficult to diagnose and currently impossible to efficiently treat. This study aimed to identify plasma metabolite profiles that allow diagnosis of PPMS and its differentiation from the relapsing remitting subtype (RRMS), primary neurodegenerative disease (Parkinson’s disease, PD), and healthy controls (HCs) and that significantly change during the disease course and could serve as surrogate markers of multiple sclerosis (MS)-associated neurodegeneration over time. We applied untargeted high-resolution metabolomics to plasma samples to identify PPMS-specific signatures, validated our findings in independent sex- and age-matched PPMS and HC cohorts and built discriminatory models by partial least squares discriminant analysis (PLS-DA). This signature was compared with those of sex- and age-matched RRMS patients, patients with PD, and HCs. Finally, we investigated these metabolites in a longitudinal cohort of PPMS patients over a 24-month period. PLS-DA yielded predictive models for classification along with a set of 20 PPMS-specific informative metabolite markers. These metabolites suggest disease-specific alterations in glycerophospholipid and linoleic acid pathways. Notably, the glycerophospholipid LysoPC(20:0) significantly decreased during the observation period. These findings show potential for diagnosis and disease course monitoring, and might serve as biomarkers to assess treatment efficacy in future clinical trials for neuroprotective MS therapies.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 694 KW - untargeted metabolomics KW - biomarker KW - PPMS KW - MS neurodegeneration KW - LysoPC(20:0) Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-426307 SN - 1866-8372 IS - 694 ER - TY - GEN A1 - Sprenger, Heike A1 - Erban, Alexander A1 - Seddig, Sylvia A1 - Rudack, Katharina A1 - Thalhammer, Anja A1 - Le, Mai Q. A1 - Walther, Dirk A1 - Zuther, Ellen A1 - Köhl, Karin I. A1 - Kopka, Joachim A1 - Hincha, Dirk K. T1 - Metabolite and transcript markers for the prediction of potato drought tolerance T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - Potato (Solanum tuberosum L.) is one of the most important food crops worldwide. Current potato varieties are highly susceptible to drought stress. In view of global climate change, selection of cultivars with improved drought tolerance and high yield potential is of paramount importance. Drought tolerance breeding of potato is currently based on direct selection according to yield and phenotypic traits and requires multiple trials under drought conditions. Marker‐assisted selection (MAS) is cheaper, faster and reduces classification errors caused by noncontrolled environmental effects. We analysed 31 potato cultivars grown under optimal and reduced water supply in six independent field trials. Drought tolerance was determined as tuber starch yield. Leaf samples from young plants were screened for preselected transcript and nontargeted metabolite abundance using qRT‐PCR and GC‐MS profiling, respectively. Transcript marker candidates were selected from a published RNA‐Seq data set. A Random Forest machine learning approach extracted metabolite and transcript markers for drought tolerance prediction with low error rates of 6% and 9%, respectively. Moreover, by combining transcript and metabolite markers, the prediction error was reduced to 4.3%. 
Feature selection from Random Forest models allowed model minimization, yielding a minimal combination of only 20 metabolite and transcript markers that were successfully tested for their reproducibility in 16 independent agronomic field trials. We demonstrate that a minimum combination of transcript and metabolite markers sampled at early cultivation stages predicts potato yield stability under drought largely independent of seasonal and regional agronomic conditions. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 673 KW - drought tolerance KW - machine learning KW - metabolite markers KW - potato (Solanum tuberosum) KW - prediction models KW - transcript markers Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-424630 SN - 1866-8372 IS - 673 ER - TY - GEN A1 - Childs, Liam H. A1 - Nikoloski, Zoran A1 - May, Patrick A1 - Walther, Dirk T1 - Identification and classification of ncRNA molecules using graph properties N2 - The study of non-coding RNA genes has received increased attention in recent years fuelled by accumulating evidence that larger portions of genomes than previously acknowledged are transcribed into RNA molecules of mostly unknown function, as well as the discovery of novel non-coding RNA types and functional RNA elements. Here, we demonstrate that specific properties of graphs that represent the predicted RNA secondary structure reflect functional information. We introduce a computational algorithm and an associated web-based tool (GraPPLE) for classifying non-coding RNA molecules as functional and, furthermore, into Rfam families based on their graph properties. Unlike sequence-similarity-based methods and covariance models, GraPPLE is demonstrated to be more robust with regard to increasing sequence divergence, and when combined with existing methods, leads to a significant improvement of prediction accuracy. 
Furthermore, graph properties identified as most informative are shown to provide an understanding as to what particular structural features render RNA molecules functional. Thus, GraPPLE may offer a valuable computational filtering tool to identify potentially interesting RNA molecules among large candidate datasets. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - paper 145 KW - RNA secondary structure KW - Noncoding RNAs KW - Structure prediction KW - Gene-expression KW - Structured RNAs Y1 - 2009 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-45192 ER - TY - GEN A1 - Durek, Pawel A1 - Schudoma, Christian A1 - Weckwerth, Wolfram A1 - Selbig, Joachim A1 - Walther, Dirk T1 - Detection and characterization of 3D-signature phosphorylation site motifs and their contribution towards improved phosphorylation site prediction in proteins N2 - Background: Phosphorylation of proteins plays a crucial role in the regulation and activation of metabolic and signaling pathways and constitutes an important target for pharmaceutical intervention. Central to the phosphorylation process is the recognition of specific target sites by protein kinases followed by the covalent attachment of phosphate groups to the amino acids serine, threonine, or tyrosine. The experimental identification as well as computational prediction of phosphorylation sites (P-sites) has proved to be a challenging problem. Computational methods have focused primarily on extracting predictive features from the local, one-dimensional sequence information surrounding phosphorylation sites. Results: We characterized the spatial context of phosphorylation sites and assessed its usability for improved phosphorylation site predictions. 
We identified 750 non-redundant, experimentally verified sites with three-dimensional (3D) structural information available in the Protein Data Bank (PDB) and grouped them according to their respective kinase family. We studied the spatial distribution of amino acids around phosphoserines, phosphothreonines, and phosphotyrosines to extract signature 3D-profiles. Characteristic spatial distributions of amino acid residue types around phosphorylation sites were indeed discernible, especially when kinase-family-specific target sites were analyzed. To test the added value of using spatial information for the computational prediction of phosphorylation sites, Support Vector Machines were applied using both sequence and structural information. When compared to sequence-only based prediction methods, a small but consistent performance improvement was obtained when the prediction was informed by 3D-context information. Conclusion: While local one-dimensional amino acid sequence information was observed to harbor most of the discriminatory power, spatial context information was identified as relevant for the recognition of kinases and their cognate target sites and can be used for an improved prediction of phosphorylation sites. A web-based service (Phos3D) implementing the developed structure-based P-site prediction method has been made available at http://phos3d.mpimp-golm.mpg.de.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - paper 141 KW - Support vector machines KW - Microarray data KW - Docking interactions KW - Signal-transduction KW - Sequence alignment Y1 - 2009 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-45129 ER - TY - GEN A1 - May, Patrick A1 - Christian, Jan-Ole A1 - Kempa, Stefan A1 - Walther, Dirk T1 - ChlamyCyc : an integrative systems biology database and web-portal for Chlamydomonas reinhardtii N2 - Background: The unicellular green alga Chlamydomonas reinhardtii is an important eukaryotic model organism for the study of photosynthesis and plant growth. In the era of modern high-throughput technologies there is an imperative need to integrate large-scale data sets from high-throughput experimental techniques using computational methods and database resources to provide comprehensive information about the molecular and cellular organization of a single organism. Results: In the framework of the German Systems Biology initiative GoFORSYS, a pathway database and web-portal for Chlamydomonas (ChlamyCyc) was established, which currently features about 250 metabolic pathways with associated genes, enzymes, and compound information. ChlamyCyc was assembled using an integrative approach combining the recently published genome sequence, bioinformatics methods, and experimental data from metabolomics and proteomics experiments. We analyzed and integrated a combination of primary and secondary database resources, such as existing genome annotations from JGI, EST collections, orthology information, and MapMan classification. Conclusion: ChlamyCyc provides a curated and integrated systems biology repository that will enable and assist in systematic studies of fundamental cellular processes in Chlamydomonas. The ChlamyCyc database and web-portal is freely available at http://chlamycyc.mpimp-golm.mpg.de.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - paper 127 KW - Biochemical pathway database KW - Gene-expression data KW - Quantitative proteomics KW - Metabolic pathways KW - Genome annotation Y1 - 2009 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-44947 ER -