TY - JOUR
A1 - Riaño-Pachón, Diego Mauricio
A1 - Kleessen, Sabrina
A1 - Neigenfind, Jost
A1 - Durek, Pawel
A1 - Weber, Elke
A1 - Engelsberger, Wolfgang R.
A1 - Walther, Dirk
A1 - Selbig, Joachim
A1 - Schulze, Waltraud X.
A1 - Kersten, Birgit
T1 - Proteome-wide survey of phosphorylation patterns affected by nuclear DNA polymorphisms in Arabidopsis thaliana
JF - BMC Genomics
N2 - Background: Protein phosphorylation is an important post-translational modification influencing many aspects of dynamic cellular behavior. Site-specific phosphorylation of amino acid residues serine, threonine, and tyrosine can have profound effects on protein structure, activity, stability, and interaction with other biomolecules. Phosphorylation sites can be affected in diverse ways in members of any species; one such way is through single nucleotide polymorphisms (SNPs). The availability of large numbers of experimentally identified phosphorylation sites and of natural variation datasets in Arabidopsis thaliana prompted us to analyze the effect of non-synonymous SNPs (nsSNPs) on phosphorylation sites. Results: From the analyses of 7,178 experimentally identified phosphorylation sites we found that: (i) Proteins with multiple phosphorylation sites occur more often than expected by chance. (ii) Phosphorylation hotspots show a preference to be located outside conserved domains. (iii) nsSNPs affected experimental phosphorylation sites as much as the corresponding non-phosphorylated amino acid residues. (iv) Losses of experimental phosphorylation sites by nsSNPs were identified in 86 A. thaliana proteins, among which receptor proteins were overrepresented. These results were confirmed by similar analyses of predicted phosphorylation sites in A. thaliana. In addition, predicted threonine phosphorylation sites showed a significant enrichment of nsSNPs towards asparagines and a significant depletion of the synonymous substitution.
Proteins in which predicted phosphorylation sites were affected by nsSNPs (loss and gain) were determined to be mainly receptor proteins, stress response proteins and proteins involved in nucleotide and protein binding. Proteins involved in metabolism, catalytic activity and biosynthesis were less affected. Conclusions: We analyzed more than 7,100 experimentally identified phosphorylation sites in almost 4,300 protein-coding loci in silico, thus constituting the largest phosphoproteomics dataset for A. thaliana available to date. Our findings suggest a relatively high variability in the presence or absence of phosphorylation sites between different natural accessions in receptor and other proteins involved in signal transduction. Elucidating the effect of phosphorylation sites affected by nsSNPs on adaptive responses represents an exciting research goal for the future.
KW - Gene Ontology
KW - Phosphorylation Site
KW - phosphorylated amino acid
KW - slim term
KW - single nucleotide polymorphism mapping
Y1 - 2010
U6 - https://doi.org/10.1186/1471-2164-11-411
SN - 1471-2164
VL - 11
PB - BioMed Central
CY - London
ER -

TY - GEN
A1 - Riaño-Pachón, Diego Mauricio
A1 - Kleessen, Sabrina
A1 - Neigenfind, Jost
A1 - Durek, Pawel
A1 - Weber, Elke
A1 - Engelsberger, Wolfgang R.
A1 - Walther, Dirk
A1 - Selbig, Joachim
A1 - Schulze, Waltraud X.
A1 - Kersten, Birgit
T1 - Proteome-wide survey of phosphorylation patterns affected by nuclear DNA polymorphisms in Arabidopsis thaliana
T2 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2 - Background: Protein phosphorylation is an important post-translational modification influencing many aspects of dynamic cellular behavior. Site-specific phosphorylation of amino acid residues serine, threonine, and tyrosine can have profound effects on protein structure, activity, stability, and interaction with other biomolecules.
Phosphorylation sites can be affected in diverse ways in members of any species; one such way is through single nucleotide polymorphisms (SNPs). The availability of large numbers of experimentally identified phosphorylation sites and of natural variation datasets in Arabidopsis thaliana prompted us to analyze the effect of non-synonymous SNPs (nsSNPs) on phosphorylation sites. Results: From the analyses of 7,178 experimentally identified phosphorylation sites we found that: (i) Proteins with multiple phosphorylation sites occur more often than expected by chance. (ii) Phosphorylation hotspots show a preference to be located outside conserved domains. (iii) nsSNPs affected experimental phosphorylation sites as much as the corresponding non-phosphorylated amino acid residues. (iv) Losses of experimental phosphorylation sites by nsSNPs were identified in 86 A. thaliana proteins, among which receptor proteins were overrepresented. These results were confirmed by similar analyses of predicted phosphorylation sites in A. thaliana. In addition, predicted threonine phosphorylation sites showed a significant enrichment of nsSNPs towards asparagines and a significant depletion of the synonymous substitution. Proteins in which predicted phosphorylation sites were affected by nsSNPs (loss and gain) were determined to be mainly receptor proteins, stress response proteins and proteins involved in nucleotide and protein binding. Proteins involved in metabolism, catalytic activity and biosynthesis were less affected. Conclusions: We analyzed more than 7,100 experimentally identified phosphorylation sites in almost 4,300 protein-coding loci in silico, thus constituting the largest phosphoproteomics dataset for A. thaliana available to date. Our findings suggest a relatively high variability in the presence or absence of phosphorylation sites between different natural accessions in receptor and other proteins involved in signal transduction.
Elucidating the effect of phosphorylation sites affected by nsSNPs on adaptive responses represents an exciting research goal for the future.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1328
KW - Gene Ontology
KW - Phosphorylation Site
KW - phosphorylated amino acid
KW - slim term
KW - single nucleotide polymorphism mapping
Y1 - 2010
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-431181
SN - 1866-8372
IS - 1328
ER -

TY - GEN
A1 - Durek, Pawel
A1 - Schudoma, Christian
A1 - Weckwerth, Wolfram
A1 - Selbig, Joachim
A1 - Walther, Dirk
T1 - Detection and characterization of 3D-signature phosphorylation site motifs and their contribution towards improved phosphorylation site prediction in proteins
N2 - Background: Phosphorylation of proteins plays a crucial role in the regulation and activation of metabolic and signaling pathways and constitutes an important target for pharmaceutical intervention. Central to the phosphorylation process is the recognition of specific target sites by protein kinases followed by the covalent attachment of phosphate groups to the amino acids serine, threonine, or tyrosine. The experimental identification as well as computational prediction of phosphorylation sites (P-sites) has proved to be a challenging problem. Computational methods have focused primarily on extracting predictive features from the local, one-dimensional sequence information surrounding phosphorylation sites. Results: We characterized the spatial context of phosphorylation sites and assessed its usability for improved phosphorylation site predictions. We identified 750 non-redundant, experimentally verified sites with three-dimensional (3D) structural information available in the protein data bank (PDB) and grouped them according to their respective kinase family.
We studied the spatial distribution of amino acids around phosphoserines, phosphothreonines, and phosphotyrosines to extract signature 3D-profiles. Characteristic spatial distributions of amino acid residue types around phosphorylation sites were indeed discernible, especially when kinase-family-specific target sites were analyzed. To test the added value of using spatial information for the computational prediction of phosphorylation sites, Support Vector Machines were applied using both sequence and structural information. When compared to sequence-only based prediction methods, a small but consistent performance improvement was obtained when the prediction was informed by 3D-context information. Conclusion: While local one-dimensional amino acid sequence information was observed to harbor most of the discriminatory power, spatial context information was identified as relevant for the recognition of kinases and their cognate target sites and can be used for an improved prediction of phosphorylation sites. A web-based service (Phos3D) implementing the developed structure-based P-site prediction method has been made available at http://phos3d.mpimp-golm.mpg.de.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - paper 141
KW - Support vector machines
KW - Microarray data
KW - Docking interactions
KW - Signal-transduction
KW - Sequence alignment
Y1 - 2009
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-45129
ER -

TY - THES
A1 - Durek, Pawel
T1 - Comparative analysis of molecular interaction networks : the interplay between spatial and functional organizing principles
T1 - Vergleichende Analyse molekularer Interaktionsnetzwerke : der Zusammenhang von räumlichen und funktionellen Organisationsprinzipien
N2 - The study of biological interaction networks is a central theme in systems biology.
Here, we investigate common as well as differentiating principles of molecular interaction networks associated with different levels of molecular organization. They include metabolic pathway maps, protein-protein interaction networks as well as kinase interaction networks. First, we present an integrated analysis of metabolic pathway maps and protein-protein interaction networks (PIN). It has long been established that successive enzymatic steps are often catalyzed by physically interacting proteins forming permanent or transient multi-enzyme complexes. Inspection of high-throughput PIN data has recently shown that enzymes involved in successive reactions are indeed generally more likely to interact than other protein pairs. In this study, we expanded this line of research to compare the respective underlying network topologies and to investigate whether the spatial organization of enzyme interactions correlates with metabolic efficiency. Analyzing yeast data, we detected long-range correlations between shortest paths between proteins in both network types, suggesting a mutual correspondence of the two network architectures. We discovered that the organizing principles of physical interactions between metabolic enzymes differ from the general PIN of all proteins. While physical interactions between proteins are generally disassortative, enzyme interactions were observed to be assortative. Thus, enzymes frequently interact with other enzymes of similar rather than different degree. Enzymes carrying high flux loads are more likely to physically interact than enzymes with lower metabolic throughput. In particular, enzymes associated with catabolic pathways as well as enzymes involved in the biosynthesis of complex molecules were found to exhibit high degrees of physical clustering.
Single proteins were identified that connect major components of the cellular metabolism and hence might be essential for the structural integrity of several biosynthetic systems. Besides metabolic aspects of PINs, we investigated the characteristic topological properties of protein interactions involved in signaling and regulatory functions mediated by kinase interactions. Characteristic topological differences between PINs associated with metabolism and those describing phosphorylation networks were revealed and shown to reflect the different modes of biological operation of both network types. The construction of phosphorylation networks is based on the identification of specific kinase-target relations including the determination of the actual phosphorylation sites (P-sites). The computational prediction of P-sites as well as the identification of involved kinases still suffer from the insufficient accuracy and specificity of the underlying prediction algorithms, and the experimental identification on a genome scale is not (yet) feasible. Computational prediction methods have focused primarily on extracting predictive features from the local, one-dimensional sequence information surrounding P-sites. However, the recognition of such motifs by the respective kinases is a spatial event. Therefore, we characterized the spatial distributions of amino acid residue types around P-sites and extracted signature 3D-profiles. We then tested the added value of spatial information on the prediction performance. When compared to sequence-only based predictors, a consistent performance gain was obtained. The availability of reliable training data of experimentally determined P-sites is critical for the development of computational prediction methods. As part of this thesis, we provide an assessment of false-positive rates of phosphoproteomic data.
N2 - A central topic of systems biology is the study of biological interaction networks. This thesis investigates common as well as differentiating principles of molecular interaction networks that are characterized by different levels of molecular organization. The networks examined include those built on metabolic interactions, physical interactions between proteins, and kinase interactions. First, an integrative analysis of metabolic pathways and protein interaction networks (PIN) is presented. It has long been assumed that successive enzymatic steps are often catalyzed by permanent or transient multi-enzyme complexes that rest on physical interactions between the proteins involved. This assumption has been confirmed by evaluating results from high-throughput experiments. Accordingly, enzymes catalyzing successive steps interact physically more often than random enzyme pairs. The present work takes this analysis further by comparing the topologies of the underlying networks based on metabolic and physical interactions and by examining the relationship between the spatial organization of enzymes and metabolic efficiency. Based on interaction data from yeast, the analysis of interaction paths built on metabolic and physical interactions revealed an extensive correlation of distances, suggesting a mutual correspondence of the two architectures. However, physical interactions between metabolic enzymes follow different organizational rules than protein interactions in the general PIN, which contains all protein interactions. Whereas protein interactions in the general PIN are disassortative, physical enzyme interactions are assortative; that is, the numbers of interactions of neighboring proteins are negatively correlated in the general network and positively correlated in the metabolic network. Furthermore, enzymes with higher metabolic throughput appear to be involved in interactions more frequently. Enzymes of the central catabolic processes and of the biosynthesis of complex membrane lipids show a particularly high degree of connectivity and dense clustering. Individual proteins were identified that connect the major components of cellular metabolism and could thus be essential for the integrity of various biosynthetic systems. Besides the metabolic aspect of PINs, the aspect of regulation and signal transduction, namely kinase interactions, was analyzed in more detail. Characteristic topological differences were found between the PINs associated with metabolism and those associated with phosphorylation, reflecting the different tasks of the two networks. The reconstruction of phosphorylation networks relies essentially on the prediction of kinase-target relations and can therefore still suffer from the insufficient accuracy of the algorithms used to predict phosphorylation sites (P-sites) and their associated kinases. The experimental, genome-wide determination of P-sites is likewise not (yet) feasible. Previous computational prediction methods have usually relied on evaluating characteristic features of the local protein sequence surrounding the P-site. The present work extends this approach by using spatial 3D information. The distribution of amino acids around the P-site is computed, and specific 3D signatures are extracted for prediction. In comparison with sequence-based prediction methods, including spatial information yielded a consistent improvement in prediction. The thesis also addresses the error rate of experimental phosphoprotein data and assesses their reliability. The availability of a reliable dataset is a decisive criterion in the development of a prediction method.
KW - Proteinphosphorylation
KW - Systembiologie
KW - Netzwerke
KW - metabolisch
KW - phosphorylation
KW - systems biology
KW - networks
KW - metabolic
Y1 - 2008
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-31439
ER -