publish.UP Search

Detection and characterization of 3D-signature phosphorylation site motifs and their contribution towards improved phosphorylation site prediction in proteins (2009)

Durek, Pawel ; Schudoma, Christian ; Weckwerth, Wolfram ; Selbig, Joachim ; Walther, Dirk

Background: Phosphorylation of proteins plays a crucial role in the regulation and activation of metabolic and signaling pathways and constitutes an important target for pharmaceutical intervention. Central to the phosphorylation process is the recognition of specific target sites by protein kinases followed by the covalent attachment of phosphate groups to the amino acids serine, threonine, or tyrosine. The experimental identification as well as computational prediction of phosphorylation sites (P-sites) has proved to be a challenging problem. Computational methods have focused primarily on extracting predictive features from the local, one-dimensional sequence information surrounding phosphorylation sites. Results: We characterized the spatial context of phosphorylation sites and assessed its usability for improved phosphorylation site predictions. We identified 750 non-redundant, experimentally verified sites with three-dimensional (3D) structural information available in the protein data bank (PDB) and grouped them according to their respective kinase family. We studied the spatial distribution of amino acids around phosphorserines, phosphothreonines, and phosphotyrosines to extract signature 3D-profiles. Characteristic spatial distributions of amino acid residue types around phosphorylation sites were indeed discernable, especially when kinase-family-specific target sites were analyzed. To test the added value of using spatial information for the computational prediction of phosphorylation sites, Support Vector Machines were applied using both sequence as well as structural information. When compared to sequence-only based prediction methods, a small but consistent performance improvement was obtained when the prediction was informed by 3D-context information. 
Conclusion: While local one-dimensional amino acid sequence information was observed to harbor most of the discriminatory power, spatial context information was identified as relevant for the recognition of kinases and their cognate target sites and can be used for an improved prediction of phosphorylation sites. A web-based service (Phos3D) implementing the developed structure-based P-site prediction method has been made available at http://phos3d.mpimp-golm.mpg.de.

Modeling biological networks by action languages via answer set programming (2008)

Dworschak, Steve ; Grell, Susanne ; Nikiforova, Victoria J. ; Schaub, Torsten H. ; Selbig, Joachim

We describe an approach to modeling biological networks by action languages via answer set programming. To this end, we propose an action language for modeling biological networks, building on previous work by Baral et al. We introduce its syntax and semantics along with a translation into answer set programming, an efficient Boolean Constraint Programming Paradigm. Finally, we describe one of its applications, namely, the sulfur starvation response-pathway of the model plant Arabidopsis thaliana and sketch the functionality of our system and its usage.

Phenomic prediction of maize hybrids (2016)

Edlich-Muth, Christian ; Muraya, Moses M. ; Altmann, Thomas ; Selbig, Joachim

Phenomic experiments are carried out in large-scale plant phenotyping facilities that acquire a large number of pictures of hundreds of plants simultaneously. With the aid of automated image processing, the data are converted into genotype-feature matrices that cover many consecutive days of development. Here, we explore the possibility of predicting the biomass of the fully grown plant from early developmental stage image-derived features. We performed phenomic experiments on 195 inbred and 382 hybrid maize varieties and followed their progress from 16 days after sowing (DAS) to 48 DAS with 129 image-derived features. By applying sparse regression methods, we show that 73% of the variance in hybrid fresh weight of fully grown plants is explained by about 20 features at the three-leaf stage or earlier. Dry weight prediction explained over 90% of the variance. When phenomic features of parental inbred lines were used as predictors of hybrid biomass, the proportion of variance explained was 42 and 45% for fresh weight and dry weight models consisting of 35 and 36 features, respectively. These models were very robust, showing only a small amount of variation in performance over the time scale of the experiment. We also examined mid-parent heterosis in phenomic features. Feature heterosis displayed a large degree of variance, which resulted in prediction performance that was less robust than models of either parental or hybrid predictors. Our results show that phenomic prediction is a viable alternative to genomic and metabolic prediction of hybrid performance. In particular, the utility of early-stage parental lines is very encouraging. (C) 2016 Elsevier Ireland Ltd. All rights reserved.

Deducing hybrid performance from parental metabolic profiles of young primary roots of maize by using a multivariate diallel approach (2014)

Feher, Kristen ; Lisec, Jan ; Roemisch-Margl, Lilla ; Selbig, Joachim ; Gierl, Alfons ; Piepho, Hans-Peter ; Nikoloski, Zoran ; Willmitzer, Lothar

Threshold extraction in metabolite concentration data (2004)

Flöter, André ; Nicolas, Jacques ; Schaub, Torsten H. ; Selbig, Joachim

Motivation: Continued development of analytical techniques based on gas chromatography and mass spectrometry now facilitates the generation of larger sets of metabolite concentration data. An important step towards the understanding of metabolite dynamics is the recognition of stable states where metabolite concentrations exhibit a simple behaviour. Such states can be characterized through the identification of significant thresholds in the concentrations. However, general techniques for finding discretization thresholds in continuous data prove to be practically insufficient for detecting states due to the weak conditional dependences in concentration data. Results: We introduce a method of recognizing states in the framework of decision tree induction. It is based upon a global analysis of decision forests where stability and quality are evaluated. It leads to the detection of thresholds that are both comprehensible and robust. Applied to metabolite concentration data, this method has led to the discovery of hidden states in the corresponding variables. Some of these reflect known properties of the biological experiments, and others point to putative new states.

Threshold extraction in metabolite concentration data (2003)

Flöter, André ; Nicolas, Jacques ; Schaub, Torsten H. ; Selbig, Joachim

Finding metabolic pathways in decision forests (2004)

Flöter, André ; Selbig, Joachim ; Schaub, Torsten H.

Systematic analysis of stability patterns in plant primary metabolism (2012)

Girbig, Dorothee ; Grimbs, Sergio ; Selbig, Joachim

Metabolic networks are characterized by complex interactions and regulatory mechanisms between many individual components. These interactions determine whether a steady state is stable to perturbations. Structural kinetic modeling (SKM) is a framework to analyze the stability of metabolic steady states that allows the study of the system Jacobian without requiring detailed knowledge about individual rate equations. Stability criteria can be derived by generating a large number of structural kinetic models (SK-models) with randomly sampled parameter sets and evaluating the resulting Jacobian matrices. Until now, SKM experiments applied univariate tests to detect the network components with the largest influence on stability. In this work, we present an extended SKM approach relying on supervised machine learning to detect patterns of enzyme-metabolite interactions that act together in an orchestrated manner to ensure stability. We demonstrate its application on a detailed SK-model of the Calvin-Benson cycle and connected pathways. The identified stability patterns are highly complex reflecting that changes in dynamic properties depend on concerted interactions between several network components. In total, we find more patterns that reliably ensure stability than patterns ensuring instability. This shows that the design of this system is strongly targeted towards maintaining stability. We also investigate the effect of allosteric regulators revealing that the tendency to stability is significantly increased by including experimentally determined regulatory mechanisms that have not yet been integrated into existing kinetic models.

A MATLAB toolbox for structural kinetic modeling (2012)

Girbig, Dorothee ; Selbig, Joachim ; Grimbs, Sergio

Structural kinetic modeling (SKM) enables the analysis of dynamical properties of metabolic networks solely based on topological information and experimental data. Current SKM-based experiments are hampered by the time-intensive process of assigning model parameters and choosing appropriate sampling intervals for Monte Carlo experiments. We introduce a toolbox for the automatic and efficient construction and evaluation of structural kinetic models (SK-models). Quantitative and qualitative analyses of network stability properties are performed in an automated manner. We illustrate the model building and analysis process in detailed example scripts that provide toolbox implementations of previously published literature models.

Modelling biological networks by action languages via answer set programming (2006)

Grell, Susanne ; Schaub, Torsten H. ; Selbig, Joachim

Spatiotemporal dynamics of the Calvin cycle: multistationarity and symmetry breaking instabilities (2011)

Grimbs, Sergio ; Arnold, Anne ; Koseska, Aneta ; Kurths, Jürgen ; Selbig, Joachim ; Nikoloski, Zoran

The possibility of controlling the Calvin cycle has paramount implications for increasing the production of biomass. Multistationarity, as a dynamical feature of systems, is the first obvious candidate whose control could find biotechnological applications. Here we set out to resolve the debate on the multistationarity of the Calvin cycle. Unlike the existing simulation-based studies, our approach is based on a sound mathematical framework, chemical reaction network theory and algebraic geometry, which results in provable results for the investigated model of the Calvin cycle in which we embed a hierarchy of realistic kinetic laws. Our theoretical findings demonstrate that there is a possibility for multistationarity resulting from two sources, homogeneous and inhomogeneous instabilities, which partially settle the debate on multistability of the Calvin cycle. In addition, our tractable analytical treatment of the bifurcation parameters can be employed in the design of validation experiments.

A "Crossomics" study analysing variability of different components in peripheral blood of healthy caucasoid individuals (2012)

Gruden, Kristina ; Hren, Matjaz ; Herman, Ana ; Blejec, Andrej ; Albrecht, Tanja ; Selbig, Joachim ; Bauer, Christian G. ; Schuchardt, Johannes ; Or-Guil, Michal ; Zupancic, Klemen ; Svajger, Urban ; Stabuc, Borut ; Ihan, Alojz ; Kopitar, Andreja Natasa ; Ravnikar, Maja ; Knezevic, Miomir ; Rozman, Primoz ; Jeras, Matjaz

Background: Different immunotherapy approaches for the treatment of cancer and autoimmune diseases are being developed and tested in clinical studies worldwide. Their resulting complex experimental data should be properly evaluated; therefore, reliable normal healthy control baseline values are indispensable. Methodology/Principal Findings: To assess intra- and inter-individual variability of various biomarkers, peripheral blood of 16 age- and gender-equilibrated healthy volunteers was sampled on 3 different days within a period of one month. Complex "crossomics" analyses of plasma metabolite profiles, antibody concentrations and lymphocyte subset counts as well as whole genome expression profiling in CD4+ T and NK cells were performed. Some of the observed age, gender and BMI dependences are in agreement with the existing knowledge, like the negative correlation between sex hormone levels and age or the BMI-related increase in lipids and soluble sugars. Thus we can assume that the distribution of all 39,743 analysed markers represents the normal Caucasoid population well. All lymphocyte subsets, 20% of metabolites and less than 10% of genes were identified as highly variable in our dataset. Conclusions/Significance: Our study shows that the intra-individual variability was at least two-fold lower compared to the inter-individual one at all investigated levels, showing the importance of a personalised medicine approach from yet another perspective.

The expression of Wnt-inhibitor DKK1 (Dickkopf 1) is determined by intercellular crosstalk and hypoxia in human malignant gliomas (2014)

Guo, Ke-Tai ; Fu, Peng ; Juerchott, Kathrin ; Motaln, Helena ; Selbig, Joachim ; Lah, Tamara T. ; Tonn, Jörg-Christian ; Schichor, Christian

Objective: Wnt signalling pathways regulate proliferation, motility and survival in a variety of human cell types. The Dickkopf 1 (DKK1) gene codes for a secreted Wnt inhibitory factor. It functions as a tumour suppressor gene in breast cancer and as a pro-apoptotic factor in glioma cells. In this study, we aimed to demonstrate whether the different expression of DKK1 in human glioma-derived cells is dependent on microenvironmental factors like hypoxia and regulated by the intercellular crosstalk with bone-marrow-derived mesenchymal stem cells (bmMSCs). Methods: Glioma cell line U87-MG, three cell lines from human glioblastoma grade IV (glioma-derived mesenchymal stem cells) and three bmMSCs were selected for the experiment. The expression of DKK1 in cell lines under normoxic/hypoxic environment or co-culture condition was measured using real-time PCR and enzyme-linked immunosorbent assay. The effect of DKK1 on cell migration and proliferation was evaluated by in vitro wound healing assays and sulphorhodamine assays, respectively. Results: Glioma-derived cells U87-MG displayed lower DKK1 expression compared with bmMSCs. Hypoxia led to an overexpression of DKK1 in bmMSCs and U87-MG when compared to normoxic environment, whereas co-culture of U87-MG with bmMSCs induced the expression of DKK1 in both cell lines. Exogenous recombinant DKK1 inhibited cell migration in all cell lines, but did not have a significant effect on cell proliferation of bmMSCs and glioma cell lines. Conclusion: In this study, we showed for the first time that the expression of DKK1 was hypoxia dependent in human malignant glioma cell lines. The induction of DKK1 by intercellular crosstalk or hypoxia stimuli sheds light on the intense adaption of glial tumour cells to environmental alterations.

Isolation and characterization of bone marrow-derived progenitor cells from malignant gliomas (2012)

Guo, Ke-Tai ; Jürchott, Kathrin ; Fu, Peng ; Selbig, Joachim ; Eigenbrod, Sabina ; Tonn, Jörg-Christian ; Schichor, Christian

Background: Malignant gliomas are highly vascularised tumours. Neoangiogenesis is a crucial factor in the malignant behaviour of the tumour and the prognosis of patients. Several mechanisms are suspected to lead to neoangiogenesis, one of which is the recruitment of multipotent progenitor cells towards the tumour. Factors such as vascular endothelial growth factor-A (VEGF-A) were described to recruit bone marrow-derived endothelial progenitor cells (EPCs) to the glioma stroma and vasculature. Little is known about isolating EPCs from normal or malignant tissues. Materials and Methods: In this study, we addressed the topic of characterization of tumour-isolated EPCs and re-defined the clonal relationship between EPCs and hematopoietic stem cells (HSCs) in gliomas. We first checked public gene expression data of glioma for putative marker expression, pointing towards a prevalence of EPCs and HSCs in glioma. Immunohistochemical staining of glioma tissue confirmed the higher expression of these progenitor markers in glioma tissue. EPCs and HSCs were consequently isolated and characterized at the phenotypic and functional levels. We applied a new isolation method, for the first time, to specimens from patients with high-grade glioma including seven grade IV glioblastoma, five grade III astrocytoma, and three grade III oligoastrocytoma. Results: In all samples, we were able to isolate the tumour-derived EPCs, which were positive for characteristic markers: CD31, CD34 and VEGFR2. The EPCs formed capillary networks in vitro and had the ability to take up acetylated low-density lipoprotein. Glioma-derived HSCs were positive for CD34 and CD45, but they were unable to form a capillary network in vitro. These findings on tumour-derived EPCs/HSCs were in concordance with the results derived from peripheral blood of healthy volunteers.
Conclusion: In our study, we established a new method for EPC/HSC isolation from human gliomas, defined the contribution of EPCs and HSCs to the tumour tissue, and highlighted the intense in vivo tumour host interaction.

Improved heterosis prediction by combining information on DNA- and metabolic markers (2009)

Gärtner, Tanja ; Steinfath, Matthias ; Andorf, Sandra ; Lisec, Jan ; Meyer, Rhonda C. ; Altmann, Thomas ; Willmitzer, Lothar ; Selbig, Joachim

Background: Hybrids represent a cornerstone in the success story of breeding programs. The fundamental principle underlying this success is the phenomenon of hybrid vigour, or heterosis. It describes an advantage of the offspring as compared to the two parental lines with respect to parameters such as growth and resistance against abiotic or biotic stress. Dominance, overdominance or epistasis based models are commonly used explanations. Conclusion/Significance: The heterosis level is clearly a function of the combination of the parents used for offspring production. This results in a major challenge for plant breeders, as usually several thousand combinations of parents have to be tested for identifying the best combinations. Thus, any approach to reliably predict heterosis levels based on properties of the parental lines would be highly beneficial for plant breeding. Methodology/Principal Findings: Recently, genetic data have been used to predict heterosis. Here we show that a combination of parental genetic and metabolic markers, identified via feature selection and minimum-description-length based regression methods, significantly improves the prediction of biomass heterosis in resulting offspring. These findings will help furthering our understanding of the molecular basis of heterosis, revealing, for instance, the presence of nonlinear genotype-phenotype relationships. In addition, we describe a possible approach for accelerated selection in plant breeding.

Exploiting gene families for phylogenomic analysis of myzostomid transcriptome data (2012)

Hartmann, Stefanie ; Helm, Conrad ; Nickel, Birgit ; Meyer, Matthias ; Struck, Torsten H. ; Tiedemann, Ralph ; Selbig, Joachim ; Bleidorn, Christoph

Background: In trying to understand the evolutionary relationships of organisms, the current flood of sequence data offers great opportunities, but also reveals new challenges with regard to data quality, the selection of data for subsequent analysis, and the automation of steps that were once done manually for single-gene analyses. Even though genome or transcriptome data are available for representatives of most bilaterian phyla, some enigmatic taxa still have an uncertain position in the animal tree of life. This is especially true for myzostomids, a group of symbiotic (or parasitic) protostomes that are placed either with annelids or with flatworms. Methodology: Based on similarity criteria, Illumina-based transcriptome sequences of one myzostomid were compared to protein sequences of one additional myzostomid and 29 reference metazoa and clustered into gene families. These families were then used to investigate the phylogenetic position of Myzostomida using different approaches: Alignments of 989 sequence families were concatenated, and the resulting superalignment was analyzed under a Maximum Likelihood criterion. We also used all 1,878 gene trees with at least one myzostomid sequence for a supertree approach: the individual gene trees were computed and then reconciled into a species tree using gene tree parsimony. Conclusions: Superalignments require strictly orthologous genes, and both the gene selection and the widely varying amount of data available for different taxa in our dataset may cause anomalous placements and low bootstrap support. In contrast, gene tree parsimony is designed to accommodate multilocus gene families and therefore allows a much more comprehensive data set to be analyzed. Results of this supertree approach showed a well-resolved phylogeny, in which myzostomids were part of the annelid radiation, and major bilaterian taxa were found to be monophyletic.

Introductory Bioinformatics (2009)

Hartmann, Stefanie ; Selbig, Joachim

Analysis of phylogenetic signal in protostomial intron patterns using Mutual Information (2013)

Hill, Natascha ; Leow, Alexander ; Bleidorn, Christoph ; Groth, Detlef ; Tiedemann, Ralph ; Selbig, Joachim ; Hartmann, Stefanie

Many deep evolutionary divergences still remain unresolved, such as those among major taxa of the Lophotrochozoa. As alternative phylogenetic markers, the intron-exon structure of eukaryotic genomes and the patterns of absence and presence of spliceosomal introns appear to be promising. However, given the potential homoplasy of intron presence, the phylogenetic analysis of this data using standard evolutionary approaches has remained a challenge. Here, we used Mutual Information (MI) to estimate the phylogeny of Protostomia using gene structure data, and we compared these results with those obtained with Dollo Parsimony. Using full genome sequences from nine Metazoa, we identified 447 groups of orthologous sequences with 21,732 introns in 4,870 unique intron positions. We determined the shared absence and presence of introns in the corresponding sequence alignments and have made this data available in "IntronBase", a web-accessible and downloadable SQLite database. Our results obtained using Dollo Parsimony are obviously misled through systematic errors that arise from multiple intron loss events, but extensive filtering of data improved the quality of the estimated phylogenies. Mutual Information, in contrast, performs better with larger datasets, but at the same time it requires a complete data set, which is difficult to obtain for orthologs from a large number of taxa. Nevertheless, Mutual Information-based distances proved to be useful in analyzing this kind of data, also because the estimation of MI-based distances is independent of evolutionary models and therefore no pre-definitions of ancestral and derived character states are necessary.

A distinct metabolic signature predicts development of fasting plasma glucose (2012)

Hische, Manuela ; Larhlimi, Abdelhalim ; Schwarz, Franziska ; Fischer-Rosinský, Antje ; Bobbert, Thomas ; Assmann, Anke ; Catchpole, Gareth S. ; Pfeiffer, Andreas F. H. ; Willmitzer, Lothar ; Selbig, Joachim ; Spranger, Joachim

Background: High blood glucose and diabetes are amongst the conditions causing the greatest losses in years of healthy life worldwide. Therefore, numerous studies aim to identify reliable risk markers for development of impaired glucose metabolism and type 2 diabetes. However, the molecular basis of impaired glucose metabolism is so far insufficiently understood. The development of so-called 'omics' approaches in recent years promises to identify molecular markers and to further understand the molecular basis of impaired glucose metabolism and type 2 diabetes. Although univariate statistical approaches are often applied, we demonstrate here that the application of multivariate statistical approaches is highly recommended to fully capture the complexity of data gained using high-throughput methods. Methods: We took blood plasma samples from 172 subjects who participated in the prospective Metabolic Syndrome Berlin Potsdam follow-up study (MESY-BEPO Follow-up). We analysed these samples using Gas Chromatography coupled with Mass Spectrometry (GC-MS), and measured 286 metabolites. Furthermore, fasting glucose levels were measured using standard methods at baseline, and after an average of six years. We performed correlation analysis and built linear regression models as well as Random Forest regression models to identify metabolites that predict the development of fasting glucose in our cohort. Results: We found a metabolic pattern consisting of nine metabolites that predicted fasting glucose development with an accuracy of 0.47 in tenfold cross-validation using Random Forest regression. We also showed that adding established risk markers did not improve the model accuracy. However, external validation is eventually desirable. Although not all metabolites belonging to the final pattern are identified yet, the pattern directs attention to amino acid metabolism, energy metabolism and redox homeostasis.
Conclusions: We demonstrate that metabolites identified using a high-throughput method (GC-MS) perform well in predicting the development of fasting plasma glucose over several years. Notably, not a single metabolite but a complex pattern of metabolites drives the prediction and therefore reflects the complexity of the underlying molecular mechanisms. This result could only be captured by application of multivariate statistical approaches. Therefore, we highly recommend the usage of statistical methods that capture the complexity of the information given by high-throughput methods.

Decision trees as a simple-to-use and reliable tool to identify individuals with impaired glucose metabolism or type 2 diabetes mellitus (2010)

Hische, Manuela ; Luis-Dominguez, Olga ; Pfeiffer, Andreas F. H. ; Schwarz, Peter E. ; Selbig, Joachim ; Spranger, Joachim

Objective: The prevalence of unknown impaired fasting glucose (IFG), impaired glucose tolerance (IGT), or type 2 diabetes mellitus (T2DM) is high. Numerous studies demonstrated that IFG, IGT, or T2DM are associated with increased cardiovascular risk, therefore an improved identification strategy would be desirable. The objective of this study was to create a simple and reliable tool to identify individuals with impaired glucose metabolism (IGM). Design and methods: A cohort of 1737 individuals (1055 controls, 682 with previously unknown IGM) was screened by 75 g oral glucose tolerance test (OGTT). Supervised machine learning was used to automatically generate decision trees to identify individuals with IGM. To evaluate the accuracy of identification, a tenfold cross-validation was performed. Resulting trees were subsequently re-evaluated in a second, independent cohort of 1998 individuals (1253 controls, 745 unknown IGM). Results: A clinical decision tree included age and systolic blood pressure (sensitivity 89.3%, specificity 37.4%, and positive predictive value (PPV) 48.0%), while a tree based on clinical and laboratory data included fasting glucose and systolic blood pressure (sensitivity 89.7%, specificity 54.6%, and PPV 56.2%). The inclusion of additional parameters did not improve test quality. The external validation approach confirmed the presented decision trees. Conclusion: We proposed a simple tool to identify individuals with existing IGM. From a practical perspective, fasting blood glucose and blood pressure measurements should be regularly measured in all individuals presenting in outpatient clinics. An OGTT appears to be useful only if the subjects are older than 48 years or show abnormalities in fasting glucose or blood pressure.
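The positive predictive values quoted in the abstract above can be related to the reported sensitivity, specificity and cohort composition with a short calculation. The sketch below is only an illustration: it assumes the reported figures refer to the first cohort (682 individuals with IGM, 1055 controls), and the function name is hypothetical rather than taken from the publication.

```python
def positive_predictive_value(sensitivity, specificity, n_cases, n_controls):
    """PPV = true positives / (true positives + false positives)."""
    true_pos = sensitivity * n_cases              # cases correctly flagged
    false_pos = (1.0 - specificity) * n_controls  # controls wrongly flagged
    return true_pos / (true_pos + false_pos)


# Assumption: values from the abstract applied to the first cohort
# (682 IGM cases, 1055 controls).
clinical = positive_predictive_value(0.893, 0.374, 682, 1055)
clinical_lab = positive_predictive_value(0.897, 0.546, 682, 1055)

print(f"clinical tree PPV:            {clinical:.1%}")
print(f"clinical + laboratory tree PPV: {clinical_lab:.1%}")
```

Under that assumption the results land close to the reported 48.0% and 56.2%; small differences are to be expected because the published sensitivities and specificities are themselves rounded.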

68 search hits