TY  - JOUR
A1  - Steuer, Ralf
A1  - Humburg, Peter
A1  - Selbig, Joachim
T1  - Validation and functional annotation of expression-based clusters based on gene ontology
JF  - BMC bioinformatics
N2  - Background: The biological interpretation of large-scale gene expression data is one of the paramount challenges in current bioinformatics. In particular, placing the results in the context of other available functional genomics data, such as existing bio-ontologies, has already provided substantial improvement for detecting and categorizing genes of interest. One common approach is to look for functional annotations that are significantly enriched within a group or cluster of genes, as compared to a reference group. Results: In this work, we suggest the information-theoretic concept of mutual information to investigate the relationship between groups of genes, as given by data-driven clustering, and their respective functional categories. Drawing upon related approaches (Gibbons and Roth, Genome Research 12: 1574-1581, 2002), we seek to quantify to what extent individual attributes are sufficient to characterize a given group or cluster of genes. Conclusion: We show that the mutual information provides a systematic framework to assess the relationship between groups or clusters of genes and their functional annotations in a quantitative way. Within this framework, the mutual information allows us to address and incorporate several important issues, such as the interdependence of functional annotations and combinatorial combinations of attributes. It thus supplements and extends the conventional search for overrepresented attributes within a group or cluster of genes. In particular taking combinations of attributes into account, the mutual information opens the way to uncover specific functional descriptions of a group of genes or clustering result. All datasets and functional annotations used in this study are publicly available. All scripts used in the analysis are provided as additional files.
Y1  - 2006
U6  - https://doi.org/10.1186/1471-2105-7-380
SN  - 1471-2105
VL  - 7
IS  - 380
PB  - BioMed Central
CY  - London
ER  -

TY  - JOUR
A1  - Scholz, Matthias
A1  - Kaplan, F.
A1  - Guy, C. L.
A1  - Kopka, Joachim
A1  - Selbig, Joachim
T1  - Non-linear PCA : a missing data approach
N2  - Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values. Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured data. Application: We applied this technique to a time course of metabolite data from a cold stress experiment on the model plant Arabidopsis thaliana, and could approximate the mapping function from any time point to the metabolite responses. Thus, the inverse NLPCA provides greatly improved information for better understanding the complex response to cold stress.
Y1  - 2005
SN  - 1367-4803
ER  -

TY  - JOUR
A1  - Lisec, Jan
A1  - Steinfath, Matthias
A1  - Meyer, Rhonda C.
A1  - Selbig, Joachim
A1  - Melchinger, Albrecht E.
A1  - Willmitzer, Lothar
A1  - Altmann, Thomas
T1  - Identification of heterotic metabolite QTL in Arabidopsis thaliana RIL and IL populations
N2  - Two mapping populations of a cross between the Arabidopsis thaliana accessions Col-0 and C24 were cultivated and analyzed with respect to the levels of 181 metabolites to elucidate the biological phenomenon of heterosis at the metabolic level. The relative mid-parent heterosis in the F-1 hybrids was <20% for most metabolic traits. The first mapping population consisting of 369 recombinant inbred lines (RILs) and their test cross progeny with both parents allowed us to determine the position and effect of 147 quantitative trait loci (QTL) for metabolite absolute mid-parent heterosis (aMPH). Furthermore, we identified 153 and 83 QTL for augmented additive (Z(1)) and dominance effects (Z(2)), respectively. We identified putative candidate genes for these QTL using the ARACYC database (http://www.arabidopsis.org/biocyc), and calculated the average degree of dominance, which was within the dominance and over-dominance range for most metabolites. Analyzing a second population of 41 introgression lines (ILs) and their test crosses with the recurrent parent, we identified 634 significant differences in metabolite levels. Nine per cent of these effects were classified as over-dominant, according to the mode of inheritance. A comparison of both approaches suggested epistasis as a major contributor to metabolite heterosis in Arabidopsis. A linear combination of metabolite levels was shown to significantly correlate with biomass heterosis (r = 0.62).
Y1  - 2009
UR  - http://www3.interscience.wiley.com/cgi-bin/issn?DESCRIPTOR=PRINTISSN&VALUE=0960-7412
U6  - https://doi.org/10.1111/j.1365-313X.2009.03910.x
SN  - 0960-7412
ER  -

TY  - BOOK
A1  - Hartmann, Stefanie
A1  - Selbig, Joachim
T1  - Introductory Bioinformatics
Y1  - 2009
SN  - 978-3-8370-5189-6
PB  - Books on Demand
CY  - Norderstedt
ER  -

TY  - JOUR
A1  - Grell, Susanne
A1  - Schaub, Torsten H.
A1  - Selbig, Joachim
T1  - Modelling biological networks by action languages via set programming
Y1  - 2006
UR  - http://www.cs.uni-potsdam.de/wv/pdfformat/gebsch06c.pdf
U6  - https://doi.org/10.1007/11799573
SN  - 0302-9743
ER  -

TY  - JOUR
A1  - Flöter, André
A1  - Selbig, Joachim
A1  - Schaub, Torsten H.
T1  - Finding metabolic pathways in decision forests
Y1  - 2004
SN  - 3-540-23221-4
ER  -

TY  - JOUR
A1  - Flöter, André
A1  - Nicolas, Jacques
A1  - Schaub, Torsten H.
A1  - Selbig, Joachim
T1  - Threshold extraction in metabolite concentration data
N2  - Motivation: Continued development of analytical techniques based on gas chromatography and mass spectrometry now facilitates the generation of larger sets of metabolite concentration data. An important step towards the understanding of metabolite dynamics is the recognition of stable states where metabolite concentrations exhibit a simple behaviour. Such states can be characterized through the identification of significant thresholds in the concentrations. But general techniques for finding discretization thresholds in continuous data prove to be practically insufficient for detecting states due to the weak conditional dependences in concentration data. Results: We introduce a method of recognizing states in the framework of decision tree induction. It is based upon a global analysis of decision forests where stability and quality are evaluated. It leads to the detection of thresholds that are both comprehensible and robust. Applied to metabolite concentration data, this method has led to the discovery of hidden states in the corresponding variables. Some of these reflect known properties of the biological experiments, and others point to putative new states.
Y1  - 2004
ER  -

TY  - JOUR
A1  - Flöter, André
A1  - Nicolas, Jacques
A1  - Schaub, Torsten H.
A1  - Selbig, Joachim
T1  - Threshold extraction in metabolite concentration data
Y1  - 2003
UR  - http://www.cs.uni-potsdam.de/wv/pdfformat/floeterGCB2003.pdf
ER  -

TY  - JOUR
A1  - Cordes, Frank
A1  - Kaiser, Rolf
A1  - Selbig, Joachim
T1  - Bioinformatics approach to predicting HIV drug resistance
N2  - The emergence of drug resistance remains one of the most challenging issues in the treatment of HIV-1 infection. The extreme replication dynamics of HIV facilitates its escape from the selective pressure exerted by the human immune system and by the applied combination drug therapy. This article reviews computational methods whose combined use can support the design of optimal antiretroviral therapies based on viral genotypic and phenotypic data. Genotypic assays are based on the analysis of mutations associated with reduced drug susceptibility, but are difficult to interpret due to the numerous mutations and mutational patterns that confer drug resistance. Phenotypic resistance or susceptibility can be experimentally evaluated by measuring the inhibition of the viral replication in cell culture assays. However, this procedure is expensive and time consuming.
Y1  - 2006
UR  - http://www.expert-reviews.com/loi/erm
U6  - https://doi.org/10.1586/14737159.6.2.207
SN  - 1473-7159
ER  -

TY  - JOUR
A1  - Childs, Dorothee
A1  - Grimbs, Sergio
A1  - Selbig, Joachim
T1  - Refined elasticity sampling for Monte Carlo-based identification of stabilizing network patterns
JF  - Bioinformatics
N2  - Motivation: Structural kinetic modelling (SKM) is a framework to analyse whether a metabolic steady state remains stable under perturbation, without requiring detailed knowledge about individual rate equations. It provides a representation of the system's Jacobian matrix that depends solely on the network structure, steady state measurements, and the elasticities at the steady state. For a measured steady state, stability criteria can be derived by generating a large number of SKMs with randomly sampled elasticities and evaluating the resulting Jacobian matrices. The elasticity space can be analysed statistically in order to detect network positions that contribute significantly to the perturbation response. Here, we extend this approach by examining the kinetic feasibility of the elasticity combinations created during Monte Carlo sampling. Results: Using a set of small example systems, we show that the majority of sampled SKMs would yield negative kinetic parameters if they were translated back into kinetic models. To overcome this problem, a simple criterion is formulated that mitigates such infeasible models. After evaluating the small example pathways, the methodology was used to study two steady states of the neuronal TCA cycle and the intrinsic mechanisms responsible for their stability or instability. The findings of the statistical elasticity analysis confirm that several elasticities are jointly coordinated to control stability and that the main source for potential instabilities are mutations in the enzyme alpha-ketoglutarate dehydrogenase.
Y1  - 2015
U6  - https://doi.org/10.1093/bioinformatics/btv243
SN  - 1367-4803
SN  - 1460-2059
VL  - 31
IS  - 12
SP  - 214
EP  - 220
PB  - Oxford Univ. Press
CY  - Oxford
ER  -

TY  - JOUR
A1  - Catchpole, Gareth
A1  - Platzer, Alexander
A1  - Weikert, Cornelia
A1  - Kempkensteffen, Carsten
A1  - Johannsen, Manfred
A1  - Krause, Hans
A1  - Jung, Klaus
A1  - Miller, Kurt
A1  - Willmitzer, Lothar
A1  - Selbig, Joachim
A1  - Weikert, Steffen
T1  - Metabolic profiling reveals key metabolic features of renal cell carcinoma
JF  - Journal of cellular and molecular medicine : a journal of translational medicine
N2  - Recent evidence suggests that metabolic changes play a pivotal role in the biology of cancer and in particular renal cell carcinoma (RCC). Here, a global metabolite profiling approach was applied to characterize the metabolite pool of RCC and normal renal tissue. Advanced decision tree models were applied to characterize the metabolic signature of RCC and to explore features of metastasized tumours. The findings were validated in a second independent dataset. Vitamin E derivates and metabolites of glucose, fatty acid, and inositol phosphate metabolism determined the metabolic profile of RCC. alpha-tocopherol, hippuric acid, myoinositol, fructose-1-phosphate and glucose-1-phosphate contributed most to the tumour/normal discrimination and all showed pronounced concentration changes in RCC. The identified metabolic profile was characterized by a low recognition error of only 5% for tumour versus normal samples. Data on metastasized tumours suggested a key role for metabolic pathways involving arachidonic acid, free fatty acids, proline, uracil and the tricarboxylic acid cycle. These results illustrate the potential of mass spectroscopy based metabolomics in conjunction with sophisticated data analysis methods to uncover the metabolic phenotype of cancer. Differentially regulated metabolites, such as vitamin E compounds, hippuric acid and myoinositol, provide leads for the characterization of novel pathways in RCC.
KW  - kidney cancer
KW  - metabolism
KW  - metabolomics
KW  - metastasis
Y1  - 2011
U6  - https://doi.org/10.1111/j.1582-4934.2009.00939.x
SN  - 1582-1838
VL  - 15
IS  - 1
SP  - 109
EP  - 118
PB  - Wiley-Blackwell
CY  - Malden
ER  -

TY  - JOUR
A1  - Beerenwinkel, Niko
A1  - Sing, Tobias
A1  - Lengauer, Thomas
A1  - Rahnenfuhrer, Joerg
A1  - Roomp, Kirsten
A1  - Savenkov, Igor
A1  - Fischer, Roman
A1  - Hoffmann, Daniel
A1  - Selbig, Joachim
A1  - Korn, Klaus
A1  - Walter, Hauke
A1  - Berg, Thomas
A1  - Braun, Patrick
A1  - Faetkenheuer, Gerd
A1  - Oette, Mark
A1  - Rockstroh, Juergen
A1  - Kupfer, Bernd
A1  - Kaiser, Rolf
A1  - Daeumer, Martin
T1  - Computational methods for the design of effective therapies against drug resistant HIV strains
N2  - The development of drug resistance is a major obstacle to successful treatment of HIV infection. The extraordinary replication dynamics of HIV facilitates its escape from selective pressure exerted by the human immune system and by combination drug therapy. We have developed several computational methods whose combined use can support the design of optimal antiretroviral therapies based on viral genomic data.
Y1  - 2005
ER  -