publish.UP Search

A "Crossomics" study analysing variability of different components in peripheral blood of healthy caucasoid individuals (2012)

Gruden, Kristina ; Hren, Matjaz ; Herman, Ana ; Blejec, Andrej ; Albrecht, Tanja ; Selbig, Joachim ; Bauer, Christian G. ; Schuchardt, Johannes ; Or-Guil, Michal ; Zupancic, Klemen ; Svajger, Urban ; Stabuc, Borut ; Ihan, Alojz ; Kopitar, Andreja Natasa ; Ravnikar, Maja ; Knezevic, Miomir ; Rozman, Primoz ; Jeras, Matjaz

Background: Different immunotherapy approaches for the treatment of cancer and autoimmune diseases are being developed and tested in clinical studies worldwide. The resulting complex experimental data must be properly evaluated; therefore, reliable baseline values from normal healthy controls are indispensable. Methodology/Principal Findings: To assess intra- and inter-individual variability of various biomarkers, peripheral blood of 16 age- and gender-balanced healthy volunteers was sampled on 3 different days within a period of one month. Complex "crossomics" analyses of plasma metabolite profiles, antibody concentrations and lymphocyte subset counts as well as whole genome expression profiling in CD4+ T and NK cells were performed. Some of the observed age, gender and BMI dependences are in agreement with existing knowledge, such as the negative correlation between sex hormone levels and age or the BMI-related increase in lipids and soluble sugars. Thus we can assume that the distribution of all 39,743 analysed markers represents the normal Caucasoid population well. All lymphocyte subsets, 20% of metabolites and less than 10% of genes were identified as highly variable in our dataset. Conclusions/Significance: Our study shows that the intra-individual variability was at least two-fold lower than the inter-individual one at all investigated levels, showing the importance of a personalised medicine approach from yet another perspective.

A MATLAB toolbox for structural kinetic modeling (2012)

Girbig, Dorothee ; Selbig, Joachim ; Grimbs, Sergio

Structural kinetic modeling (SKM) enables the analysis of dynamical properties of metabolic networks solely based on topological information and experimental data. Current SKM-based experiments are hampered by the time-intensive process of assigning model parameters and choosing appropriate sampling intervals for Monte Carlo experiments. We introduce a toolbox for the automatic and efficient construction and evaluation of structural kinetic models (SK models). Quantitative and qualitative analyses of network stability properties are performed in an automated manner. We illustrate the model building and analysis process in detailed example scripts that provide toolbox implementations of previously published literature models.

A new network model explains the evolution of plant-specific metabolic networks (2009)

Nikoloski, Zoran ; May, Patrick ; Selbig, Joachim

Analysis of phylogenetic signal in protostomial intron patterns using Mutual Information (2013)

Hill, Natascha ; Leow, Alexander ; Bleidorn, Christoph ; Groth, Detlef ; Tiedemann, Ralph ; Selbig, Joachim ; Hartmann, Stefanie

Many deep evolutionary divergences still remain unresolved, such as those among major taxa of the Lophotrochozoa. As alternative phylogenetic markers, the intron-exon structure of eukaryotic genomes and the patterns of absence and presence of spliceosomal introns appear to be promising. However, given the potential homoplasy of intron presence, the phylogenetic analysis of this data using standard evolutionary approaches has remained a challenge. Here, we used Mutual Information (MI) to estimate the phylogeny of Protostomia using gene structure data, and we compared these results with those obtained with Dollo Parsimony. Using full genome sequences from nine Metazoa, we identified 447 groups of orthologous sequences with 21,732 introns in 4,870 unique intron positions. We determined the shared absence and presence of introns in the corresponding sequence alignments and have made this data available in "IntronBase", a web-accessible and downloadable SQLite database. Our results obtained using Dollo Parsimony are obviously misled through systematic errors that arise from multiple intron loss events, but extensive filtering of data improved the quality of the estimated phylogenies. Mutual Information, in contrast, performs better with larger datasets, but at the same time it requires a complete data set, which is difficult to obtain for orthologs from a large number of taxa. Nevertheless, Mutual Information-based distances proved to be useful in analyzing this kind of data, also because the estimation of MI-based distances is independent of evolutionary models and therefore no pre-definitions of ancestral and derived character states are necessary.

Bioinformatics approach to predicting HIV drug resistance (2006)

Cordes, Frank ; Kaiser, Rolf ; Selbig, Joachim

The emergence of drug resistance remains one of the most challenging issues in the treatment of HIV-1 infection. The extreme replication dynamics of HIV facilitates its escape from the selective pressure exerted by the human immune system and by the applied combination drug therapy. This article reviews computational methods whose combined use can support the design of optimal antiretroviral therapies based on viral genotypic and phenotypic data. Genotypic assays are based on the analysis of mutations associated with reduced drug susceptibility, but are difficult to interpret due to the numerous mutations and mutational patterns that confer drug resistance. Phenotypic resistance or susceptibility can be experimentally evaluated by measuring the inhibition of the viral replication in cell culture assays. However, this procedure is expensive and time consuming

Biological cluster evaluation for gene function prediction (2014)

Klie, Sebastian ; Nikoloski, Zoran ; Selbig, Joachim

Recent advances in high-throughput omics techniques render it possible to decode the function of genes by using the "guilt-by-association" principle on biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottleneck issues: (1) the choice of the number of clusters, and (2) external measures that do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address the identified bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is shown to be valid on synthetic data and available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction which relies on the clustering of optimal score and the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set.

Comparison of metabolite profiles in U87 glioma cells and mesenchymal stem cells (2011)

Juerchott, Kathrin ; Guo, Ke-Tai ; Catchpole, Gareth ; Feher, Kristen ; Willmitzer, Lothar ; Schichor, Christian ; Selbig, Joachim

Gas chromatography-mass spectrometry (GC-MS) profiles were generated from U87 glioma cells and human mesenchymal stem cells (hMSC). 37 metabolites representing glycolysis intermediates, TCA cycle metabolites, amino acids and lipids were selected for a detailed analysis. The concentrations of these metabolites were compared and Pearson correlation coefficients were used to calculate the relationship between pairs of metabolites. Metabolite profiles and correlation patterns differ significantly between the two cell lines. These profiles can be considered as a signature of the underlying biochemical system and provide snapshots of the metabolism in mesenchymal stem cells and tumor cells.
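The pairwise comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the toy concentration values are hypothetical, and only the standard Pearson correlation formula is assumed.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def pairwise_correlations(profiles):
    """Correlation for every unordered pair of metabolites.

    `profiles` maps a metabolite name to its measurements across samples.
    Keys of the result are name pairs in alphabetical order.
    """
    names = sorted(profiles)
    return {(a, b): pearson(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}


# Hypothetical toy measurements (four samples per metabolite), not study data.
toy_profiles = {
    "glucose": [1.0, 2.0, 3.0, 4.0],
    "lactate": [2.0, 4.0, 6.0, 8.0],
    "alanine": [4.0, 3.0, 2.0, 1.0],
}
correlations = pairwise_correlations(toy_profiles)
```

Computing such a table separately for each cell line and comparing the two tables is one way to look at the "correlation patterns" the abstract mentions.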

Complexity of automated gene annotation (2011)

Nikoloski, Zoran ; Grimbs, Sergio ; Klie, Sebastian ; Selbig, Joachim

Integration of high-throughput data with functional annotation by graph-theoretic methods has been postulated as a promising way to unravel the function of unannotated genes. Here, we first review the existing graph-theoretic approaches for automated gene function annotation and classify them into two categories with respect to their relation to two instances of transductive learning on networks - with dynamic costs and with constant costs - depending on whether or not the ontological relationship between functional terms is employed. The determined categories allow us to characterize the computational complexity of the existing approaches and establish the relation to classical graph-theoretic problems, such as bisection and multiway cut. In addition, our results point out that the ontological form of the structured functional knowledge does not lower the complexity of the transductive learning with dynamic costs - one of the key problems in modern systems biology. The NP-hardness of automated gene annotation renders the development of heuristic or approximation algorithms a priority for additional research.

Computational methods for the design of effective therapies against drug resistant HIV strains (2005)

Beerenwinkel, Niko ; Sing, Tobias ; Lengauer, Thomas ; Rahnenfuhrer, Joerg ; Roomp, Kirsten ; Savenkov, Igor ; Fischer, Roman ; Hoffmann, Daniel ; Selbig, Joachim ; Korn, Klaus ; Walter, Hauke ; Berg, Thomas ; Braun, Patrick ; Faetkenheuer, Gerd ; Oette, Mark ; Rockstroh, Juergen ; Kupfer, Bernd ; Kaiser, Rolf ; Daeumer, Martin

The development of drug resistance is a major obstacle to successful treatment of HIV infection. The extraordinary replication dynamics of HIV facilitates its escape from selective pressure exerted by the human immune system and by combination drug therapy. We have developed several computational methods whose combined use can support the design of optimal antiretroviral therapies based on viral genomic data

Corn hybrids display lower metabolite variability and complex metabolite inheritance patterns (2011)

Lisec, Jan ; Römisch-Margl, Lilla ; Nikoloski, Zoran ; Piepho, Hans-Peter ; Giavalisco, Patrick ; Selbig, Joachim ; Gierl, Alfons ; Willmitzer, Lothar

We conducted a comparative analysis of the root metabolome of six parental maize inbred lines and their 14 corresponding hybrids showing fresh weight heterosis. We demonstrated that the metabolic profiles not only exhibit distinct features for each hybrid line compared with its parental lines, but also separate reciprocal hybrids. Reconstructed metabolic networks, based on robust correlations between metabolic profiles, display a higher network density in most hybrids as compared with the corresponding inbred lines. With respect to metabolite level inheritance, additive, dominant and overdominant patterns are observed with no specific overrepresentation. Despite the observed complexity of the inheritance pattern, for the majority of metabolites the variance observed in all 14 hybrids is lower compared with inbred lines. Deviations of metabolite levels from the average levels of the hybrids correlate negatively with biomass, which could be applied for developing predictors of hybrid performance based on characteristics of metabolite patterns.

Data integration for identification of important transcription factors of STAT6-mediated cell fate decisions (2016)

Jargosch, M. ; Kroeger, S. ; Gralinska, E. ; Klotz, Ulrike ; Fang, Z. ; Chen, W. ; Leser, U. ; Selbig, Joachim ; Groth, Detlef ; Baumgrass, Ria

Data integration has become a useful strategy for uncovering new insights into complex biological networks. We studied whether this approach can help to delineate the signal transducer and activator of transcription 6 (STAT6)-mediated transcriptional network driving T helper (Th) 2 cell fate decisions. To this end, we performed an integrative analysis of publicly available RNA-seq data of Stat6-knockout mouse studies together with STAT6 ChIP-seq data and our own gene expression time series data during Th2 cell differentiation. We focused on transcription factors (TFs), cytokines, and cytokine receptors and delineated 59 positively and 41 negatively STAT6-regulated genes, which were used to construct a transcriptional network around STAT6. The network illustrates that important and well-known TFs for Th2 cell differentiation are positively regulated by STAT6 and act either as activators for Th2 cells (e.g., Gata3, Atf3, Satb1, Nfil3, Maf, and Pparg) or as suppressors for other Th cell subpopulations such as Th1 (e.g., Ar), Th17 (e.g., Etv6), or iTreg (e.g., Stat3 and Hif1a) cells. Moreover, our approach reveals 11 TFs (e.g., Atf5, Creb3l2, and Asb2) with unknown functions in Th cell differentiation. This fact together with the observed enrichment of asthma risk genes among those regulated by STAT6 underlines the potential value of the data integration strategy used here. Thus, our results clearly support the opinion that data integration is a useful tool to delineate complex physiological processes.

Decision trees as a simple-to-use and reliable tool to identify individuals with impaired glucose metabolism or type 2 diabetes mellitus (2010)

Hische, Manuela ; Luis-Dominguez, Olga ; Pfeiffer, Andreas F. H. ; Schwarz, Peter E. ; Selbig, Joachim ; Spranger, Joachim

Objective: The prevalence of unknown impaired fasting glucose (IFG), impaired glucose tolerance (IGT), or type 2 diabetes mellitus (T2DM) is high. Numerous studies have demonstrated that IFG, IGT, or T2DM are associated with increased cardiovascular risk; therefore, an improved identification strategy would be desirable. The objective of this study was to create a simple and reliable tool to identify individuals with impaired glucose metabolism (IGM). Design and methods: A cohort of 1737 individuals (1055 controls, 682 with previously unknown IGM) was screened by 75 g oral glucose tolerance test (OGTT). Supervised machine learning was used to automatically generate decision trees to identify individuals with IGM. To evaluate the accuracy of identification, a tenfold cross-validation was performed. Resulting trees were subsequently re-evaluated in a second, independent cohort of 1998 individuals (1253 controls, 745 unknown IGM). Results: A clinical decision tree included age and systolic blood pressure (sensitivity 89.3%, specificity 37.4%, and positive predictive value (PPV) 48.0%), while a tree based on clinical and laboratory data included fasting glucose and systolic blood pressure (sensitivity 89.7%, specificity 54.6%, and PPV 56.2%). The inclusion of additional parameters did not improve test quality. The external validation approach confirmed the presented decision trees. Conclusion: We proposed a simple tool to identify individuals with existing IGM. From a practical perspective, fasting blood glucose and blood pressure should be measured regularly in all individuals presenting in outpatient clinics. An OGTT appears to be useful only if the subjects are older than 48 years or show abnormalities in fasting glucose or blood pressure.
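The relation between the sensitivity, specificity and PPV figures quoted in this abstract can be checked with a short Python sketch. It assumes, as an interpretation that the abstract does not state explicitly, that all three figures refer to the first cohort of 1737 individuals (682 with IGM, 1055 controls); the function name `positive_predictive_value` is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, n_cases, n_controls):
    """PPV = TP / (TP + FP), derived from sensitivity, specificity and cohort sizes."""
    true_positives = sensitivity * n_cases             # cases correctly flagged
    false_positives = (1.0 - specificity) * n_controls  # controls wrongly flagged
    return true_positives / (true_positives + false_positives)


# Cohort sizes and test characteristics as quoted in the abstract.
N_CASES, N_CONTROLS = 682, 1055

# Clinical tree (age, systolic blood pressure).
ppv_clinical = positive_predictive_value(0.893, 0.374, N_CASES, N_CONTROLS)

# Clinical plus laboratory tree (fasting glucose, systolic blood pressure).
ppv_combined = positive_predictive_value(0.897, 0.546, N_CASES, N_CONTROLS)
```

Under this assumption the first value comes out at roughly 0.48 and the second at roughly 0.56, close to the reported 48.0% and 56.2%; the small gap for the second tree is what one would expect from rounding of the published sensitivity and specificity.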

Deducing hybrid performance from parental metabolic profiles of young primary roots of maize by using a multivariate diallel approach (2014)

Feher, Kristen ; Lisec, Jan ; Roemisch-Margl, Lilla ; Selbig, Joachim ; Gierl, Alfons ; Piepho, Hans-Peter ; Nikoloski, Zoran ; Willmitzer, Lothar

Discovering plant metabolic biomarkers for phenotype prediction using an untargeted approach (2010)

Steinfath, Matthias ; Strehmel, Nadine ; Peters, Rolf ; Schauer, Nicolas ; Groth, Detlef ; Hummel, Jan ; Steup, Martin ; Selbig, Joachim ; Kopka, Joachim ; Geigenberger, Peter ; Dongen, Joost T. van

Biomarkers are used to predict phenotypical properties before these features become apparent and, therefore, are valuable tools for both fundamental and applied research. Diagnostic biomarkers were discovered in medicine many decades ago and are now commonly applied. While this is routine in the field of medicine, it is surprising that in agriculture this approach has never been investigated. Up to now, the prediction of phenotypes in plants was based on growing plants and assaying the organs of interest in a time-intensive process. For the first time, we demonstrate in this study the application of metabolomics to predict agronomically important phenotypes of a crop plant that was grown in different environments. Our procedure combines established techniques for untargeted, parallel screening of a large number of metabolites with machine learning methods. By using this combination of metabolomics and biomathematical tools, metabolites were identified that can be used as biomarkers to improve the prediction of traits. The predictive metabolites can be selected and used subsequently to develop fast, targeted and low-cost diagnostic biomarker assays that can be implemented in breeding programs or quality assessment analysis. The identified metabolic biomarkers allow for the prediction of crop product quality. Furthermore, marker-assisted selection can benefit from the discovery of metabolic biomarkers when other molecular markers reach their limits. The described marker selection method was developed for potato tubers, but is generally applicable to any crop and trait as it functions independently of genomic information.

Estimating mutual information using B-spline functions : an improved similarity measure for analysing gene expression data (2004)

Daub, Carsten O. ; Steuer, Ralf ; Selbig, Joachim ; Kloska, Sebastian

Background: The information theoretic concept of mutual information provides a general framework to evaluate dependencies between variables. In the context of the clustering of genes with similar patterns of expression it has been suggested as a general quantity of similarity to extend commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. Results: In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: The significance, as a measure of the power of distinction from random correlation, is significantly increased. This concept is subsequently illustrated on two large-scale gene expression datasets and the results are compared to those obtained using other similarity measures. A C++ source code of our algorithm is available for non-commercial use from kloska@scienion.de upon request. Conclusion: The utilisation of mutual information as similarity measure enables the detection of non-linear correlations in gene expression datasets. Frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification, are thereby extended.
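For orientation, the plain binning estimator that this abstract takes as its starting point can be sketched in Python as follows. This is not the B-spline method of the paper and not the authors' C++ code: it is the simple equal-width binning approach, which the abstract describes as prone to numerical errors on small datasets. The function names, the default bin count and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, n_bins):
    """Index of the equal-width bin on [lo, hi] that contains value."""
    if hi == lo:  # constant variable: everything falls into one bin
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value belongs to the last bin


def binned_mutual_information(x, y, n_bins=4):
    """Mutual information (in bits) of x and y after equal-width binning."""
    n = len(x)
    lo_x, hi_x = min(x), max(x)
    lo_y, hi_y = min(y), max(y)
    bins_x = [bin_index(v, lo_x, hi_x, n_bins) for v in x]
    bins_y = [bin_index(v, lo_y, hi_y, n_bins) for v in y]
    count_x = Counter(bins_x)
    count_y = Counter(bins_y)
    count_xy = Counter(zip(bins_x, bins_y))
    mi = 0.0
    for (i, j), c in count_xy.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((count_x[i] / n) * (count_y[j] / n)))
    return mi


# Hypothetical toy vectors standing in for two expression profiles.
expr_a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
mi_identical = binned_mutual_information(expr_a, expr_a)    # fully dependent
mi_constant = binned_mutual_information(expr_a, [1.0] * 8)  # no information
```

Because each data point is assigned to exactly one bin, small shifts in the data or in the bin boundaries can change the estimate noticeably when only a few points are available, which is the weakness the abstract refers to.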

Evolutionary significance of metabolic network properties (2012)

Basler, Georg ; Grimbs, Sergio ; Ebenhöh, Oliver ; Selbig, Joachim ; Nikoloski, Zoran

Complex networks have been successfully employed to represent different levels of biological systems, ranging from gene regulation to protein-protein interactions and metabolism. Network-based research has mainly focused on identifying unifying structural properties, such as small average path length, large clustering coefficient, heavy-tail degree distribution and hierarchical organization, viewed as requirements for efficient and robust system architectures. However, for biological networks, it is unclear to what extent these properties reflect the evolutionary history of the represented systems. Here, we show that the salient structural properties of six metabolic networks from all kingdoms of life may be inherently related to the evolution and functional organization of metabolism by employing network randomization under mass balance constraints. Contrary to the results from the common Markov-chain switching algorithm, our findings suggest the evolutionary importance of the small-world hypothesis as a fundamental design principle of complex networks. The approach may help us to determine the biologically meaningful properties that result from evolutionary pressure imposed on metabolism, such as the global impact of local reaction knockouts. Moreover, the approach can be applied to test to what extent novel structural properties can be used to draw biologically meaningful hypotheses or predictions from structure alone.

Exercise training alters DNA methylation patterns in genes related to muscle growth and differentiation in mice (2015)

Kanzleiter, Timo ; Jaehnert, Markus ; Schulze, Gunnar ; Selbig, Joachim ; Hallahan, Nicole ; Schwenk, Robert Wolfgang ; Schürmann, Annette

The adaptive response of skeletal muscle to exercise training is tightly controlled and therefore requires transcriptional regulation. DNA methylation is an epigenetic mechanism known to modulate gene expression, but its contribution to exercise-induced adaptations in skeletal muscle is not well studied. Here, we describe a genome-wide analysis of DNA methylation in muscle of trained mice (n = 3). Compared with sedentary controls, 2,762 genes exhibited differentially methylated CpGs (P < 0.05, meth diff >5%, coverage > 10) in their putative promoter regions. Alignment with gene expression data (n = 6) revealed 200 genes with a negative correlation between methylation and expression changes in response to exercise training. The majority of these genes were related to muscle growth and differentiation, and a minor fraction involved in metabolic regulation. Among the candidates were genes that regulate the expression of myogenic regulatory factors (Plexin A2) as well as genes that participate in muscle hypertrophy (Igfbp4) and motor neuron innervation (Dok7). Interestingly, a transcription factor binding site enrichment study discovered significantly enriched occurrence of CpG methylation in the binding sites of the myogenic regulatory factors MyoD and myogenin. These findings suggest that DNA methylation is involved in the regulation of muscle adaptation to regular exercise training.

Exploiting gene families for phylogenomic analysis of myzostomid transcriptome data (2012)

Hartmann, Stefanie ; Helm, Conrad ; Nickel, Birgit ; Meyer, Matthias ; Struck, Torsten H. ; Tiedemann, Ralph ; Selbig, Joachim ; Bleidorn, Christoph

Background: In trying to understand the evolutionary relationships of organisms, the current flood of sequence data offers great opportunities, but also reveals new challenges with regard to data quality, the selection of data for subsequent analysis, and the automation of steps that were once done manually for single-gene analyses. Even though genome or transcriptome data is available for representatives of most bilaterian phyla, some enigmatic taxa still have an uncertain position in the animal tree of life. This is especially true for myzostomids, a group of symbiotic (or parasitic) protostomes that are either placed with annelids or flatworms. Methodology: Based on similarity criteria, Illumina-based transcriptome sequences of one myzostomid were compared to protein sequences of one additional myzostomid and 29 reference metazoa and clustered into gene families. These families were then used to investigate the phylogenetic position of Myzostomida using different approaches: Alignments of 989 sequence families were concatenated, and the resulting superalignment was analyzed under a Maximum Likelihood criterion. We also used all 1,878 gene trees with at least one myzostomid sequence for a supertree approach: the individual gene trees were computed and then reconciled into a species tree using gene tree parsimony. Conclusions: Superalignments require strictly orthologous genes, and both the gene selection and the widely varying amount of data available for different taxa in our dataset may cause anomalous placements and low bootstrap support. In contrast, gene tree parsimony is designed to accommodate multilocus gene families and therefore allows a much more comprehensive data set to be analyzed. Results of this supertree approach showed a well-resolved phylogeny, in which myzostomids were part of the annelid radiation, and major bilaterian taxa were found to be monophyletic.

F2C2: a fast tool for the computation of flux coupling in genome-scale metabolic networks (2012)

Larhlimi, Abdelhalim ; David, Laszlo ; Selbig, Joachim ; Bockmayr, Alexander

Background: Flux coupling analysis (FCA) has become a useful tool in the constraint-based analysis of genome-scale metabolic networks. FCA allows detecting dependencies between reaction fluxes of metabolic networks at steady-state. On the one hand, this can help in the curation of reconstructed metabolic networks by verifying whether the coupling between reactions is in agreement with the experimental findings. On the other hand, FCA can aid in defining intervention strategies to knock out target reactions. Results: We present a new method F2C2 for FCA, which is orders of magnitude faster than previous approaches. As a consequence, FCA of genome-scale metabolic networks can now be performed in a routine manner. Conclusions: We propose F2C2 as a fast tool for the computation of flux coupling in genome-scale metabolic networks. F2C2 is freely available for non-commercial use at https://sourceforge.net/projects/f2c2/files/.

Finding metabolic pathways in decision forests (2004)

Flöter, André ; Selbig, Joachim ; Schaub, Torsten H.

52 search hits