Refine
Year of publication
Language
- English (95)
Is part of the Bibliography
- yes (95)
Keywords
- Arabidopsis thaliana (8)
- Network clustering (5)
- Protein complexes (3)
- Species comparison (3)
- respiration (3)
- Ascophyllum nodosum (2)
- Coherent partition (2)
- Graph partitions (2)
- GxE interaction (2)
- Metabolic networks (2)
The study of non-coding RNA genes has received increased attention in recent years fuelled by accumulating evidence that larger portions of genomes than previously acknowledged are transcribed into RNA molecules of mostly unknown function, as well as the discovery of novel non-coding RNA types and functional RNA elements. Here, we demonstrate that specific properties of graphs that represent the predicted RNA secondary structure reflect functional information. We introduce a computational algorithm and an associated web-based tool (GraPPLE) for classifying non-coding RNA molecules as functional and, furthermore, into Rfam families based on their graph properties. Unlike sequence-similarity-based methods and covariance models, GraPPLE is demonstrated to be more robust with regard to increasing sequence divergence, and when combined with existing methods, leads to a significant improvement of prediction accuracy. Furthermore, graph properties identified as most informative are shown to provide an understanding as to what particular structural features render RNA molecules functional. Thus, GraPPLE may offer a valuable computational filtering tool to identify potentially interesting RNA molecules among large candidate datasets.
Background: Biological systems adapt to changing environments by reorganizing their cellula r and physiological program with metabolites representing one important response level. Different stresses lead to both conserved and specific responses on the metabolite level which should be reflected in the underl ying metabolic network. Methodology/Principal Findings: Starting from experimental data obtained by a GC-MS based high-throughput metabolic profiling technology we here develop an approach that: (1) extracts network representations from metabolic conditiondependent data by using pairwise correlations, (2) determines the sets of stable and condition-dependent correlations based on a combination of statistical significance and homogeneity tests, and (3) can identify metabolites related to the stress response, which goes beyond simple ob servation s about the changes of metabolic concentrations. The approach was tested with Escherichia colias a model organism observed under four different environmental stress conditions (cold stress, heat stress, oxidative stress, lactose diau xie) and control unperturbed conditions. By constructing the stable network component, which displays a scale free topology and small-world characteristics, we demonstrated that: (1) metabolite hubs in this reconstructed correlation networks are significantly enriched for those contained in biochemical networks such as EcoCyc, (2) particular components of the stable network are enriched for functionally related biochemical path ways, and (3) ind ependently of the response scale, based on their importance in the reorganization of the cor relation network a set of metabolites can be identified which represent hypothetical candidates for adjusting to a stress-specific response. Conclusions/Significance: Network-based tools allowed the identification of stress-dependent and general metabolic correlation networks. This correlation-network-ba sed approach does not rely on major changes in concentration to identify metabolites important for st ress adaptation, but rather on the changes in network properties with respect to metabolites. This should represent a useful complementary technique in addition to more classical approaches.
Motivation: Network-centered studies in systems biology attempt to integrate the topological properties of biological networks with experimental data in order to make predictions and posit hypotheses. For any topology-based prediction, it is necessary to first assess the significance of the analyzed property in a biologically meaningful context. Therefore, devising network null models, carefully tailored to the topological and biochemical constraints imposed on the network, remains an important computational problem.
Results: We first review the shortcomings of the existing generic sampling scheme-switch randomization-and explain its unsuitability for application to metabolic networks. We then devise a novel polynomial-time algorithm for randomizing metabolic networks under the (bio)chemical constraint of mass balance. The tractability of our method follows from the concept of mass equivalence classes, defined on the representation of compounds in the vector space over chemical elements. We finally demonstrate the uniformity of the proposed method on seven genome-scale metabolic networks, and empirically validate the theoretical findings. The proposed method allows a biologically meaningful estimation of significance for metabolic network properties.
Identifying causal links (couplings) is a fundamental problem that facilitates the understanding of emerging structures in complex networks. We propose and analyze inner composition alignment-a novel, permutation-based asymmetric association measure to detect regulatory links from very short time series, currently applied to gene expression. The measure can be used to infer the direction of couplings, detect indirect (superfluous) links, and account for autoregulation. Applications to the gene regulatory network of E. coli are presented.
Background: Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications.
Results: Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.
Conclusions: Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures that are rooted in the study of symbolic dynamics or ranks, in contrast to the application of common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol based measures (for low noise level) and rank based measures (for high noise level) are the most suitable choices.
The Calvin-Benson cycle (CBC) provides the precursors for biomass synthesis necessary for plant growth. The dynamic behavior and yield of the CBC depend on the environmental conditions and regulation of the cellular state. Accurate quantitative models hold the promise of identifying the key determinants of the tightly regulated CBC function and their effects on the responses in future climates. We provide an integrative analysis of the largest compendium of existing models for photosynthetic processes. Based on the proposed ranking, our framework facilitates the discovery of best-performing models with regard to metabolomics data and of candidates for metabolic engineering.
Integration of high-throughput data with functional annotation by graph-theoretic methods has been postulated as promising way to unravel the function of unannotated genes. Here, we first review the existing graph-theoretic approaches for automated gene function annotation and classify them into two categories with respect to their relation to two instances of transductive learning on networks - with dynamic costs and with constant costs - depending on whether or not ontological relationship between functional terms is employed. The determined categories allow to characterize the computational complexity of the existing approaches and establish the relation to classical graph-theoretic problems, such as bisection and multiway cut. In addition, our results point out that the ontological form of the structured functional knowledge does not lower the complexity of the transductive learning with dynamic costs - one of the key problems in modern systems biology. The NP-hardness of automated gene annotation renders the development of heuristic or approximation algorithms a priority for additional research.
Spatiotemporal dynamics of the Calvin cycle multistationarity and symmetry breaking instabilities
(2011)
The possibility of controlling the Calvin cycle has paramount implications for increasing the production of biomass. Multistationarity, as a dynamical feature of systems, is the first obvious candidate whose control could find biotechnological applications. Here we set out to resolve the debate on the multistationarity of the Calvin cycle. Unlike the existing simulation-based studies, our approach is based on a sound mathematical framework, chemical reaction network theory and algebraic geometry, which results in provable results for the investigated model of the Calvin cycle in which we embed a hierarchy of realistic kinetic laws. Our theoretical findings demonstrate that there is a possibility for multistationarity resulting from two sources, homogeneous and inhomogeneous instabilities, which partially settle the debate on multistability of the Calvin cycle. In addition, our tractable analytical treatment of the bifurcation parameters can be employed in the design of validation experiments.
Large-scale co-expression approach to dissect secondary cell wall formation across plant species
(2011)
Plant cell walls are complex composites largely consisting of carbohydrate-based polymers, and are generally divided into primary and secondary walls based on content and characteristics. Cellulose microfibrils constitute a major component of both primary and secondary cell walls and are synthesized at the plasma membrane by cellulose synthase (CESA) complexes. Several studies in Arabidopsis have demonstrated the power of co-expression analyses to identify new genes associated with secondary wall cellulose biosynthesis. However, across-species comparative co-expression analyses remain largely unexplored. Here, we compared co-expressed gene vicinity networks of primary and secondary wall CESAsin Arabidopsis, barley, rice, poplar, soybean, Medicago, and wheat, and identified gene families that are consistently co-regulated with cellulose biosynthesis. In addition to the expected polysaccharide acting enzymes, we also found many gene families associated with cytoskeleton, signaling, transcriptional regulation, oxidation, and protein degradation. Based on these analyses, we selected and biochemically analyzed T-DNA insertion lines corresponding to approximately twenty genes from gene families that re-occur in the co-expressed gene vicinity networks of secondary wall CESAs across the seven species. We developed a statistical pipeline using principal component analysis and optimal clustering based on silhouette width to analyze sugar profiles. One of the mutants, corresponding to a pinoresinol reductase gene, displayed disturbed xylem morphology and held lower levels of lignin molecules. We propose that this type of large-scale co-expression approach, coupled with statistical analysis of the cell wall contents, will be useful to facilitate rapid knowledge transfer across plant species.