Robustness of biochemical systems has become one of the central questions in systems biology, although it is notoriously difficult to formally capture its multifaceted nature. Maintenance of normal system function depends not only on the stoichiometry of the underlying interrelated components, but also on the multitude of kinetic parameters. Invariant flux ratios, obtained within flux coupling analysis, as well as invariant complex ratios, derived within chemical reaction network theory, can characterize robust properties of a system at steady state. However, the existing formalisms for the description of these invariants do not provide a full characterization, as they focus on either the flux-centric or the concentration-centric view alone. Here we develop a novel mathematical framework which combines both views and thereby overcomes the limitations of the classical methodologies. Our unified framework will be helpful in analyzing biologically important system properties.

Time hierarchies, arising as a result of interactions between a system's components, represent a ubiquitous property of dynamical biological systems. In addition, biological systems have been attributed switch-like properties modulating the response to various stimuli across different organisms and environmental conditions. Therefore, establishing the interplay between these features of system dynamics is a challenging question of practical interest in biology. Existing methods are suitable for systems with one stable steady state employed as a well-defined reference. In such systems, the characterization of the time hierarchies has already been used for determining the components that contribute to the dynamics of biological systems. However, the application of these methods to bistable nonlinear systems is impeded by their inherent dependence on the reference state, which in this case is no longer unique. Here, we extend the applicability of the reference-state analysis by proposing, analyzing, and applying a novel method, which allows investigation of the time hierarchies in systems exhibiting bistability. The proposed method is in turn used in identifying the components, other than reactions, which determine the systemic dynamical properties. We demonstrate that in biological systems of varying levels of complexity and spanning different biological levels, the method can be effectively employed for model simplification while ensuring preservation of qualitative dynamical properties (i.e., bistability). Finally, by establishing a connection between techniques from nonlinear dynamics and multivariate statistics, the proposed approach provides the basis for extending reference-based analysis to bistable systems.

Background: Reconstruction of genome-scale metabolic networks has resulted in models capable of reproducing experimentally observed biomass yield/growth rates and predicting the effect of alterations in metabolism for biotechnological applications. The existing studies rely on modifying the metabolic network of an investigated organism by removing or inserting reactions taken either from evolutionarily similar organisms or from databases of biochemical reactions (e.g., KEGG). A potential disadvantage of these knowledge-driven approaches is that the result is biased towards known reactions, as such approaches do not account for the possibility of including novel enzymes, together with the reactions they catalyze.
Results: Here, we explore the alternative of increasing biomass yield in three model organisms, namely Bacillus subtilis, Escherichia coli, and Hordeum vulgare, by applying small, chemically feasible network modifications. We use the predicted and experimentally confirmed growth rates of the wild-type networks as reference values and determine the effect of inserting mass-balanced, thermodynamically feasible reactions on predictions of growth rate by using flux balance analysis.
Conclusions: While many replacements of existing reactions naturally lead to a decrease or complete loss of biomass production ability, in all three investigated organisms we find feasible modifications which facilitate a significant increase in this biological function. We focus on modifications with feasible chemical properties and a significant increase in biomass yield. The results demonstrate that small modifications are sufficient to substantially alter biomass yield in the three organisms. The method can be used to predict the effect of targeted modifications on the yield of any set of metabolites (e.g., ethanol), thus providing a computational framework for synthetic metabolic engineering.
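
For readers unfamiliar with it, flux balance analysis is usually stated as a linear program. The formulation below is the generic textbook form and not necessarily the exact setup used in the study; the symbols $S$ (stoichiometric matrix), $v$ (vector of reaction fluxes), $c$ (objective weights selecting the biomass reaction), and the bounds $v^{\min}$, $v^{\max}$ are notation assumed here for illustration.

$$
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
$$

In this picture, inserting a candidate reaction amounts to appending a column to $S$ (with its own flux bounds) and solving the program again; the change in the optimal value is the predicted effect on growth.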

Motivation: Network-centered studies in systems biology attempt to integrate the topological properties of biological networks with experimental data in order to make predictions and posit hypotheses. For any topology-based prediction, it is necessary to first assess the significance of the analyzed property in a biologically meaningful context. Therefore, devising network null models, carefully tailored to the topological and biochemical constraints imposed on the network, remains an important computational problem.
Results: We first review the shortcomings of the existing generic sampling scheme (switch randomization) and explain its unsuitability for application to metabolic networks. We then devise a novel polynomial-time algorithm for randomizing metabolic networks under the (bio)chemical constraint of mass balance. The tractability of our method follows from the concept of mass equivalence classes, defined on the representation of compounds in the vector space over chemical elements. We finally demonstrate the uniformity of the proposed method on seven genome-scale metabolic networks, and empirically validate the theoretical findings. The proposed method allows a biologically meaningful estimation of significance for metabolic network properties.
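
The idea of representing compounds as vectors over chemical elements can be made concrete with a small sketch. The code below is not the authors' algorithm: it only shows, under simplifying assumptions, how a mass balance check works and how compounds with identical element vectors can be grouped. The compound table, the function names, and the reading of "mass equivalence" as "same elemental composition of single compounds" are all assumptions made for illustration; the paper's definition may be more general.

```python
from collections import defaultdict

# Hypothetical elemental compositions (element -> atom count), for illustration only.
COMPOUNDS = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "fructose": {"C": 6, "H": 12, "O": 6},
    "water": {"H": 2, "O": 1},
    "CO2": {"C": 1, "O": 2},
    "O2": {"O": 2},
}


def composition_key(formula):
    """Return a hashable, order-independent form of an element vector."""
    return tuple(sorted((el, n) for el, n in formula.items() if n))


def mass_equivalence_classes(compounds):
    """Group compound names that share the same element vector (assumed reading)."""
    classes = defaultdict(list)
    for name, formula in compounds.items():
        classes[composition_key(formula)].append(name)
    return [sorted(members) for members in classes.values()]


def is_mass_balanced(reaction, compounds):
    """Check mass balance of a reaction.

    `reaction` maps compound name -> stoichiometric coefficient
    (negative for substrates, positive for products).
    """
    totals = defaultdict(int)
    for name, coeff in reaction.items():
        for el, n in compounds[name].items():
            totals[el] += coeff * n
    return all(total == 0 for total in totals.values())


# Aerobic respiration of glucose: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
respiration = {"glucose": -1, "O2": -6, "CO2": 6, "water": 6}
# Swapping glucose for a compound with the same element vector keeps the balance.
swapped = {"fructose": -1, "O2": -6, "CO2": 6, "water": 6}
# Dropping oxygen and water breaks it.
broken = {"glucose": -1, "CO2": 6}

print(is_mass_balanced(respiration, COMPOUNDS))
print(is_mass_balanced(swapped, COMPOUNDS))
print(is_mass_balanced(broken, COMPOUNDS))
```

The swap example hints at why such classes are useful for randomization: replacing a compound by another with the same element vector cannot violate mass balance, so candidate substitutions can be drawn without re-checking every reaction from scratch.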

We investigate the properties of a recently introduced asymmetric association measure, called inner composition alignment (IOTA), aimed at inferring regulatory links (couplings). We show that the measure can be used to determine the direction of coupling, detect superfluous links, and account for autoregulation. In addition, the measure can be extended to infer the type of regulation (positive or negative). The capabilities of IOTA to correctly infer couplings together with their directionality are compared against Kendall's rank correlation for time series of different lengths, particularly focussing on biological examples. We demonstrate that an extended version of the measure, bidirectional inner composition alignment (biIOTA), increases the accuracy of the network reconstruction for short time series. Finally, we discuss the applicability of the measure to infer couplings in chaotic systems.
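
As a point of reference for the comparison baseline, the sketch below computes Kendall's rank correlation in its simplest form (tau-a, which ignores tie corrections). It is a plain illustration rather than the implementation used in the study, and the function name and sample series are made up. It also makes visible why a symmetric measure of this kind cannot by itself indicate the direction of a coupling: swapping the two series gives the same value.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Two short, made-up time series.
a = [0.1, 0.4, 0.3, 0.9, 0.7]
b = [1.0, 2.5, 2.0, 4.0, 4.5]

# The measure is symmetric in its arguments.
print(kendall_tau_a(a, b) == kendall_tau_a(b, a))
```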

Complex networks have been successfully employed to represent different levels of biological systems, ranging from gene regulation to protein-protein interactions and metabolism. Network-based research has mainly focused on identifying unifying structural properties, such as small average path length, large clustering coefficient, heavy-tail degree distribution and hierarchical organization, viewed as requirements for efficient and robust system architectures. However, for biological networks, it is unclear to what extent these properties reflect the evolutionary history of the represented systems. Here, we show that the salient structural properties of six metabolic networks from all kingdoms of life may be inherently related to the evolution and functional organization of metabolism by employing network randomization under mass balance constraints. Contrary to the results from the common Markov-chain switching algorithm, our findings suggest the evolutionary importance of the small-world hypothesis as a fundamental design principle of complex networks. The approach may help us to determine the biologically meaningful properties that result from evolutionary pressure imposed on metabolism, such as the global impact of local reaction knockouts. Moreover, the approach can be applied to test to what extent novel structural properties can be used to draw biologically meaningful hypotheses or predictions from structure alone.

The tricarboxylic acid (TCA) cycle is a crucial component of respiratory metabolism in both photosynthetic and heterotrophic plant organs. All of the major genes of the tomato TCA cycle have been cloned recently, allowing the generation of a suite of transgenic plants in which the majority of the enzymes in the pathway are progressively decreased. Investigations of these plants have provided an almost complete view of the distribution of control in this important pathway. Our studies suggest that citrate synthase, aconitase, isocitrate dehydrogenase, succinyl CoA ligase, succinate dehydrogenase, fumarase and malate dehydrogenase have flux control coefficients for respiration of -0.4, 0.964, -0.123, 0.0008, 0.289, 0.601 and 1.76, respectively, while 2-oxoglutarate dehydrogenase is estimated to have a control coefficient of 0.786 in potato tubers. These results thus indicate that the control of this pathway is distributed among malate dehydrogenase, aconitase, fumarase, succinate dehydrogenase and 2-oxoglutarate dehydrogenase. The unusual distribution of control estimated here is consistent with specific non-cyclic flux modes and cytosolic bypasses that operate in illuminated leaves. These observations are discussed in the context of known regulatory properties of the enzymes, and some illustrative examples of how the pathway responds to environmental change are given.

Integration of high-throughput data with functional annotation by graph-theoretic methods has been postulated as a promising way to unravel the function of unannotated genes. Here, we first review the existing graph-theoretic approaches for automated gene function annotation and classify them into two categories with respect to their relation to two instances of transductive learning on networks (with dynamic costs and with constant costs), depending on whether or not the ontological relationship between functional terms is employed. The determined categories allow us to characterize the computational complexity of the existing approaches and establish the relation to classical graph-theoretic problems, such as bisection and multiway cut. In addition, our results point out that the ontological form of the structured functional knowledge does not lower the complexity of transductive learning with dynamic costs, one of the key problems in modern systems biology. The NP-hardness of automated gene annotation renders the development of heuristic or approximation algorithms a priority for additional research.

Recent advances in high-throughput omics techniques render it possible to decode the function of genes by using the "guilt-by-association" principle on biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottleneck issues: (1) the choice of the number of clusters, and (2) external measures which do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address the identified bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is shown to be valid on synthetic data and available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction which relies on the clustering of optimal score and the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set.

Natural genetic diversity provides a powerful tool to study the complex interrelationship between metabolism and growth. Profiling of metabolic traits combined with network-based and statistical analyses allows the comparison of conditions and identification of sets of traits that predict biomass. However, it often remains unclear why a particular set of metabolites is linked with biomass and to what extent the predictive model is applicable beyond a particular growth condition. A panel of 97 genetically diverse Arabidopsis (Arabidopsis thaliana) accessions was grown in near-optimal carbon and nitrogen supply, restricted carbon supply, and restricted nitrogen supply and analyzed for biomass and 54 metabolic traits. Correlation-based metabolic networks were generated from the genotype-dependent variation in each condition to reveal sets of metabolites that show coordinated changes across accessions. The networks were largely specific for a single growth condition. Partial least squares regression from metabolic traits allowed prediction of biomass within and, slightly more weakly, across conditions (cross-validated Pearson correlations in the range of 0.27-0.58 and 0.21-0.51 and P values in the range of <0.001-<0.13 and <0.001-<0.023, respectively). Metabolic traits that correlate with growth or have a high weighting in the partial least squares regression were mainly condition specific and often related to the resource that restricts growth under that condition. Linear mixed-model analysis using the combined metabolic traits from all growth conditions as an input indicated that inclusion of random effects for the conditions improves predictions of biomass. Thus, robust prediction of biomass across a range of conditions requires condition-specific measurement of metabolic traits to take account of environment-dependent changes of the underlying networks.