A large-scale metabolic quantitative trait loci (mQTL) analysis was performed on the well-characterized Solanum pennellii introgression lines to investigate the genomic regions associated with secondary metabolism in tomato fruit pericarp. In total, 679 mQTLs were detected across the 76 introgression lines. Heritability analyses revealed that mQTLs of secondary metabolism were less affected by environment than mQTLs of primary metabolism. Network analysis allowed us to assess the interconnectivity of primary and secondary metabolism as well as to compare and contrast their respective associations with morphological traits. Additionally, we applied a recently established real-time quantitative PCR platform to gain insight into transcriptional control mechanisms of a subset of the mQTLs, including those for hydroxycinnamates, acyl-sugar, naringenin chalcone, and a range of glycoalkaloids. Intriguingly, many of these compounds displayed a dominant-negative mode of inheritance, which is contrary to the conventional wisdom that secondary metabolite contents decreased on domestication. We additionally performed an exemplary evaluation of two candidate genes for glycoalkaloid mQTLs via the use of virus-induced gene silencing. The combined data of this study were compared with previous results on primary metabolism obtained from the same material and to other studies of natural variance of secondary metabolism.
Coherent network partitions
(2019)
Graph clustering is widely applied in the analysis of cellular networks reconstructed from large-scale data or obtained from experimental evidence. Here we introduce a new type of graph clustering based on the concept of coherent partition. A coherent partition of a graph G is a partition of the vertices of G that yields only disconnected subgraphs in the complement of G. The coherence number of G is then the size of the smallest edge cut inducing a coherent partition. A coherent partition of G is optimal if the size of the inducing edge cut is the coherence number of G. Given a graph G, we study coherent partitions and the coherence number in connection to (bi)clique partitions and the (bi)clique cover number. We show that the problem of finding the coherence number is NP-hard, but is of polynomial time complexity for trees. We also discuss the relation between coherent partitions and prominent graph clustering quality measures.
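The definition above can be made concrete with a minimal Python sketch. It is not the authors' implementation: the function names (`is_coherent_partition`, `cut_size`) and the small example graphs are hypothetical, and the sketch assumes that a single-vertex part counts as connected in the complement and is therefore not coherent. It only checks a given partition and counts the edges it cuts. It does not search for the coherence number, which the abstract states is NP-hard in general.

```python
from itertools import combinations


def complement_adjacency(part, edges):
    """Adjacency sets of the complement of G, restricted to the vertices in `part`."""
    edge_set = {frozenset(e) for e in edges}
    adj = {v: set() for v in part}
    for u, v in combinations(list(part), 2):
        if frozenset((u, v)) not in edge_set:  # non-edge of G = edge of the complement
            adj[u].add(v)
            adj[v].add(u)
    return adj


def is_connected(adj):
    """Depth-first search connectivity test; an empty graph counts as connected."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adj)


def is_coherent_partition(edges, parts):
    """True if every part induces a disconnected subgraph in the complement of G."""
    return all(not is_connected(complement_adjacency(p, edges)) for p in parts)


def cut_size(edges, parts):
    """Number of edges of G whose endpoints lie in different parts."""
    part_of = {v: i for i, p in enumerate(parts) for v in p}
    return sum(1 for u, v in edges if part_of[u] != part_of[v])


# Hypothetical example: the path a-b-c-d.
path = [("a", "b"), ("b", "c"), ("c", "d")]
print(is_coherent_partition(path, [{"a", "b", "c", "d"}]))    # whole path as one part
print(is_coherent_partition(path, [{"a", "b"}, {"c", "d"}]))  # split at edge b-c
print(cut_size(path, [{"a", "b"}, {"c", "d"}]))
```

In this toy example the complement of the whole path is itself connected, so the trivial one-part partition is not coherent, whereas splitting at the middle edge gives two parts whose complements are pairs of isolated vertices.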
Coherent network partitions
(2021)
We continue to study coherent partitions of graphs whereby the vertex set is partitioned into subsets that induce biclique spanned subgraphs. The problem of identifying the minimum number of edges to obtain biclique spanned connected components (CNP), called the coherence number, is NP-hard even on bipartite graphs. Here, we propose a graph transformation geared towards obtaining an O(log n)-approximation algorithm for the CNP on a bipartite graph with n vertices. The transformation is inspired by a new characterization of biclique spanned subgraphs. In addition, we study coherent partitions on prime graphs, and show that finding coherent partitions reduces to the problem of finding coherent partitions in a prime graph. Therefore, these results provide future directions for approximation algorithms for the coherence number of a given graph.
Irradiance from sunlight changes in a sinusoidal manner during the day, with irregular fluctuations due to clouds, and light-dark shifts at dawn and dusk are gradual. Experiments in controlled environments typically expose plants to constant irradiance during the day and abrupt light-dark transitions. To compare the effects on metabolism of sunlight versus artificial light regimes, Arabidopsis thaliana plants were grown in a naturally illuminated greenhouse around the vernal equinox, and in controlled environment chambers with a 12-h photoperiod and either constant or sinusoidal light profiles, using either white fluorescent tubes or light-emitting diodes (LEDs) tuned to a sunlight-like spectrum as the light source. Rosettes were sampled throughout a 24-h diurnal cycle for metabolite analysis. The diurnal metabolite profiles revealed that carbon and nitrogen metabolism differed significantly between sunlight and artificial light conditions. The variability of sunlight within and between days could be a factor underlying these differences. Pairwise comparisons of the artificial light sources (fluorescent versus LED) or the light profiles (constant versus sinusoidal) showed much smaller differences. The data indicate that energy-efficient LED lighting is an acceptable alternative to fluorescent lights, but results obtained from plants grown with either type of artificial lighting might not be representative of natural conditions.
Integrative studies of plant growth require spatially and temporally resolved information from high-throughput imaging systems. However, analysis and interpretation of conventional two-dimensional images is complicated by the three-dimensional nature of shoot architecture and by changes in leaf position over time, termed hyponasty. To solve this problem, Phytotyping(4D) uses a light-field camera that simultaneously provides a focus image and a depth image, which contains distance information about the object surface. Our automated pipeline segments the focus images, integrates depth information to reconstruct the three-dimensional architecture, and analyses time series to provide information about the relative expansion rate, the timing of leaf appearance, hyponastic movement, and shape for individual leaves and the whole rosette. Phytotyping(4D) was calibrated and validated using discs of known sizes, and plants tilted at various orientations. Information from this analysis was integrated into the pipeline to allow error assessment during routine operation. To illustrate the utility of Phytotyping(4D), we compare diurnal changes in Arabidopsis thaliana wild-type Col-0 and the starchless pgm mutant. Compared to Col-0, pgm showed very low relative expansion rate in the second half of the night, a transiently increased relative expansion rate at the onset of light period, and smaller hyponastic movement including delayed movement after dusk, both at the level of the rosette and individual leaves. Our study introduces light-field camera systems as a tool to accurately measure morphological and growth-related features in plants.
Significance Statement Phytotyping(4D) is a non-invasive and accurate imaging system that combines a 3D light-field camera with an automated pipeline, which provides validated measurements of growth, movement, and other morphological features at the rosette and single-leaf level. In a case study in which we investigated the link between starch and growth, we demonstrated that Phytotyping(4D) is a key step towards bridging the gap between phenotypic observations and the rich genetic and metabolic knowledge.
The tricarboxylic acid (TCA) cycle is a crucial component of respiratory metabolism in both photosynthetic and heterotrophic plant organs. All of the major genes of the tomato TCA cycle have been cloned recently, allowing the generation of a suite of transgenic plants in which the majority of the enzymes in the pathway are progressively decreased. Investigations of these plants have provided an almost complete view of the distribution of control in this important pathway. Our studies suggest that citrate synthase, aconitase, isocitrate dehydrogenase, succinyl CoA ligase, succinate dehydrogenase, fumarase and malate dehydrogenase have flux control coefficients for respiration of -0.4, 0.964, -0.123, 0.0008, 0.289, 0.601 and 1.76, respectively, while 2-oxoglutarate dehydrogenase is estimated to have a control coefficient of 0.786 in potato tubers. These results thus indicate that the control of this pathway is distributed among malate dehydrogenase, aconitase, fumarase, succinate dehydrogenase and 2-oxoglutarate dehydrogenase. The unusual distribution of control estimated here is consistent with specific non-cyclic flux mode and cytosolic bypasses that operate in illuminated leaves. These observations are discussed in the context of known regulatory properties of the enzymes and some illustrative examples of how the pathway responds to environmental change are given.
Metabolic engineering of microalgae offers a promising solution for sustainable biofuel production, and rational design of engineering strategies can be improved by employing metabolic models that integrate enzyme turnover numbers. However, the coverage of turnover numbers for Chlamydomonas reinhardtii, a model eukaryotic microalga accessible to metabolic engineering, is 17-fold smaller compared to the heterotrophic cell factory Saccharomyces cerevisiae. Here we generate quantitative protein abundance data of Chlamydomonas covering 2337 to 3708 proteins in various growth conditions to estimate in vivo maximum apparent turnover numbers. Using constraint-based modeling we provide proxies for in vivo turnover numbers of 568 reactions, representing a 10-fold increase over the in vitro data for Chlamydomonas. Integration of the in vivo estimates instead of in vitro values in a metabolic model of Chlamydomonas improved the accuracy of enzyme usage predictions. Our results help in extending the knowledge on uncharacterized enzymes and improve biotechnological applications of Chlamydomonas.
The photosynthetic carbon metabolism, including the Calvin-Benson cycle, is the primary pathway in C3 plants, producing starch and sucrose from CO2. Understanding the interplay between regulation and efficiency of this pathway requires the development of mathematical models which would explain the observed dynamics of metabolic transformations. Here, we address this question by casting the existing models of Calvin-Benson cycle and the end-product processes into an analysis framework which not only facilitates the comparison of the different models, but also allows for their ranking with respect to chosen criteria, including stability, sensitivity, robustness and/or compliance with experimental data. The importance of the photosynthetic carbon metabolism for the increase of plant biomass has resulted in many models with various levels of detail. We provide the largest compendium of 15 existing, well-investigated models together with a comprehensive classification as well as a ranking framework to determine the best-performing models for metabolic engineering and planning of in silico experiments. The classification can be additionally used, based on the model structure, as a tool to identify the models which match best the experimental design. The provided ranking is just one alternative to score models and, by changing the weighting factor, this framework also could be applied for selection of other criteria of interest.
The Calvin-Benson cycle (CBC) provides the precursors for biomass synthesis necessary for plant growth. The dynamic behavior and yield of the CBC depend on the environmental conditions and regulation of the cellular state. Accurate quantitative models hold the promise of identifying the key determinants of the tightly regulated CBC function and their effects on the responses in future climates. We provide an integrative analysis of the largest compendium of existing models for photosynthetic processes. Based on the proposed ranking, our framework facilitates the discovery of best-performing models with regard to metabolomics data and of candidates for metabolic engineering.
Motivation: Network-centered studies in systems biology attempt to integrate the topological properties of biological networks with experimental data in order to make predictions and posit hypotheses. For any topology-based prediction, it is necessary to first assess the significance of the analyzed property in a biologically meaningful context. Therefore, devising network null models, carefully tailored to the topological and biochemical constraints imposed on the network, remains an important computational problem.
Results: We first review the shortcomings of the existing generic sampling scheme, switch randomization, and explain its unsuitability for application to metabolic networks. We then devise a novel polynomial-time algorithm for randomizing metabolic networks under the (bio)chemical constraint of mass balance. The tractability of our method follows from the concept of mass equivalence classes, defined on the representation of compounds in the vector space over chemical elements. We finally demonstrate the uniformity of the proposed method on seven genome-scale metabolic networks, and empirically validate the theoretical findings. The proposed method allows a biologically meaningful estimation of significance for metabolic network properties.
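The mass balance constraint mentioned above can be illustrated with a small Python sketch. This is not the randomization algorithm of the paper. It only shows, under stated assumptions, how compounds can be represented as vectors over chemical elements and how a reaction can be tested for mass balance. The function name `is_mass_balanced`, the sign convention (negative coefficients for substrates) and the example reactions are illustrative choices and do not come from the abstract.

```python
from collections import defaultdict


def is_mass_balanced(reaction, formulas):
    """Check that every chemical element is conserved by a reaction.

    reaction: dict mapping compound name -> stoichiometric coefficient
              (assumed convention: negative for substrates, positive for products)
    formulas: dict mapping compound name -> dict of element -> atom count,
              i.e. the compound as a vector over chemical elements
    """
    net = defaultdict(int)
    for compound, coefficient in reaction.items():
        for element, count in formulas[compound].items():
            net[element] += coefficient * count
    # The reaction is mass balanced if the net change of every element is zero.
    return all(value == 0 for value in net.values())


# Element vectors for a few well-known compounds.
formulas = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "O2": {"O": 2},
    "CO2": {"C": 1, "O": 2},
    "H2O": {"H": 2, "O": 1},
}

# Complete oxidation of glucose: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
respiration = {"glucose": -1, "O2": -6, "CO2": 6, "H2O": 6}

# A deliberately unbalanced reaction, as an unconstrained rewiring might produce.
unbalanced = {"glucose": -1, "CO2": 2}

print(is_mass_balanced(respiration, formulas))
print(is_mass_balanced(unbalanced, formulas))
```

A null model that respects this constraint would accept only randomized reactions for which such a check succeeds, which is the property the abstract contrasts with switch randomization.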
Methodological and technological advances have recently paved the way for metabolic flux profiling in higher organisms, like plants. However, in comparison with omics technologies, flux profiling has yet to provide comprehensive differential flux maps at a genome-scale and in different cell types, tissues, and organs. Here we highlight the recent advances in technologies to gather metabolic labeling patterns and flux profiling approaches. We provide an opinion of how recent local flux profiling approaches can be used in conjunction with the constraint-based modeling framework to arrive at genome-scale flux maps. In addition, we point at approaches which use metabolomics data without introduction of label to predict either non-steady state fluxes in a time-series experiment or flux changes in different experimental scenarios. The combination of these developments allows an experimentally feasible approach for flux-based large-scale systems biology studies.
Complex networks have been successfully employed to represent different levels of biological systems, ranging from gene regulation to protein-protein interactions and metabolism. Network-based research has mainly focused on identifying unifying structural properties, such as small average path length, large clustering coefficient, heavy-tail degree distribution and hierarchical organization, viewed as requirements for efficient and robust system architectures. However, for biological networks, it is unclear to what extent these properties reflect the evolutionary history of the represented systems. Here, we show that the salient structural properties of six metabolic networks from all kingdoms of life may be inherently related to the evolution and functional organization of metabolism by employing network randomization under mass balance constraints. Contrary to the results from the common Markov-chain switching algorithm, our findings suggest the evolutionary importance of the small-world hypothesis as a fundamental design principle of complex networks. The approach may help us to determine the biologically meaningful properties that result from evolutionary pressure imposed on metabolism, such as the global impact of local reaction knockouts. Moreover, the approach can be applied to test to what extent novel structural properties can be used to draw biologically meaningful hypotheses or predictions from structure alone.
Background: Reconstruction of genome-scale metabolic networks has resulted in models capable of reproducing experimentally observed biomass yield/growth rates and predicting the effect of alterations in metabolism for biotechnological applications. The existing studies rely on modifying the metabolic network of an investigated organism by removing or inserting reactions taken either from evolutionary similar organisms or from databases of biochemical reactions (e.g., KEGG). A potential disadvantage of these knowledge-driven approaches is that the result is biased towards known reactions, as such approaches do not account for the possibility of including novel enzymes, together with the reactions they catalyze.
Results: Here, we explore the alternative of increasing biomass yield in three model organisms, namely Bacillus subtilis, Escherichia coli, and Hordeum vulgare, by applying small, chemically feasible network modifications. We use the predicted and experimentally confirmed growth rates of the wild-type networks as reference values and determine the effect of inserting mass-balanced, thermodynamically feasible reactions on predictions of growth rate by using flux balance analysis.
Conclusions: While many replacements of existing reactions naturally lead to a decrease or complete loss of biomass production ability, in all three investigated organisms we find feasible modifications which facilitate a significant increase in this biological function. We focus on modifications with feasible chemical properties and a significant increase in biomass yield. The results demonstrate that small modifications are sufficient to substantially alter biomass yield in the three organisms. The method can be used to predict the effect of targeted modifications on the yield of any set of metabolites (e.g., ethanol), thus providing a computational framework for synthetic metabolic engineering.
Analysis of biological networks requires assessing the statistical significance of network-based predictions by using a realistic null model. However, the existing network null model, switch randomization, is unsuitable for metabolic networks, as it does not include physical constraints and generates unrealistic reactions. We present JMassBalance, a tool for mass-balanced randomization and analysis of metabolic networks. The tool allows efficient generation of large sets of randomized networks under the physical constraint of mass balance. In addition, various structural properties of the original and randomized networks can be calculated, facilitating the identification of the salient properties of metabolic networks with a biologically meaningful null model.
The actin cytoskeleton is an essential intracellular filamentous structure that underpins cellular transport and cytoplasmic streaming in plant cells. However, the system-level properties of actin-based cellular trafficking remain tenuous, largely due to the inability to quantify key features of the actin cytoskeleton. Here, we developed an automated image-based, network-driven framework to accurately segment and quantify actin cytoskeletal structures and Golgi transport. We show that the actin cytoskeleton in both growing and elongated hypocotyl cells has structural properties facilitating efficient transport. Our findings suggest that the erratic movement of Golgi is a stable cellular phenomenon that might optimize distribution efficiency of cell material. Moreover, we demonstrate that Golgi transport in hypocotyl cells can be accurately predicted from the actin network topology alone. Thus, our framework provides quantitative evidence for system-wide coordination of cellular transport in plant cells and can be readily applied to investigate cytoskeletal organization and transport in other organisms.
As autotrophic organisms, plants capture light energy to convert carbon dioxide into ATP, nicotinamide adenine dinucleotide phosphate (NADPH), and sugars, which are essential for the biosynthesis of building blocks, storage, and growth. At night, metabolism and growth can be sustained by mobilizing carbon (C) reserves. In response to changing environmental conditions, such as light-dark cycles, the small-molecule regulation of enzymatic activities is critical for reprogramming cellular metabolism. We have recently demonstrated that proteogenic dipeptides, protein degradation products, act as metabolic switches at the interface of proteostasis and central metabolism in both plants and yeast. Dipeptides accumulate in response to the environmental changes and act via direct binding and regulation of critical enzymatic activities, enabling C flux distribution. Here, we provide evidence pointing to the involvement of dipeptides in the metabolic rewiring characteristic of the day-night cycle in plants. Specifically, we measured the abundance of 13 amino acids and 179 dipeptides over short- (SD) and long-day (LD) diel cycles, each with different light intensities. Of the measured dipeptides, 38 and eight were characterized by day-night oscillation in SD and LD, respectively, reaching maximum accumulation at the end of the day and then gradually falling in the night. Not only the number of dipeptides, but also the amplitude of the oscillation was higher in SD compared with LD conditions. Notably, rhythmic dipeptides were enriched in the glucogenic amino acids that can be converted into glucose. Considering the known role of Target of Rapamycin (TOR) signaling in regulating both autophagy and metabolism, we subsequently investigated whether diurnal fluctuations of dipeptide levels are dependent on the TOR Complex (TORC).
The Raptor1b mutant (raptor1b), known for the substantial reduction of TOR kinase activity, was characterized by the augmented accumulation of dipeptides, which is especially pronounced under LD conditions. We were particularly intrigued by the group of 16 dipeptides, which, based on their oscillation under SD conditions and accumulation in raptor1b, can be associated with limited C availability or photoperiod. By mining existing protein-metabolite interaction data, we delineated putative protein interactors for a representative dipeptide Pro-Gln. The obtained list included enzymes of C and amino acid metabolism, which are also linked to the TORC-mediated metabolic network. Based on the obtained results, we speculate that the diurnal accumulation of dipeptides contributes to metabolic adaptation in response to changes in C availability. We hypothesize that dipeptides would act both as alternative respiratory substrates and by directly modulating the activity of the focal enzymes.
The study of non-coding RNA genes has received increased attention in recent years fuelled by accumulating evidence that larger portions of genomes than previously acknowledged are transcribed into RNA molecules of mostly unknown function, as well as the discovery of novel non-coding RNA types and functional RNA elements. Here, we demonstrate that specific properties of graphs that represent the predicted RNA secondary structure reflect functional information. We introduce a computational algorithm and an associated web-based tool (GraPPLE) for classifying non-coding RNA molecules as functional and, furthermore, into Rfam families based on their graph properties. Unlike sequence-similarity-based methods and covariance models, GraPPLE is demonstrated to be more robust with regard to increasing sequence divergence, and when combined with existing methods, leads to a significant improvement of prediction accuracy. Furthermore, graph properties identified as most informative are shown to provide an understanding as to what particular structural features render RNA molecules functional. Thus, GraPPLE may offer a valuable computational filtering tool to identify potentially interesting RNA molecules among large candidate datasets.
Young Genes out of the Male: An Insight from Evolutionary Age Analysis of the Pollen Transcriptome
(2015)
The birth of new genes in genomes is an important evolutionary event. Several studies reveal that new genes in animals tend to be preferentially expressed in male reproductive tissues such as testis (Betran et al., 2002; Begun et al., 2007; Dubruille et al., 2012), and thus an "out of testis" hypothesis for the emergence of new genes has been proposed (Vinckenbosch et al., 2006; Kaessmann, 2010). However, such phenomena have not been examined in plant species. Here, by employing a phylostratigraphic method, we dated the origin of protein-coding genes in rice and Arabidopsis thaliana and observed a number of young genes in both species. These young genes tend to encode short extracellular proteins, which may be involved in rapid evolving processes, such as reproductive barriers, species specification, and antimicrobial processes. Further analysis of transcriptome age indexes across different tissues revealed that male reproductive cells express a phylogenetically younger transcriptome than other plant tissues. Compared with sporophytic tissues, the young transcriptomes of the male gametophyte displayed greater complexity and diversity, which included a higher ratio of anti-sense and inter-genic transcripts, reflecting a pervasive transcription state that facilitated the emergence of new genes. Here, we propose that pollen may act as an "innovation incubator" for the birth of de novo genes. With cases of male-biased expression of young genes reported in animals, the "new genes out of the male" model revealed a common evolutionary force that drives reproductive barriers, species specification, and the upgrading of defensive mechanisms against pathogens.
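The abstract does not state how the transcriptome age index is computed. A commonly used definition, assumed here, is the expression-weighted mean of gene phylostrata, TAI = sum(ps_i * e_i) / sum(e_i), where ps_i is the phylostratum of gene i (higher numbers meaning evolutionarily younger genes) and e_i its expression level in a given tissue. The Python sketch below follows that assumed definition. The function name, gene identifiers and expression values are hypothetical and are not data from the study.

```python
def transcriptome_age_index(phylostrata, expression):
    """Expression-weighted mean phylostratum of a transcriptome.

    phylostrata: dict gene -> phylostratum (assumed: larger value = younger gene)
    expression:  dict gene -> non-negative expression level in one tissue
    """
    genes = [g for g in phylostrata if g in expression]
    total_expression = sum(expression[g] for g in genes)
    if total_expression == 0:
        raise ValueError("total expression is zero; the index is undefined")
    weighted = sum(phylostrata[g] * expression[g] for g in genes)
    return weighted / total_expression


# Hypothetical toy data: one old gene (phylostratum 1) and one young gene (phylostratum 3).
phylostrata = {"old_gene": 1, "young_gene": 3}
leaf = {"old_gene": 10.0, "young_gene": 10.0}
pollen = {"old_gene": 5.0, "young_gene": 15.0}

print(transcriptome_age_index(phylostrata, leaf))    # 2.0
print(transcriptome_age_index(phylostrata, pollen))  # 2.5
```

Under this convention, a tissue that expresses young genes more strongly obtains a higher index, which is the sense in which a transcriptome can be described as phylogenetically younger.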
The use of automated tools to reconstruct lipid metabolic pathways is not warranted in plants. Here, the authors construct the Plant Lipid Module for the Arabidopsis rosette using constraint-based modeling, demonstrate its integration in other plant metabolic models, and use it to dissect the genetic architecture of lipid metabolism.
Lipids play fundamental roles in regulating agronomically important traits. Advances in plant lipid metabolism have until recently largely been based on reductionist approaches, although modulation of its components can have system-wide effects. However, existing models of plant lipid metabolism provide lumped representations, hindering detailed study of component modulation. Here, we present the Plant Lipid Module (PLM) which provides a mechanistic description of lipid metabolism in the Arabidopsis thaliana rosette. We demonstrate that the PLM can be readily integrated in models of A. thaliana Col-0 metabolism, yielding accurate predictions (83%) of single lethal knock-outs and 75% concordance between measured transcript and predicted flux changes under extended darkness. Genome-wide associations with fluxes obtained by integrating the PLM in diel condition- and accession-specific models identify up to 65 candidate genes modulating A. thaliana lipid metabolism. Using mutant lines, we validate up to 40% of the candidates, paving the way for identification of metabolic gene function based on models capturing natural variability in metabolism.
Bridging metabolomics with plant phenotypic responses is challenging. Multivariate analyses account for the existing dependencies among metabolites, and regression models in particular capture such dependencies in search for association with a given trait. However, special care should be undertaken with metabolomics data. Here we propose a modeling workflow that considers all caveats imposed by such large data sets.
Maize is the cereal crop with the highest production worldwide, and its oil is a key energy resource. Improving the quantity and quality of maize oil requires a better understanding of lipid metabolism. To predict the function of maize genes involved in lipid biosynthesis, we assembled transcriptomic and lipidomic data sets from leaves of B73 and the high-oil line By804 in two distinct time-series experiments. The integrative analysis based on high-dimensional regularized regression yielded lipid-transcript associations indirectly validated by Gene Ontology and promoter motif enrichment analyses. The co-localization of lipid-transcript associations using the genetic mapping of lipid traits in leaves and seedlings of a B73 x By804 recombinant inbred line population uncovered 323 genes involved in the metabolism of phospholipids, galactolipids, sulfolipids and glycerolipids. The resulting association network further supported the involvement of 50 gene candidates in modulating levels of representatives from multiple acyl-lipid classes. Therefore, the proposed approach provides high-confidence candidates for experimental testing in maize and model plant species.
Maize (Zea mays L.) is a staple food whose production relies on seed stocks that largely comprise hybrid varieties. Therefore, knowledge about the molecular determinants of hybrid performance (HP) in the field can be used to devise better performing hybrids to address the demands for sustainable increase in yield. Here, we propose and test a classification-driven framework that uses metabolic profiles from in vitro grown young roots of parental lines from the Dent x Flint maize heterotic pattern to predict field HP. We identify parental analytes that best predict the metabolic inheritance patterns in 328 hybrids. We then demonstrate that these analytes are also predictive of field HP (0.64 <= r <= 0.79) and discriminate hybrids of good performance (accuracy of 87.50%). Therefore, our approach provides a cost-effective solution for hybrid selection programs.
Plants have adapted to the diurnal light-dark cycle by establishing elaborate transcriptional programs that coordinate many metabolic, physiological, and developmental responses to the external environment. These transcriptional programs have been studied in only a few species, and their function and conservation across algae and plants is currently unknown. We performed a comparative transcriptome analysis of the diurnal cycle of nine members of Archaeplastida, and we observed that, despite large phylogenetic distances and dramatic differences in morphology and lifestyle, diurnal transcriptional programs of these organisms are similar. Expression of genes related to cell division and the majority of biological pathways depends on the time of day in unicellular algae but we did not observe such patterns at the tissue level in multicellular land plants. Hence, our study provides evidence for the universality of diurnal gene expression and elucidates its evolutionary history among different photosynthetic eukaryotes.
Spatiotemporal dynamics of the Calvin cycle: multistationarity and symmetry breaking instabilities
(2011)
The possibility of controlling the Calvin cycle has paramount implications for increasing the production of biomass. Multistationarity, as a dynamical feature of systems, is the first obvious candidate whose control could find biotechnological applications. Here we set out to resolve the debate on the multistationarity of the Calvin cycle. Unlike the existing simulation-based studies, our approach is based on a sound mathematical framework, chemical reaction network theory and algebraic geometry, which results in provable results for the investigated model of the Calvin cycle in which we embed a hierarchy of realistic kinetic laws. Our theoretical findings demonstrate that there is a possibility for multistationarity resulting from two sources, homogeneous and inhomogeneous instabilities, which partially settle the debate on multistability of the Calvin cycle. In addition, our tractable analytical treatment of the bifurcation parameters can be employed in the design of validation experiments.
Recent advances in gene function prediction rely on ensemble approaches that integrate results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We have explored and compared two methods to integrate 10 gene co-function networks for Arabidopsis thaliana and demonstrate how the integration of these networks produces more accurate gene function predictions for a larger fraction of genes with unknown function. These predictions were used to identify genes involved in mitochondrial complex I formation, and for five of them, we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet. The methods presented here demonstrate that ensemble gene function prediction is a powerful method to boost prediction performance, whereas the EnsembleNet database provides a cutting-edge community tool to guide experimentalists.
Trade-offs are inherent to biochemical networks governing diverse cellular functions, from gene expression to metabolism. Yet, trade-offs between fluxes of biochemical reactions in a metabolic network have not been formally studied. Here, we introduce the concept of absolute flux trade-offs and devise a constraint-based approach, termed FluTO, to identify and enumerate flux trade-offs in a given genome-scale metabolic network. By employing the metabolic networks of Escherichia coli and Saccharomyces cerevisiae, we demonstrate that the flux trade-offs are specific to carbon sources provided but that reactions involved in the cofactor and prosthetic group biosynthesis are present in trade-offs across all carbon sources supporting growth. We also show that absolute flux trade-offs depend on the biomass reaction used to model the growth of Arabidopsis thaliana under different carbon and nitrogen conditions. The identified flux trade-offs reflect the tight coupling between nitrogen, carbon, and sulphur metabolisms in leaves of C3 plants. Altogether, FluTO provides the means to explore the space of alternative metabolic routes reflecting the constraints imposed by inherent flux trade-offs in large-scale metabolic networks.
We investigate the properties of a recently introduced asymmetric association measure, called inner composition alignment (IOTA), aimed at inferring regulatory links (couplings). We show that the measure can be used to determine the direction of coupling, detect superfluous links, and to account for autoregulation. In addition, the measure can be extended to infer the type of regulation (positive or negative). The capabilities of IOTA to correctly infer couplings together with their directionality are compared against Kendall's rank correlation for time series of different lengths, particularly focussing on biological examples. We demonstrate that an extended version of the measure, bidirectional inner composition alignment (biIOTA), increases the accuracy of the network reconstruction for short time series. Finally, we discuss the applicability of the measure to infer couplings in chaotic systems.
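As a point of reference for the baseline named in this abstract, the following is a minimal sketch of Kendall's rank correlation (the tau-a variant, which ignores ties). The function name `kendall_tau` and the toy series are illustrative assumptions, not taken from the paper, and IOTA itself is not implemented here. The sketch also makes visible why a symmetric measure cannot indicate the direction of a coupling: swapping the two series leaves the value unchanged.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of the product tells whether both series move the same way.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical short expression profiles of two genes.
gene_a = [0.1, 0.4, 0.9, 1.3]
gene_b = [2.0, 2.2, 2.1, 3.0]
tau_ab = kendall_tau(gene_a, gene_b)
tau_ba = kendall_tau(gene_b, gene_a)  # identical by symmetry
```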
Background: Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications.
Results: Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.
Conclusions: Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures that are rooted in the study of symbolic dynamics or ranks, in contrast to the application of common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol based measures (for low noise level) and rank based measures (for high noise level) are the most suitable choices.
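The comparison described above rests on sensitivity and specificity of the reconstructed networks. A minimal sketch of how these two quantities could be computed for directed edge sets follows; the function name, the exclusion of self-loops, and the toy network are assumptions made for illustration and do not reproduce the study's evaluation pipeline.

```python
def sensitivity_specificity(true_edges, predicted_edges, nodes):
    """Score a predicted directed network against a reference network.

    All ordered pairs of distinct nodes are treated as candidate edges;
    self-loops are excluded (an assumption of this sketch).
    """
    true_edges, predicted_edges = set(true_edges), set(predicted_edges)
    candidates = {(a, b) for a in nodes for b in nodes if a != b}
    tp = len(true_edges & predicted_edges)   # correctly predicted links
    fn = len(true_edges - predicted_edges)   # missed links
    fp = len(predicted_edges - true_edges)   # spurious links
    tn = len(candidates - true_edges - predicted_edges)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical three-gene example.
nodes = ["a", "b", "c"]
reference = {("a", "b"), ("b", "c")}
predicted = {("a", "b"), ("c", "a")}
sens, spec = sensitivity_specificity(reference, predicted, nodes)
```

Because edges are ordered pairs, a link predicted in the wrong direction counts as both a false positive and a false negative, which is one way to reward methods that recover directionality.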
Identifying causal links (couplings) is a fundamental problem that facilitates the understanding of emerging structures in complex networks. We propose and analyze inner composition alignment, a novel permutation-based asymmetric association measure, to detect regulatory links from very short time series, currently applied to gene expression. The measure can be used to infer the direction of couplings, detect indirect (superfluous) links, and account for autoregulation. Applications to the gene regulatory network of E. coli are presented.
Quantification of reaction fluxes of metabolic networks can help us understand how the integration of different metabolic pathways determines cellular functions. Yet, intracellular fluxes cannot be measured directly but are estimated with metabolic flux analysis (MFA), which relies on the patterns of isotope labeling of metabolites in the network. The application of MFA also requires a stoichiometric model with atom mappings that are currently not available for the majority of large-scale metabolic network models, particularly of plants. While automated approaches such as the Reaction Decoder Toolkit (RDT) can produce atom mappings for individual reactions, tracing the flow of individual atoms of the entire reactions across a metabolic model remains challenging. Here we establish an automated workflow to obtain reliable atom mappings for large-scale metabolic models by refining the outcome of RDT, and apply the workflow to metabolic models of Arabidopsis thaliana. We demonstrate the accuracy of RDT through a comparative analysis with atom mappings from a large database of biochemical reactions, MetaCyc. We further show the utility of our automated workflow by simulating N-15 isotope enrichment and identifying nitrogen (N)-containing metabolites which show enrichment patterns that are informative for flux estimation in future N-15-MFA studies of A. thaliana. The automated workflow established in this study can be readily expanded to other species for which metabolic models have been established and the resulting atom mappings will facilitate MFA and graph-theoretic structural analyses with large-scale metabolic networks.
Plastid ribosomes are very similar in structure and function to the ribosomes of their bacterial ancestors. Since ribosome biogenesis is not thermodynamically favorable under biological conditions it requires the activity of many assembly factors. Here we have characterized a homolog of bacterial RsgA in Arabidopsis thaliana and show that it can complement the bacterial homolog. Functional characterization of a strong mutant in Arabidopsis revealed that the protein is essential for plant viability, while a weak mutant produced dwarf, chlorotic plants that incorporated immature pre-16S ribosomal RNA into translating ribosomes. Physiological analysis of the mutant plants revealed smaller, but more numerous, chloroplasts in the mesophyll cells, reduction of chlorophyll a and b, depletion of proplastids from the rib meristem and decreased photosynthetic electron transport rate and efficiency. Comparative RNA sequencing and proteomic analysis of the weak mutant and wild-type plants revealed that various biotic stress-related, transcriptional regulation and post-transcriptional modification pathways were repressed in the mutant. Intriguingly, while nuclear- and chloroplast-encoded photosynthesis-related proteins were less abundant in the mutant, the corresponding transcripts were increased, suggesting an elaborate compensatory mechanism, potentially via differentially active retrograde signaling pathways. To conclude, this study reveals a chloroplast ribosome assembly factor and outlines the transcriptomic and proteomic responses of the compensatory mechanism activated during decreased chloroplast function.
Significance Statement: AtRsgA is an assembly factor necessary for maturation of the small subunit of the chloroplast ribosome. Depletion of AtRsgA leads to dwarfed, chlorotic plants, a decrease of mature 16S rRNA and smaller, but more numerous, chloroplasts. Large-scale transcriptomic and proteomic analysis revealed that chloroplast-encoded and -targeted proteins were less abundant, while the corresponding transcripts were increased in the mutant. We analyze the transcriptional responses of several retrograde signaling pathways to suggest the mechanism underlying this compensatory response.
Understanding the strategies employed by plant species that live in extreme environments offers the possibility to discover stress tolerance mechanisms. We studied the physiological, antioxidant and metabolic responses to three temperature conditions (4, 15, and 23 °C) of Colobanthus quitensis (CQ), one of the only two native vascular species in Antarctica. We also employed Dianthus chinensis (DC), to assess the effects of the treatments in a non-Antarctic species from the same family. Using fused LASSO modelling, we associated physiological and biochemical antioxidant responses with primary metabolism. This approach allowed us to highlight the metabolic pathways driving the response specific to CQ. Low temperature imposed dramatic reductions in photosynthesis (up to 88%) but not in respiration (sustaining rates of 3.0-4.2 µmol CO2 m⁻² s⁻¹) in CQ, and no change in the physiological stress parameters was found. Its notable antioxidant capacity and mitochondrial cytochrome respiratory activity (20 and two times higher than DC, respectively), which ensure ATP production even at low temperature, were significantly associated with sulphur-containing metabolites and polyamines. Our findings potentially open new biotechnological opportunities regarding the role of antioxidant compounds and respiratory mechanisms associated with sulphur metabolism in stress tolerance strategies to low temperature.
Background: There are alternative substrates to the mitochondrial respiration.
Results: Data-driven model-based analysis renders predictions of alternative substrates to the mitochondrial respiration.
Conclusion: Metabolomics data in conjunction with flux-based models can discriminate among hypotheses based on enzymology alone.
Significance: This analysis provides a basic framework for in silico studies of alternative pathways in metabolism.
Dynamic regulatory on/off minimization for biological systems under internal temporal perturbations
(2012)
Background: Flux balance analysis (FBA), together with its extension, dynamic FBA, has proven instrumental for analyzing the robustness and dynamics of metabolic networks by employing only the stoichiometry of the included reactions coupled with an adequately chosen objective function. In addition, under the assumption of minimization of metabolic adjustment, dynamic FBA has recently been employed to analyze the transition between metabolic states.
Results: Here, we propose a suite of novel methods for analyzing the dynamics of (internally perturbed) metabolic networks and for quantifying their robustness with limited knowledge of kinetic parameters. Following the biochemically meaningful premise that metabolite concentrations exhibit smooth temporal changes, the proposed methods rely on minimizing the significant fluctuations of metabolic profiles to predict the time-resolved metabolic state, characterized by both fluxes and concentrations. By conducting a comparative analysis with a kinetic model of the Calvin-Benson cycle and a model of plant carbohydrate metabolism, we demonstrate that the principle of regulatory on/off minimization coupled with dynamic FBA can accurately predict the changes in metabolic states.
Conclusions: Our methods outperform the existing dynamic FBA-based modeling alternatives, and could help in revealing the mechanisms for maintaining robustness of dynamic processes in metabolic networks over time.
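FBA-type methods such as those discussed above build on the steady-state mass balance S·v = 0, where S is the stoichiometric matrix and v a flux vector. The following is a minimal sketch of that constraint check only; it is not the regulatory on/off minimization method proposed in the paper, and the three-reaction toy network, the function name, and the tolerance are assumptions made for illustration.

```python
def is_steady_state(S, v, tol=1e-9):
    """Return True if every metabolite balance (row of S times v) is zero."""
    return all(
        abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol
        for row in S
    )


# Hypothetical linear pathway: uptake -> A -> B -> export.
# Rows are metabolites (A, B); columns are reactions (uptake, A->B, export).
S = [
    [1, -1, 0],   # A is produced by uptake and consumed by A->B
    [0, 1, -1],   # B is produced by A->B and consumed by export
]

balanced = is_steady_state(S, [2.0, 2.0, 2.0])    # all fluxes equal
unbalanced = is_steady_state(S, [2.0, 1.0, 1.0])  # A would accumulate
```

In FBA proper, a linear program would then search among all flux vectors passing this check (and flux bounds) for one that optimizes the chosen objective.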
Recent advances in high-throughput omics techniques make it possible to decode the function of genes by using the "guilt-by-association" principle on biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottleneck issues: (1) the choice of the number of clusters, and (2) the external measures which do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address the identified bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is shown to be valid on synthetic data and available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction which relies on the clustering of optimal score and the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set.
Characterization of maximal enzyme catalytic rates in central metabolism of Arabidopsis thaliana
(2020)
Availability of plant-specific enzyme kinetic data is scarce, limiting the predictive power of metabolic models and precluding identification of genetic factors of enzyme properties. Enzyme kinetic data are measured in vitro, often under non-physiological conditions, and conclusions elicited from modeling warrant caution. Here we estimate maximal in vivo catalytic rates for 168 plant enzymes, including photosystems I and II, cytochrome-b6f complex, ATP-citrate synthase, sucrose-phosphate synthase as well as enzymes from amino acid synthesis with previously undocumented enzyme kinetic data in BRENDA. The estimations are obtained by integrating condition-specific quantitative proteomics data, maximal rates of selected enzymes, growth measurements from Arabidopsis thaliana rosettes with fluxes through canonical pathways in a constraint-based model of leaf metabolism. In comparison to findings in Escherichia coli, we demonstrate weaker concordance between the plant-specific in vitro and in vivo enzyme catalytic rates due to a low degree of enzyme saturation. This is supported by the finding that concentrations of nicotinamide adenine dinucleotide (phosphate), adenosine triphosphate and uridine triphosphate, calculated based on our maximal in vivo catalytic rates and available quantitative metabolomics data, are below reported KM values and, therefore, indicate undersaturation of respective enzymes. Our findings show that genome-wide profiling of enzyme kinetic properties is feasible in plants, paving the way for understanding resource allocation.
Understanding the complexity of metabolic networks has implications for manipulation of their functions. The complexity of metabolic networks can be characterized by identifying multireaction dependencies that are challenging to determine due to the sheer number of combinations to consider. Here, we propose the concept of concordant complexes that captures multireaction dependencies and can be efficiently determined from the algebraic structure and operational constraints of metabolic networks. The concordant complexes imply the existence of concordance modules based on which the apparent complexity of 12 metabolic networks of organisms from all kingdoms of life can be reduced by at least 78%. A comparative analysis against an ensemble of randomized metabolic networks shows that the metabolic network of Escherichia coli contains fewer concordance modules and is, therefore, more tightly coordinated than expected by chance. Together, our findings demonstrate that metabolic networks are considerably simpler than what can be perceived from their structure alone.
Successfully designed and implemented plant-specific synthetic metabolic pathways hold promise to increase crop yield and nutritional value. Advances in synthetic biology have already demonstrated the capacity to design artificial biological pathways whose behavior can be predicted and controlled in microbial systems. However, the transfer of these advances to model plants and crops faces the lack of characterization of plant cellular pathways and increased complexity due to compartmentalization and multicellularity. Modern computational developments provide the means to test the feasibility of plant synthetic metabolic pathways despite gaps in the accumulated knowledge of plant metabolism. Here, we provide a succinct systematic review of optimization-based and retrobiosynthesis approaches that can be used to design and in silico test synthetic metabolic pathways in large-scale plant context-specific metabolic models. In addition, by surveying the existing case studies, we highlight the challenges that these approaches face when applied to plants. Emphasis is placed on understanding the effect that metabolic designs can have on native metabolism, particularly with respect to metabolite concentrations and thermodynamics of biochemical reactions. In addition, we discuss the computational developments that may help to transform the identified challenges into opportunities for plant synthetic biology.
Cells and organelles are not homogeneous but include microcompartments that alter the spatiotemporal characteristics of cellular processes. The effects of microcompartmentation on metabolic pathways are however difficult to study experimentally. The pyrenoid is a microcompartment that is essential for a carbon concentrating mechanism (CCM) that improves the photosynthetic performance of eukaryotic algae. Using Chlamydomonas reinhardtii, we obtained experimental data on photosynthesis, metabolites, and proteins in CCM-induced and CCM-suppressed cells. We then employed a computational strategy to estimate how fluxes through the Calvin-Benson cycle are compartmented between the pyrenoid and the stroma. Our model predicts that ribulose-1,5-bisphosphate (RuBP), the substrate of Rubisco, and 3-phosphoglycerate (3PGA), its product, diffuse in and out of the pyrenoid, respectively, with higher fluxes in CCM-induced cells. It also indicates that there is no major diffusional barrier to metabolic flux between the pyrenoid and stroma. Our computational approach represents a stepping stone to understanding microcompartmentalized CCM in other organisms.
Large-scale biochemical models are increasing in size due to the consideration of interacting organisms and tissues. Model reduction approaches that preserve the flux phenotypes can simplify the analysis and predictions of steady-state metabolic phenotypes. However, existing approaches either restrict functionality of reduced models or do not lead to significant decreases in the number of modelled metabolites. Here, we introduce an approach for model reduction based on the structural property of balancing of complexes that preserves the steady-state fluxes supported by the network and can be efficiently determined at genome scale. Using two large-scale mass-action kinetic models of Escherichia coli, we show that our approach results in a substantial reduction of 99% of metabolites. Applications to genome-scale metabolic models across kingdoms of life result in up to 55% and 85% reduction in the number of metabolites when arbitrary and mass-action kinetics are assumed, respectively. We also show that predictions of the specific growth rate from the reduced models match those based on the original models. Since steady-state flux phenotypes from the original model are preserved in the reduced model, the approach paves the way for analysing other metabolic phenotypes in large-scale biochemical networks.
The ability of an organism to change its phenotype in response to different environments, termed plasticity, is a particularly important characteristic to enable sessile plants to adapt to rapid changes in their surroundings. Plasticity is a quantitative trait that can provide a fitness advantage and mitigate negative effects due to environmental perturbations. Yet, its genetic basis is not fully understood. Alongside technological limitations, the main challenge in studying plasticity has been the selection of suitable approaches for quantification of phenotypic plasticity. Here, we propose a categorization of the existing quantitative measures of phenotypic plasticity into nominal and relative approaches. Moreover, we highlight the recent advances in the understanding of the genetic architecture underlying phenotypic plasticity in plants. We identify four pillars for future research to uncover the genetic basis of phenotypic plasticity, with emphasis on development of computational approaches and theories. These developments will allow us to perform specific experiments to validate the causal genes for plasticity and to discover their role in plant fitness and evolution.
Understanding the structure of reaction networks along with the underlying kinetics that lead to particular concentration readouts of the participating components is the first step toward optimization and control of (bio-)chemical processes. Yet, solutions to the problem of inferring the structure of reaction networks, i.e., characterizing the stoichiometry of the participating reactions provided concentration profiles of the participating components, remain elusive. Here, we present an approach to infer the stoichiometric subspace of a chemical reaction network from steady-state concentration data profiles obtained from a continuous isothermal reactor. The subsequent problem of finding reactions consistent with the observed subspace is cast as a series of mixed-integer linear programs whose solution generates potential reaction vectors together with a measure of their likelihood. We demonstrate the efficiency and applicability of the proposed approach using data obtained from synthetic reaction networks and from a well-established biological model for the Calvin-Benson cycle. Furthermore, we investigate the effect of missing information, in the form of unmeasured species or insufficient diversity within the data set, on the ability to accurately reconstruct the network reactions. The proposed framework is, in principle, applicable to many other reaction systems, thus providing future extensions to understanding reaction networks guiding chemical reactors and complex biological mixtures. (C) 2019 Author(s).
Motivation: Metabolic engineering aims at modulating the capabilities of metabolic networks by changing the activity of biochemical reactions. The existing constraint-based approaches for metabolic engineering have proven useful, but are limited only to reactions catalogued in various pathway databases.
Results: We consider the alternative of designing synthetic strategies which can be used not only to characterize the maximum theoretically possible product yield but also to engineer networks with optimal conversion capability by using a suitable biochemically feasible reaction called 'stoichiometric capacitance'. In addition, we provide a theoretical solution for decomposing a given stoichiometric capacitance over a set of known enzymatic reactions. We determine the stoichiometric capacitance for genome-scale metabolic networks of 10 organisms from different kingdoms of life and examine its implications for the alterations in flux variability patterns. Our empirical findings suggest that the theoretical capacity of metabolic networks comes at a cost of dramatic system's changes.
Describing the determinants of robustness of biological systems has become one of the central questions in systems biology. Despite the increasing research efforts, it has proven difficult to arrive at a unifying definition for this important concept. We argue that this is due to the multifaceted nature of the concept of robustness and the possibility to formally capture it at different levels of systemic formalisms (e.g., topology and dynamic behavior). Here we provide a comprehensive review of the existing definitions of robustness pertaining to metabolic networks. As kinetic approaches have been excellently reviewed elsewhere, we focus on definitions of robustness proposed within graph-theoretic and constraint-based formalisms.