TY  - JOUR
A1  - Calderan-Rodrigues, Maria Juliana
A1  - Luzarowski, Marcin
A1  - Monte-Bello, Carolina Cassano
A1  - Minen, Romina Ines
A1  - Zühlke, Boris M.
A1  - Nikoloski, Zoran
A1  - Skirycz, Aleksandra
A1  - Caldana, Camila
T1  - Proteogenic dipeptides are characterized by diel fluctuations and target of rapamycin complex-signaling dependency in the model plant Arabidopsis thaliana
JF  - Frontiers in plant science : FPLS
N2  - As autotrophic organisms, plants capture light energy to convert carbon dioxide into ATP, nicotinamide adenine dinucleotide phosphate (NADPH), and sugars, which are essential for the biosynthesis of building blocks, storage, and growth. At night, metabolism and growth can be sustained by mobilizing carbon (C) reserves. In response to changing environmental conditions, such as light-dark cycles, the small-molecule regulation of enzymatic activities is critical for reprogramming cellular metabolism. We have recently demonstrated that proteogenic dipeptides, protein degradation products, act as metabolic switches at the interface of proteostasis and central metabolism in both plants and yeast. Dipeptides accumulate in response to environmental changes and act via direct binding and regulation of critical enzymatic activities, enabling C flux distribution. Here, we provide evidence pointing to the involvement of dipeptides in the metabolic rewiring characteristic of the day-night cycle in plants. Specifically, we measured the abundance of 13 amino acids and 179 dipeptides over short- (SD) and long-day (LD) diel cycles, each with different light intensities. Of the measured dipeptides, 38 and eight were characterized by day-night oscillation in SD and LD, respectively, reaching maximum accumulation at the end of the day and then gradually falling in the night. Not only the number of dipeptides, but also the amplitude of the oscillation was higher in SD compared with LD conditions. Notably, rhythmic dipeptides were enriched in the glucogenic amino acids that can be converted into glucose. Considering the known role of Target of Rapamycin (TOR) signaling in regulating both autophagy and metabolism, we subsequently investigated whether diurnal fluctuations of dipeptide levels are dependent on the TOR Complex (TORC). The Raptor1b mutant (raptor1b), known for the substantial reduction of TOR kinase activity, was characterized by the augmented accumulation of dipeptides, which was especially pronounced under LD conditions. We were particularly intrigued by the group of 16 dipeptides, which, based on their oscillation under SD conditions and accumulation in raptor1b, can be associated with limited C availability or photoperiod. By mining existing protein-metabolite interaction data, we delineated putative protein interactors for a representative dipeptide Pro-Gln. The obtained list included enzymes of C and amino acid metabolism, which are also linked to the TORC-mediated metabolic network. Based on the obtained results, we speculate that the diurnal accumulation of dipeptides contributes to metabolic adaptation in response to changes in C availability. We hypothesize that dipeptides act both as alternative respiratory substrates and as direct modulators of the activity of the focal enzymes.
KW  - dipeptide
KW  - diel cycle
KW  - metabolism
KW  - TOR signaling
KW  - protein-metabolite interactions
KW  - carbon limitation
KW  - amino acid
Y1  - 2021
U6  - https://doi.org/10.3389/fpls.2021.758933
SN  - 1664-462X
VL  - 12
PB  - Frontiers Media
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Huß, Sebastian
A1  - Judd, Rika Siedah
A1  - Koper, Kaan
A1  - Maeda, Hiroshi A.
A1  - Nikoloski, Zoran
T1  - An automated workflow that generates atom mappings for large-scale metabolic models and its application to Arabidopsis thaliana
JF  - The plant journal
N2  - Quantification of reaction fluxes of metabolic networks can help us understand how the integration of different metabolic pathways determines cellular functions. Yet, intracellular fluxes cannot be measured directly but are estimated with metabolic flux analysis (MFA), which relies on the patterns of isotope labeling of metabolites in the network. The application of MFA also requires a stoichiometric model with atom mappings that are currently not available for the majority of large-scale metabolic network models, particularly of plants. While automated approaches such as the Reaction Decoder Toolkit (RDT) can produce atom mappings for individual reactions, tracing the flow of individual atoms of the entire reactions across a metabolic model remains challenging. Here we establish an automated workflow to obtain reliable atom mappings for large-scale metabolic models by refining the outcome of RDT, and apply the workflow to metabolic models of Arabidopsis thaliana. We demonstrate the accuracy of RDT through a comparative analysis with atom mappings from a large database of biochemical reactions, MetaCyc. We further show the utility of our automated workflow by simulating 15N isotope enrichment and identifying nitrogen (N)-containing metabolites which show enrichment patterns that are informative for flux estimation in future 15N-MFA studies of A. thaliana. The automated workflow established in this study can be readily expanded to other species for which metabolic models have been established and the resulting atom mappings will facilitate MFA and graph-theoretic structural analyses with large-scale metabolic networks.
KW  - atom mapping
KW  - genome-scale metabolic model
KW  - isotopic labeling
KW  - metabolic flux analysis
KW  - technical advance
Y1  - 2022
U6  - https://doi.org/10.1111/tpj.15903
SN  - 0960-7412
SN  - 1365-313X
VL  - 111
IS  - 5
SP  - 1486
EP  - 1500
PB  - Wiley-Blackwell
CY  - Oxford [u.a.]
ER  - 
TY  - JOUR
A1  - Omranian, Sara
A1  - Angeleska, Angela
A1  - Nikoloski, Zoran
T1  - Efficient and accurate identification of protein complexes from protein-protein interaction networks based on the clustering coefficient
JF  - Computational and structural biotechnology journal
N2  - Identification of protein complexes from protein-protein interaction (PPI) networks is a key problem in PPI mining, solved by parameter-dependent approaches that suffer from small recall rates. Here we introduce GCC-v, a family of efficient, parameter-free algorithms to accurately predict protein complexes using the (weighted) clustering coefficient of proteins in PPI networks. Through comparative analyses with gold standards and PPI networks from Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens, we demonstrate that GCC-v outperforms twelve state-of-the-art approaches for identification of protein complexes with respect to twelve performance measures in at least 85.71% of scenarios. We also show that GCC-v results in the exact recovery of ~35% of protein complexes in a pan-plant PPI network and discover 144 new protein complexes in Arabidopsis thaliana, with high support from GO semantic similarity. Our results indicate that findings from GCC-v are robust to network perturbations, which has direct implications to assess the impact of the PPI network quality on the predicted protein complexes. (C) 2021 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
KW  - Protein complexes
KW  - Protein-protein interaction
KW  - Network clustering
KW  - Species comparison
Y1  - 2021
U6  - https://doi.org/10.1016/j.csbj.2021.09.014
SN  - 2001-0370
VL  - 19
SP  - 5255
EP  - 5263
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Tong, Hao
A1  - Nankar, Amol N.
A1  - Liu, Jintao
A1  - Todorova, Velichka
A1  - Ganeva, Daniela
A1  - Grozeva, Stanislava
A1  - Tringovska, Ivanka
A1  - Pasev, Gancho
A1  - Radeva-Ivanova, Vesela
A1  - Gechev, Tsanko
A1  - Kostova, Dimitrina
A1  - Nikoloski, Zoran
T1  - Genomic prediction of morphometric and colorimetric traits in Solanaceous fruits
JF  - Horticulture research
N2  - Selection of high-performance lines with respect to traits of interest is a key step in plant breeding. Genomic prediction allows the determination of the genomic estimated breeding values of unseen lines for traits of interest using genetic markers, e.g. single-nucleotide polymorphisms (SNPs), and machine learning approaches, which can therefore shorten breeding cycles, an approach referred to as genomic selection (GS). Here, we applied GS approaches in two populations of Solanaceous crops, i.e. tomato and pepper, to predict morphometric and colorimetric traits. The traits were measured by using scoring-based conventional descriptors (CDs) as well as by the Tomato Analyzer (TA) tool using images of longitudinally and latitudinally cut fruits. The GS performance was assessed in cross-validations of classification-based and regression-based machine learning models for CD and TA traits, respectively. The results showed that the use of TA traits and tag SNPs provides a powerful combination to predict morphology and color-related traits of Solanaceous fruits. The highest predictability of 0.89 was achieved for fruit width in pepper, with an average predictability of 0.69 over all traits. The multi-trait GS models show slightly better predictability than single-trait models for some colorimetric traits in pepper. While model validation performs poorly on wild tomato accessions, the use of even one accession per wild species in the training set can increase the transferability of models to unseen populations for some traits (e.g. fruit shape, for which predictability in the unseen scenario increased from zero to 0.6). Overall, GS approaches can assist the selection of high-performance Solanaceous fruits in crop breeding.
Y1  - 2022
U6  - https://doi.org/10.1093/hr/uhac072
SN  - 2052-7276
VL  - 9
PB  - Oxford Univ. Press
CY  - Cary
ER  - 
TY  - JOUR
A1  - Omranian, Sara
A1  - Nikoloski, Zoran
A1  - Grimm, Dominik G.
T1  - Computational identification of protein complexes from network interactions
BT  - Present state, challenges, and the way forward
JF  - Computational and structural biotechnology journal
N2  - Physically interacting proteins form macromolecule complexes that drive diverse cellular processes. Advances in experimental techniques that capture interactions between proteins provide us with protein-protein interaction (PPI) networks from several model organisms. These datasets have enabled the prediction and other computational analyses of protein complexes. Here we provide a systematic review of the state-of-the-art algorithms for protein complex prediction from PPI networks proposed in the past two decades. The existing approaches that solve this problem are categorized into three groups: cluster-quality-based, node affinity-based, and network embedding-based approaches, and we compare and contrast their advantages and disadvantages. We further include a comparative analysis by computing the performance of eighteen methods based on twelve well-established performance measures on four widely used benchmark protein-protein interaction networks. Finally, the limitations and drawbacks of both current data and approaches, along with the potential solutions in this field, are discussed, with emphasis on the points that pave the way for future research efforts. (c) 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
KW  - Protein Complex Prediction
KW  - Protein-Protein interaction network
KW  - Network
KW  - Clustering Algorithms
KW  - Network embedding
Y1  - 2022
U6  - https://doi.org/10.1016/j.csbj.2022.05.049
SN  - 2001-0370
VL  - 20
SP  - 2699
EP  - 2712
PB  - Research Network of Computational and Structural Biotechnology (RNCSB)
CY  - Gothenburg
ER  - 
TY  - JOUR
A1  - Angeleska, Angela
A1  - Omranian, Sara
A1  - Nikoloski, Zoran
T1  - Coherent network partitions
BT  - Characterizations with cographs and prime graphs
JF  - Theoretical computer science : the journal of the EATCS
N2  - We continue to study coherent partitions of graphs whereby the vertex set is partitioned into subsets that induce biclique spanned subgraphs. The problem of identifying the minimum number of edges to obtain biclique spanned connected components (CNP), called the coherence number, is NP-hard even on bipartite graphs. Here, we propose a graph transformation geared towards obtaining an O(log n)-approximation algorithm for the CNP on a bipartite graph with n vertices. The transformation is inspired by a new characterization of biclique spanned subgraphs. In addition, we study coherent partitions on prime graphs, and show that finding coherent partitions reduces to the problem of finding coherent partitions in a prime graph. Therefore, these results provide future directions for approximation algorithms for the coherence number of a given graph.
KW  - Graph partitions
KW  - Network clustering
KW  - Cographs
KW  - Coherent partition
KW  - Prime graphs
Y1  - 2021
U6  - https://doi.org/10.1016/j.tcs.2021.10.002
SN  - 0304-3975
VL  - 894
SP  - 3
EP  - 11
PB  - Elsevier
CY  - Amsterdam [u.a.]
ER  - 
TY  - JOUR
A1  - Wendering, Philipp
A1  - Nikoloski, Zoran
T1  - COMMIT
BT  - Consideration of metabolite leakage and community composition improves microbial community reconstructions
JF  - PLoS Computational Biology : a new community journal / publ. by the Public Library of Science (PLoS) in association with the International Society for Computational Biology (ISCB)
N2  - Composition and functions of microbial communities affect important traits in diverse hosts, from crops to humans. Yet, mechanistic understanding of how metabolism of individual microbes is affected by the community composition and metabolite leakage is lacking. Here, we first show that the consensus of automatically generated metabolic reconstructions improves the quality of the draft reconstructions, measured by comparison to reference models. We then devise an approach for gap filling, termed COMMIT, that considers metabolites for secretion based on their permeability and the composition of the community. By applying COMMIT with two soil communities from the Arabidopsis thaliana culture collection, we could significantly reduce the gap-filling solution in comparison to filling gaps in individual reconstructions without affecting the genomic support. Inspection of the metabolic interactions in the soil communities allows us to identify microbes with community roles of helpers and beneficiaries. Therefore, COMMIT offers a versatile fully automated solution for large-scale modelling of microbial communities for diverse biotechnological applications. Author summary: Microbial communities are important in ecology, human health, and crop productivity. However, detailed information on the interactions within natural microbial communities is hampered by the community size, lack of detailed information on the biochemistry of single organisms, and the complexity of interactions between community members. Metabolic models are comprised of biochemical reaction networks based on the genome annotation, and can provide mechanistic insights into community functions. Previous analyses of microbial community models have been performed with high-quality reference models or models generated using a single reconstruction pipeline. However, these models do not contain information on the composition of the community that determines the metabolites exchanged between the community members. In addition, the quality of metabolic models is affected by the reconstruction approach used, with direct consequences on the inferred interactions between community members. Here, we use fully automated consensus reconstructions from four approaches to arrive at functional models with improved genomic support while considering the community composition. We applied our pipeline to two soil communities from the Arabidopsis thaliana culture collection, providing only genome sequences. Finally, we show that the obtained models have 90% genomic support and demonstrate that the derived interactions are corroborated by independent computational predictions.
Y1  - 2022
U6  - https://doi.org/10.1371/journal.pcbi.1009906
SN  - 1553-734X
SN  - 1553-7358
VL  - 18
IS  - 3
PB  - Public Library of Science
CY  - San Francisco
ER  - 
TY  - JOUR
A1  - Tong, Hao
A1  - Küken, Anika
A1  - Razaghi-Moghadam, Zahra
A1  - Nikoloski, Zoran
T1  - Characterization of effects of genetic variants via genome-scale metabolic modelling
JF  - Cellular and molecular life sciences : CMLS
N2  - Genome-scale metabolic networks for model plants and crops in combination with approaches from the constraint-based modelling framework have been used to predict metabolic traits and design metabolic engineering strategies for their manipulation. With the advances in technologies to generate large-scale genotyping data from natural diversity panels and other populations, genome-wide association and genomic selection have emerged as statistical approaches to determine genetic variants associated with and predictive of traits. Here, we review recent advances in constraint-based approaches that integrate genetic variants in genome-scale metabolic models to characterize their effects on reaction fluxes. Since some of these approaches have been applied in organisms other than plants, we provide a critical assessment of their applicability particularly in crops. In addition, we further dissect the inferred effects of genetic variants with respect to reaction rate constants, abundances of enzymes, and concentrations of metabolites, as main determinants of reaction fluxes and relate them with their combined effects on complex traits, like growth. Through this systematic review, we also provide a roadmap for future research to increase the predictive power of statistical approaches by coupling them with mechanistic models of metabolism.
KW  - Single-nucleotide polymorphisms
KW  - Metabolic models
KW  - Genome-wide association studies
KW  - Genomic selection
Y1  - 2021
U6  - https://doi.org/10.1007/s00018-021-03844-4
SN  - 1420-682X
SN  - 1420-9071
VL  - 78
IS  - 12
SP  - 5123
EP  - 5138
PB  - Springer International Publishing AG
CY  - Cham
ER  - 
TY  - JOUR
A1  - Mbebi, Alain J.
A1  - Tong, Hao
A1  - Nikoloski, Zoran
T1  - L2,1-norm regularized multivariate regression model with applications to genomic prediction
JF  - Bioinformatics
N2  - Motivation: Genomic selection (GS) is currently deemed the most effective approach to speed up breeding of agricultural varieties. It has been recognized that consideration of multiple traits in GS can improve accuracy of prediction for traits of low heritability. However, since GS forgoes statistical testing with the idea of improving predictions, it does not facilitate mechanistic understanding of the contribution of particular single nucleotide polymorphisms (SNPs). Results: Here, we propose an L2,1-norm regularized multivariate regression model and devise a fast and efficient iterative optimization algorithm, called L2,1-joint, applicable in multi-trait GS. The usage of the L2,1-norm facilitates variable selection in a penalized multivariate regression that considers the relation between individuals, when the number of SNPs is much larger than the number of individuals. The capacity for variable selection allows us to define master regulators that can be used in a multi-trait GS setting to dissect the genetic architecture of the analyzed traits. Our comparative analyses demonstrate that the proposed model is a favorable candidate compared to existing state-of-the-art approaches. Prediction and variable selection with datasets from Brassica napus, wheat and Arabidopsis thaliana diversity panels are conducted to further showcase the performance of the proposed model.
Y1  - 2021
U6  - https://doi.org/10.1093/bioinformatics/btab212
SN  - 1367-4803
SN  - 1460-2059
VL  - 37
IS  - 18
SP  - 2896
EP  - 2904
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Lyall, Rafe
A1  - Nikoloski, Zoran
A1  - Gechev, Tsanko
T1  - Comparative analysis of ROS network genes in extremophile Eukaryotes
JF  - International journal of molecular sciences
N2  - The reactive oxygen species (ROS) gene network, consisting of both ROS-generating and detoxifying enzymes, adjusts ROS levels in response to various stimuli. We performed a cross-kingdom comparison of ROS gene networks to investigate how they have evolved across all Eukaryotes, including protists, fungi, plants and animals. We included the genomes of 16 extremotolerant Eukaryotes to gain insight into ROS gene evolution in organisms that experience extreme stress conditions. Our analysis focused on ROS genes found in all Eukaryotes (such as catalases, superoxide dismutases, glutathione reductases, peroxidases and glutathione peroxidase/peroxiredoxins) as well as those specific to certain groups, such as ascorbate peroxidases, dehydroascorbate/monodehydroascorbate reductases in plants and other photosynthetic organisms. ROS-producing NADPH oxidases (NOX) were found in most multicellular organisms, although several NOX-like genes were identified in unicellular or filamentous species. However, despite the extreme conditions experienced by extremophile species, we found no evidence for expansion of ROS-related gene families in these species compared to other Eukaryotes. Tardigrades and rotifers do show ROS gene expansions that could be related to their extreme lifestyles, although a high rate of lineage-specific horizontal gene transfer events, coupled with recent tetraploidy in rotifers, could explain this observation. This suggests that the basal Eukaryotic ROS scavenging systems are sufficient to maintain ROS homeostasis even under the most extreme conditions.
KW  - ROS
KW  - extremotolerance
KW  - resurrection plants
Y1  - 2020
U6  - https://doi.org/10.3390/ijms21239131
SN  - 1422-0067
VL  - 21
IS  - 23
PB  - Molecular Diversity Preservation International (MDPI)
CY  - Basel
ER  -