Background: Protein phosphorylation is an important post-translational modification influencing many aspects of dynamic cellular behavior. Site-specific phosphorylation of the amino acid residues serine, threonine, and tyrosine can have profound effects on protein structure, activity, stability, and interaction with other biomolecules. Phosphorylation sites can be affected in diverse ways in members of any species; one such way is through single nucleotide polymorphisms (SNPs). The availability of large numbers of experimentally identified phosphorylation sites and of natural variation datasets in Arabidopsis thaliana prompted us to analyze the effect of non-synonymous SNPs (nsSNPs) on phosphorylation sites.
Results: From the analyses of 7,178 experimentally identified phosphorylation sites we found that: (i) Proteins with multiple phosphorylation sites occur more often than expected by chance. (ii) Phosphorylation hotspots show a preference to be located outside conserved domains. (iii) nsSNPs affected experimental phosphorylation sites as much as the corresponding non-phosphorylated amino acid residues. (iv) Losses of experimental phosphorylation sites by nsSNPs were identified in 86 A. thaliana proteins, among them receptor proteins were overrepresented.
These results were confirmed by similar analyses of predicted phosphorylation sites in A. thaliana. In addition, predicted threonine phosphorylation sites showed a significant enrichment of nsSNPs towards asparagines and a significant depletion of the synonymous substitution. Proteins in which predicted phosphorylation sites were affected by nsSNPs (loss and gain), were determined to be mainly receptor proteins, stress response proteins and proteins involved in nucleotide and protein binding. Proteins involved in metabolism, catalytic activity and biosynthesis were less affected.
Conclusions: We analyzed more than 7,100 experimentally identified phosphorylation sites in almost 4,300 protein-coding loci in silico, thus constituting the largest phosphoproteomics dataset for A. thaliana available to date. Our findings suggest a relatively high variability in the presence or absence of phosphorylation sites between different natural accessions in receptor and other proteins involved in signal transduction. Elucidating the effect of phosphorylation sites affected by nsSNPs on adaptive responses represents an exciting research goal for the future.
The main objective of this study was to identify genomic regions involved in biomass heterosis using QTL, generation means, and mode-of-inheritance classification analyses. In a modified North Carolina Design III we backcrossed 429 recombinant inbred line and 140 introgression line populations to the two parental accessions, C24 and Col-0, whose F1 hybrid exhibited 44% heterosis for biomass. Mid-parent heterosis in the RILs ranged from −31 to 99% for dry weight and from −58 to 143% for leaf area. We detected ten genomic positions involved in biomass heterosis at an early developmental stage, individually explaining between 2.4 and 15.7% of the phenotypic variation. While overdominant gene action was prevalent in heterotic QTL, our results suggest that a combination of dominance, overdominance and epistasis is involved in biomass heterosis in this Arabidopsis cross.
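The percentages quoted above follow the usual textbook definition of mid-parent heterosis: the hybrid's deviation from the mean of its two parents, relative to that mean. The sketch below illustrates that definition only. The function name and the dry-weight values are hypothetical, and the study's own estimates come from its backcross design, not from this simple calculation.

```python
def mid_parent_heterosis(f1: float, parent_a: float, parent_b: float) -> float:
    """Return mid-parent heterosis in percent for a single trait value."""
    mid_parent = (parent_a + parent_b) / 2.0
    if mid_parent == 0:
        raise ValueError("mid-parent value must be non-zero")
    return 100.0 * (f1 - mid_parent) / mid_parent


# Hypothetical dry weights (arbitrary units), chosen only for illustration;
# they are not data from the study.
example_mph = mid_parent_heterosis(f1=14.4, parent_a=9.0, parent_b=11.0)
print(f"mid-parent heterosis: {example_mph:.1f}%")
```

A negative value means the hybrid performs below the parental mean, which is how ranges such as −31 to 99% arise.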
Prediction of hybrid biomass in Arabidopsis thaliana by selected parental SNP and metabolic markers
(2009)
A recombinant inbred line (RIL) population, derived from two Arabidopsis thaliana accessions, and the corresponding testcrosses with these two original accessions were used for the development and validation of machine learning models to predict the biomass of hybrids. Genetic and metabolic information of the RILs served as predictors. Feature selection reduced the number of variables (genetic and metabolic markers) in the models by more than 80% without impairing the predictive power. Thus, potential biomarkers have been revealed. Metabolites were shown to bear information on inherited macroscopic phenotypes. This proof of concept could be interesting for breeders. The example population exhibits substantial mid-parent biomass heterosis. The results of feature selection could therefore be used to shed light on the origin of heterosis. In this respect, mainly dominance effects were detected.
Developing and investigating detailed mathematical models of metabolic processes is one of the primary challenges in systems biology. However, despite considerable advances in the topological analysis of metabolic networks, kinetic modeling is still often severely hampered by inadequate knowledge of the enzyme-kinetic rate laws and their associated parameter values. Here we propose a method that aims to give a quantitative account of the dynamical capabilities of a metabolic system, without requiring any explicit information about the functional form of the rate equations. Our approach is based on constructing a local linear model at each point in parameter space, such that each element of the model is either directly experimentally accessible or amenable to a straightforward biochemical interpretation. This ensemble of local linear models, encompassing all possible explicit kinetic models, then allows for a statistical exploration of the comprehensive parameter space. The method is exemplified on two paradigmatic metabolic systems: the glycolytic pathway of yeast and a realistic-scale representation of the photosynthetic Calvin cycle.
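The idea of statistically exploring an ensemble of local linear models can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a hypothetical two-metabolite pathway with steady-state concentrations and fluxes normalized to 1, in which S2 activates the influx of S1. The elasticities a, b and c are sampled uniformly from (0, 1], and each sampled Jacobian is tested for local stability.

```python
import random


def jacobian(a: float, b: float, c: float):
    """Jacobian of the toy pathway at its normalized steady state.

    dS1/dt = v1(S2) - v2(S1),  dS2/dt = v2(S1) - v3(S2)
    a = d ln v2 / d ln S1, b = d ln v1 / d ln S2, c = d ln v3 / d ln S2.
    The pathway and its parametrization are assumptions for illustration.
    """
    return ((-a, b), (a, -c))


def is_stable(j) -> bool:
    """A real 2x2 matrix has only eigenvalues with negative real part
    exactly when its trace is negative and its determinant is positive."""
    trace = j[0][0] + j[1][1]
    det = j[0][0] * j[1][1] - j[0][1] * j[1][0]
    return trace < 0 and det > 0


def stable_fraction(n_samples: int, seed: int = 0) -> float:
    """Fraction of randomly sampled local linear models that are stable."""
    rng = random.Random(seed)
    stable = 0
    for _ in range(n_samples):
        # 1 - random() lies in (0, 1], so elasticities are strictly positive.
        a, b, c = (1.0 - rng.random() for _ in range(3))
        stable += is_stable(jacobian(a, b, c))
    return stable / n_samples


print(f"stable fraction: {stable_fraction(10_000):.2f}")
```

In this toy system the determinant equals a(c − b), so the steady state is stable exactly when the degradation elasticity c exceeds the feedback elasticity b. The sampled fraction makes that dependence visible without any explicit rate law.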
Background: The biological interpretation of large-scale gene expression data is one of the paramount challenges in current bioinformatics. In particular, placing the results in the context of other available functional genomics data, such as existing bio-ontologies, has already provided substantial improvement for detecting and categorizing genes of interest. One common approach is to look for functional annotations that are significantly enriched within a group or cluster of genes, as compared to a reference group. Results: In this work, we suggest the information-theoretic concept of mutual information to investigate the relationship between groups of genes, as given by data-driven clustering, and their respective functional categories. Drawing upon related approaches (Gibbons and Roth, Genome Research 12: 1574-1581, 2002), we seek to quantify to what extent individual attributes are sufficient to characterize a given group or cluster of genes. Conclusion: We show that the mutual information provides a systematic framework to assess the relationship between groups or clusters of genes and their functional annotations in a quantitative way. Within this framework, the mutual information allows us to address and incorporate several important issues, such as the interdependence of functional annotations and combinations of attributes. It thus supplements and extends the conventional search for overrepresented attributes within a group or cluster of genes. In particular, when combinations of attributes are taken into account, the mutual information opens the way to uncover specific functional descriptions of a group of genes or clustering result. All datasets and functional annotations used in this study are publicly available. All scripts used in the analysis are provided as additional files.
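As a minimal sketch of the quantity involved, the mutual information between a cluster assignment and a single functional attribute can be computed from co-occurrence counts. The function name and the toy gene labels below are hypothetical. The published analysis also handles combinations of attributes, which this example does not attempt.

```python
from collections import Counter
from math import log2


def mutual_information(clusters, annotations) -> float:
    """Mutual information (in bits) between two equally long label sequences."""
    n = len(clusters)
    if n == 0 or n != len(annotations):
        raise ValueError("inputs must be non-empty and of equal length")
    count_c = Counter(clusters)
    count_a = Counter(annotations)
    count_joint = Counter(zip(clusters, annotations))
    mi = 0.0
    for (c, a), joint in count_joint.items():
        # p(c,a) * log2( p(c,a) / (p(c) * p(a)) ), written with raw counts.
        mi += (joint / n) * log2(joint * n / (count_c[c] * count_a[a]))
    return mi


# Toy data: eight genes in two clusters (illustrative only).
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [True, True, True, True, False, False, False, False]
uninformative = [True, True, False, False, True, True, False, False]

print(mutual_information(clusters, informative))    # attribute tracks the clustering
print(mutual_information(clusters, uninformative))  # attribute independent of it
```

An attribute that perfectly separates two equally sized clusters carries one bit of information about cluster membership, while an attribute distributed identically across clusters carries none.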
Aims/hypothesis: Polycystic ovary syndrome (PCOS) is a risk factor for type 2 diabetes. Screening for impaired glucose metabolism (IGM) with an oral glucose tolerance test (OGTT) has been recommended, but this is relatively time-consuming and inconvenient. Thus, a strategy that could minimise the need for an OGTT would be beneficial. Materials and methods: Consecutive PCOS patients (n=118) with fasting glucose < 6.1 mmol/l were included in the study. Parameters derived from medical history, clinical examination and fasting blood samples were assessed by decision tree modelling for their ability to discriminate women with IGM (2-h OGTT value >= 7.8 mmol/l) from those with normal glucose tolerance (NGT). Results: According to the OGTT results, 93 PCOS women had NGT and 25 had IGM. The best decision tree consisted of HOMA-IR, the proinsulin:insulin ratio, proinsulin, 17-OH progesterone and the ratio of luteinising hormone:follicle-stimulating hormone. This tree identified 69 women with NGT. The remaining 49 women included all women with IGM (100% sensitivity, 74% specificity to detect IGM). Pruning this tree to three levels still identified 53 women with NGT (100% sensitivity, 57% specificity to detect IGM). Restricting the data matrix used for tree modelling to medical history and clinical parameters produced a tree using BMI, waist circumference and WHR. Pruning this tree to two levels separated 27 women with NGT (100% sensitivity, 29% specificity to detect IGM). The validity of both trees was tested by a leave-10%-out cross-validation. Conclusions/interpretation: Decision trees are useful tools for separating PCOS women with NGT from those with IGM. They can be used for stratifying the metabolic screening of PCOS women, whereby the number of OGTTs can be markedly reduced.
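The sensitivity and specificity figures above can be reproduced from the reported counts alone. The short sketch below uses a hypothetical helper function and takes its numbers from the abstract (25 women with IGM, 93 with NGT). It assumes, as the abstract states, that no woman with IGM was placed in the NGT group.

```python
def sensitivity_specificity(n_igm: int, n_ngt: int,
                            ngt_identified: int, igm_missed: int = 0):
    """Return (sensitivity, specificity) for detecting IGM.

    ngt_identified: women with NGT correctly separated out (true negatives).
    igm_missed: women with IGM wrongly placed in the NGT group (false negatives).
    """
    sensitivity = (n_igm - igm_missed) / n_igm
    specificity = ngt_identified / n_ngt
    return sensitivity, specificity


# Counts of NGT women identified by each tree, as reported in the abstract.
trees = {
    "full tree": 69,
    "pruned to three levels": 53,
    "clinical parameters, two levels": 27,
}
for name, ngt_identified in trees.items():
    sens, spec = sensitivity_specificity(25, 93, ngt_identified)
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Each correctly identified NGT woman is one OGTT that could be avoided, so the specificity directly measures how much screening effort a given tree saves.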
Data integration has become a useful strategy for uncovering new insights into complex biological networks. We studied whether this approach can help to delineate the signal transducer and activator of transcription 6 (STAT6)-mediated transcriptional network driving T helper (Th) 2 cell fate decisions. To this end, we performed an integrative analysis of publicly available RNA-seq data of Stat6-knockout mouse studies together with STAT6 ChIP-seq data and our own gene expression time series data during Th2 cell differentiation. We focused on transcription factors (TFs), cytokines, and cytokine receptors and delineated 59 positively and 41 negatively STAT6-regulated genes, which were used to construct a transcriptional network around STAT6. The network illustrates that important and well-known TFs for Th2 cell differentiation are positively regulated by STAT6 and act either as activators for Th2 cells (e.g., Gata3, Atf3, Satb1, Nfil3, Maf, and Pparg) or as suppressors for other Th cell subpopulations such as Th1 (e.g., Ar), Th17 (e.g., Etv6), or iTreg (e.g., Stat3 and Hif1a) cells. Moreover, our approach reveals 11 TFs (e.g., Atf5, Creb3l2, and Asb2) with unknown functions in Th cell differentiation. This fact, together with the observed enrichment of asthma risk genes among those regulated by STAT6, underlines the potential value of the data integration strategy used here. Thus, our results clearly support the opinion that data integration is a useful tool to delineate complex physiological processes.
analysis
(2016)
The development of ‘omics’ technologies has progressed to address complex biological questions that underlie various plant functions thereby producing copious amounts of data. The need to assimilate large amounts of data into biologically meaningful interpretations has necessitated the development of statistical methods to integrate multidimensional information. Throughout this review, we provide examples of recent outcomes of ‘omics’ data integration together with an overview of available statistical methods and tools.
Phenomic experiments are carried out in large-scale plant phenotyping facilities that acquire a large number of pictures of hundreds of plants simultaneously. With the aid of automated image processing, the data are converted into genotype-feature matrices that cover many consecutive days of development. Here, we explore the possibility of predicting the biomass of the fully grown plant from early developmental stage image-derived features. We performed phenomic experiments on 195 inbred and 382 hybrid maize varieties and followed their progress from 16 days after sowing (DAS) to 48 DAS with 129 image-derived features. By applying sparse regression methods, we show that 73% of the variance in hybrid fresh weight of fully grown plants is explained by about 20 features at the three-leaf stage or earlier. Dry weight prediction explained over 90% of the variance. When phenomic features of parental inbred lines were used as predictors of hybrid biomass, the proportion of variance explained was 42 and 45% for fresh weight and dry weight models consisting of 35 and 36 features, respectively. These models were very robust, showing only a small amount of variation in performance over the time scale of the experiment. We also examined mid-parent heterosis in phenomic features. Feature heterosis displayed a large degree of variance, which resulted in prediction performance that was less robust than models of either parental or hybrid predictors. Our results show that phenomic prediction is a viable alternative to genomic and metabolic prediction of hybrid performance. In particular, the utility of early-stage parental lines is very encouraging. (C) 2016 Elsevier Ireland Ltd. All rights reserved.
The adaptive response of skeletal muscle to exercise training is tightly controlled and therefore requires transcriptional regulation. DNA methylation is an epigenetic mechanism known to modulate gene expression, but its contribution to exercise-induced adaptations in skeletal muscle is not well studied. Here, we describe a genome-wide analysis of DNA methylation in muscle of trained mice (n = 3). Compared with sedentary controls, 2,762 genes exhibited differentially methylated CpGs (P < 0.05, meth diff > 5%, coverage > 10) in their putative promoter regions. Alignment with gene expression data (n = 6) revealed 200 genes with a negative correlation between methylation and expression changes in response to exercise training. The majority of these genes were related to muscle growth and differentiation, and a minor fraction was involved in metabolic regulation. Among the candidates were genes that regulate the expression of myogenic regulatory factors (Plexin A2) as well as genes that participate in muscle hypertrophy (Igfbp4) and motor neuron innervation (Dok7). Interestingly, a transcription factor binding site enrichment study discovered a significantly enriched occurrence of CpG methylation in the binding sites of the myogenic regulatory factors MyoD and myogenin. These findings suggest that DNA methylation is involved in the regulation of muscle adaptation to regular exercise training.