Background: For heterogeneous tissues, such as blood, measurements of gene expression are confounded by the relative proportions of the cell types involved. Conclusions have to rely on estimation of gene expression signals for homogeneous cell populations, e.g. by applying micro-dissection, fluorescence-activated cell sorting, or in silico deconfounding. We studied the feasibility and validity of a non-negative matrix decomposition algorithm using experimental gene expression data for blood and sorted cells from the same donor samples. Our objective was to optimize the algorithm regarding detection of differentially expressed genes and to enable its use for classification in the difficult scenario of reversely regulated genes. This would be of importance for the identification of candidate biomarkers in heterogeneous tissues.
Results: Experimental data and simulation studies involving noise parameters estimated from these data revealed that for valid detection of differential gene expression, quantile normalization and use of non-log data are optimal. We demonstrate the feasibility of predicting proportions of constituent cell types from gene expression data of single samples, as a prerequisite for a deconfounding-based classification approach. Classification cross-validation errors with and without using deconfounding results are reported, as well as sample-size dependencies. An implementation of the algorithm and the simulation and analysis scripts are available.
Conclusions: The deconfounding algorithm without decorrelation using quantile normalization on non-log data is proposed for biomarkers that are difficult to detect, and for cases where confounding by varying proportions of cell types is the suspected reason. In this case, a deconfounding ranking approach can be used as a powerful alternative to, or complement of, other statistical learning approaches to define candidate biomarkers for molecular diagnosis and prediction in biomedicine, in realistically noisy conditions and with moderate sample sizes.
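The two ingredients named in this abstract, quantile normalization of non-log data and prediction of cell-type proportions for a single sample, can be illustrated with a minimal sketch. It is not the authors' non-negative matrix decomposition algorithm: the function names, the toy values, and the simplification to two cell types with known expression signatures are all assumptions made for illustration.

```python
def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Values are assumed to be on the original (non-log) scale.
    Ties are broken by row order rather than averaged (a simplification).
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    # Sort each sample's values independently.
    sorted_cols = [sorted(row[j] for row in matrix) for j in range(n_samples)]
    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(col[r] for col in sorted_cols) / n_samples for r in range(n_genes)
    ]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j in range(n_samples):
        # Gene indices ordered by their value in sample j.
        order = sorted(range(n_genes), key=lambda i: matrix[i][j])
        for rank, i in enumerate(order):
            result[i][j] = reference[rank]
    return result


def estimate_proportion(mixed, sig_a, sig_b):
    """Least-squares estimate of the proportion p of cell type A, assuming
    mixed = p * sig_a + (1 - p) * sig_b (hypothetical two-cell-type model).
    The estimate is clipped to the interval [0, 1].
    """
    diff = [a - b for a, b in zip(sig_a, sig_b)]
    num = sum((m - b) * d for m, b, d in zip(mixed, sig_b, diff))
    den = sum(d * d for d in diff)
    if den == 0:
        raise ValueError("signatures are identical; proportion is undefined")
    return min(1.0, max(0.0, num / den))


# Toy data (invented): 4 genes x 3 samples, non-log scale.
expression = [[5, 4, 3], [2, 1, 4], [3, 4, 6], [4, 2, 8]]
normalized = quantile_normalize(expression)

# Toy signatures (invented) and a sample mixed at 25% type A, 75% type B.
signature_a = [10.0, 2.0, 6.0]
signature_b = [2.0, 8.0, 6.0]
mixed_sample = [4.0, 6.5, 6.0]
proportion_a = estimate_proportion(mixed_sample, signature_a, signature_b)
```

After normalization every sample shares the same value distribution, so differences between samples reflect rank changes of genes rather than sample-wide intensity shifts. The proportion estimate works on the linear scale because a mixture of cell types is additive in non-log expression values.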
Throughout their lifetime, plants need to adapt to temperature changes. Plants adapt to nonfreezing cold temperatures in a process called cold priming (cold acclimation) and lose the acquired freezing tolerance during warmer temperatures through deacclimation. The alternation of both processes is essential for plants to achieve optimal fitness in response to different temperature conditions. Cold acclimation has been extensively studied; however, little is known about the regulation of deacclimation. This thesis elucidates the process of deacclimation on a physiological and molecular level in Arabidopsis thaliana. Electrolyte leakage measurements during cold acclimation and up to four days of deacclimation enabled the identification of four knockout mutants (hra1, lbd41, mbf1c and jub1) with a slower rate of deacclimation compared to the wild type. A transcriptomic study using RNA sequencing in A. thaliana Col-0, jub1 and mbf1c identified the importance of the inhibition of stress-responsive and Jasmonate-ZIM-domain genes as well as the regulation of cell wall modifications during deacclimation. Moreover, measurements of alcohol dehydrogenase activity and gene expression changes of hypoxia markers during the first four days of deacclimation showed that a hypoxia response is activated during deacclimation. Epigenetic regulation was observed to be extensively involved during cold acclimation and 24 h of deacclimation in A. thaliana. Further, both deacclimation studies indicated that the earlier hypothesis of a role for heat stress in early deacclimation is unlikely. A number of DNA and histone demethylases as well as histone variants were upregulated during deacclimation, suggesting a role in plant memory. Recently, multiple studies have shown that plants are able to retain memory of a previous cold stress even after a week of deacclimation.
In this work, transcriptomic and metabolomic analyses of Arabidopsis during 24 h of priming (cold acclimation) and triggering (recurring cold stress after deacclimation) revealed a uniquely significant and transient induction of the DREB1D, DREB1E and DREB1F transcription factors during triggering, contributing to fine-tuning of the second cold stress response. Furthermore, genes encoding Late Embryogenesis Abundant (LEA) proteins, antifreeze proteins and proteins detoxifying reactive oxygen species were more strongly induced during late triggering (24 h) than in primed samples, while cell wall remodelers of the class xyloglucan endotransglucosylase/hydrolase were early responders of triggering. The high induction of cell wall remodelers during deacclimation as well as triggering suggests that these proteins play an essential role in the stabilization of the cells during growth as well as in the response to recurring stresses. Collectively, this work gives new insights into the regulation of deacclimation and cold stress memory in A. thaliana and opens the door to future targeted studies of essential genes in this process.
Due to global climate change, providing food security for an increasing world population is a big challenge. Abiotic stressors in particular have a strong negative effect on crop yield. To develop climate-adapted crops, a comprehensive understanding of molecular alterations in response to varying levels of environmental stress is required. High-throughput or ‘omics’ technologies can help to identify key regulators and pathways of abiotic stress responses. In addition to obtaining omics data, tools and statistical analyses also need to be designed and evaluated to get reliable biological results.
To address these issues, I have conducted three different studies covering two omics technologies. In the first study, I used transcriptomic data from two polymorphic Arabidopsis thaliana accessions, Col-0 and N14, to evaluate seven computational tools for their ability to map and quantify Illumina single-end reads. Between 92% and 99% of the reads were mapped against the reference sequence. The raw count distributions obtained from the different tools were highly correlated. In a differential gene expression analysis between plants exposed to 20 °C or 4 °C (cold acclimation), a large pairwise overlap between the mappers was obtained. In the second study, I obtained transcript data from ten different Oryza sativa (rice) cultivars by PacBio Isoform sequencing, which can capture full-length transcripts. De novo reference transcriptomes were reconstructed, resulting in 38,900 to 54,500 high-quality isoforms per cultivar. Isoforms were collapsed to reduce sequence redundancy and evaluated, e.g. for protein completeness level (BUSCO), transcript length, and number of unique transcripts per gene locus. For the heat- and drought-tolerant aus cultivar N22, I identified around 650 unique and novel transcripts, of which 56 were significantly differentially expressed in developing seeds during combined drought and heat stress. In the last study, I measured and analyzed the changes in metabolite profiles of eight rice cultivars exposed to high night temperature (HNT) stress and grown during the dry and wet season in the field in the Philippines. Season-specific changes in metabolite levels, as well as in agronomic parameters, were identified, and metabolic pathways causing a yield decline under HNT conditions were suggested.
In conclusion, the comparison of mapper performances can help plant scientists to decide on the right tool for their data. The de novo transcriptome reconstruction for rice cultivars without a genome sequence provides a targeted, cost-efficient approach to identify novel genes responding to stress conditions, applicable to any organism. With the metabolomics approach for HNT stress in rice, I identified stress- and season-specific metabolites which might be used as molecular markers for crop improvement in the future.
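The pairwise overlap between mappers reported for the first study can be illustrated with a short sketch. The abstract does not state which overlap measure was used, so the Jaccard index below is an assumption, and the mapper names and gene identifiers are invented placeholders.

```python
from itertools import combinations


def pairwise_overlap(de_sets):
    """Compare sets of differentially expressed genes between tools.

    de_sets maps a tool name to the set of gene IDs it reports as
    differentially expressed. Returns a dict keyed by tool pairs with
    (number of shared genes, Jaccard index) as values.
    """
    overlaps = {}
    for tool_a, tool_b in combinations(sorted(de_sets), 2):
        shared = de_sets[tool_a] & de_sets[tool_b]
        union = de_sets[tool_a] | de_sets[tool_b]
        # Jaccard index: shared genes relative to all genes in either set.
        jaccard = len(shared) / len(union) if union else 0.0
        overlaps[(tool_a, tool_b)] = (len(shared), jaccard)
    return overlaps


# Hypothetical results from three mappers (placeholder names and gene IDs).
de_genes = {
    "mapper_a": {"gene1", "gene2", "gene3"},
    "mapper_b": {"gene2", "gene3", "gene4"},
    "mapper_c": {"gene1", "gene2", "gene3"},
}
overlap_table = pairwise_overlap(de_genes)
for pair, (n_shared, jaccard) in overlap_table.items():
    print(pair, n_shared, round(jaccard, 2))
```

A Jaccard index near 1 for every pair would indicate that the choice of mapper has little influence on which genes are called differentially expressed.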