570 Life sciences; biology
Maize (Zea mays L.) is a staple food whose production relies on seed stocks that largely comprise hybrid varieties. Therefore, knowledge about the molecular determinants of hybrid performance (HP) in the field can be used to devise better-performing hybrids that meet the demand for a sustainable increase in yield. Here, we propose and test a classification-driven framework that uses metabolic profiles from in vitro grown young roots of parental lines from the Dent x Flint maize heterotic pattern to predict field HP. We identify parental analytes that best predict the metabolic inheritance patterns in 328 hybrids. We then demonstrate that these analytes are also predictive of field HP (0.64 <= r <= 0.79) and discriminate hybrids of good performance (accuracy of 87.50%). Therefore, our approach provides a cost-effective solution for hybrid selection programs.
Background: Hybrids represent a cornerstone in the success story of breeding programs. The fundamental principle underlying this success is the phenomenon of hybrid vigour, or heterosis. It describes an advantage of the offspring as compared to the two parental lines with respect to parameters such as growth and resistance against abiotic or biotic stress. Dominance-, overdominance- or epistasis-based models are commonly used explanations. Conclusion/Significance: The heterosis level is clearly a function of the combination of the parents used for offspring production. This results in a major challenge for plant breeders, as usually several thousand combinations of parents have to be tested to identify the best combinations. Thus, any approach to reliably predict heterosis levels based on properties of the parental lines would be highly beneficial for plant breeding. Methodology/Principal Findings: Recently, genetic data have been used to predict heterosis. Here we show that a combination of parental genetic and metabolic markers, identified via feature selection and minimum-description-length-based regression methods, significantly improves the prediction of biomass heterosis in the resulting offspring. These findings will help to further our understanding of the molecular basis of heterosis, revealing, for instance, the presence of nonlinear genotype-phenotype relationships. In addition, we describe a possible approach for accelerated selection in plant breeding.
Prediction of hybrid biomass in Arabidopsis thaliana by selected parental SNP and metabolic markers
(2009)
A recombinant inbred line (RIL) population, derived from two Arabidopsis thaliana accessions, and the corresponding testcrosses with these two original accessions were used for the development and validation of machine learning models to predict the biomass of hybrids. Genetic and metabolic information of the RILs served as predictors. Feature selection reduced the number of variables (genetic and metabolic markers) in the models by more than 80% without impairing the predictive power. Thus, potential biomarkers have been revealed. Metabolites were shown to bear information on inherited macroscopic phenotypes. This proof of concept could be interesting for breeders. The example population exhibits substantial mid-parent biomass heterosis. The results of feature selection could therefore be used to shed light on the origin of heterosis. In this respect, mainly dominance effects were detected.
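The abstract above does not say which feature-selection method was used. As a rough illustration of the general idea only, the following sketch ranks parental markers by the absolute Pearson correlation of their values with hybrid biomass and keeps a fixed fraction of them. The correlation filter, the function names `pearson` and `select_markers`, and the default `keep_fraction=0.2` (chosen merely to mirror a reduction of about 80%) are all assumptions made for this example, not the study's procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_markers(markers, biomass, keep_fraction=0.2):
    """Keep the markers most strongly correlated with hybrid biomass.

    markers: dict mapping a (hypothetical) marker name to its values per line.
    biomass: hybrid biomass values, in the same line order.
    """
    scores = {name: abs(pearson(values, biomass)) for name, values in markers.items()}
    n_keep = max(1, round(keep_fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep]
```

If such a filter is combined with a predictive model, the selection step should be repeated inside each cross-validation fold; selecting markers on the full data set first would bias the estimate of predictive power.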
The main objective of this study was to identify genomic regions involved in biomass heterosis using QTL, generation means, and mode-of-inheritance classification analyses. In a modified North Carolina Design III we backcrossed 429 recombinant inbred line and 140 introgression line populations to the two parental accessions, C24 and Col-0, whose F1 hybrid exhibited 44% heterosis for biomass. Mid-parent heterosis in the RILs ranged from −31 to 99% for dry weight and from −58 to 143% for leaf area. We detected ten genomic positions involved in biomass heterosis at an early developmental stage, individually explaining between 2.4 and 15.7% of the phenotypic variation. While overdominant gene action was prevalent in heterotic QTL, our results suggest that a combination of dominance, overdominance and epistasis is involved in biomass heterosis in this Arabidopsis cross.
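Mid-parent heterosis, reported above in percent, is conventionally computed as the hybrid's deviation from the mean of its two parents, relative to that mean. A minimal sketch, assuming that standard definition; the function name and the example values are illustrative and not taken from the study:

```python
def mid_parent_heterosis(hybrid, parent_a, parent_b):
    """Return mid-parent heterosis in percent.

    MPH = 100 * (hybrid - mid_parent) / mid_parent,
    where mid_parent is the mean of the two parental trait values.
    """
    mid_parent = (parent_a + parent_b) / 2.0
    if mid_parent == 0:
        raise ValueError("mid-parent value must be non-zero")
    return 100.0 * (hybrid - mid_parent) / mid_parent


# Hypothetical dry weights (arbitrary units): parents 8 and 12, hybrid 15.
example = mid_parent_heterosis(15.0, 8.0, 12.0)
```

A negative value means the hybrid performs below the parental mean, which is how the negative ends of the ranges quoted above arise.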
Background: Biological systems adapt to changing environments by reorganizing their cellular and physiological program, with metabolites representing one important response level. Different stresses lead to both conserved and specific responses on the metabolite level, which should be reflected in the underlying metabolic network. Methodology/Principal Findings: Starting from experimental data obtained by a GC-MS based high-throughput metabolic profiling technology, we here develop an approach that: (1) extracts network representations from metabolic condition-dependent data by using pairwise correlations, (2) determines the sets of stable and condition-dependent correlations based on a combination of statistical significance and homogeneity tests, and (3) can identify metabolites related to the stress response, which goes beyond simple observations about the changes of metabolic concentrations. The approach was tested with Escherichia coli as a model organism observed under four different environmental stress conditions (cold stress, heat stress, oxidative stress, lactose diauxie) and control unperturbed conditions. By constructing the stable network component, which displays a scale-free topology and small-world characteristics, we demonstrated that: (1) metabolite hubs in this reconstructed correlation network are significantly enriched for those contained in biochemical networks such as EcoCyc, (2) particular components of the stable network are enriched for functionally related biochemical pathways, and (3) independently of the response scale, based on their importance in the reorganization of the correlation network, a set of metabolites can be identified which represent hypothetical candidates for adjusting to a stress-specific response. Conclusions/Significance: Network-based tools allowed the identification of stress-dependent and general metabolic correlation networks. This correlation-network-based approach does not rely on major changes in concentration to identify metabolites important for stress adaptation, but rather on the changes in network properties with respect to metabolites. This should represent a useful complementary technique in addition to more classical approaches.
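The first two steps of the approach described above (building a correlation network per condition, then separating stable from condition-dependent correlations) can be sketched roughly as follows. This is a simplified illustration, not the published method: it replaces the statistical significance and homogeneity tests with a plain cutoff on the absolute Pearson correlation, and the names `correlation_edges`, `split_stable` and the default `threshold=0.8` are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_edges(profiles, threshold=0.8):
    """Return metabolite pairs whose |r| reaches the (assumed) threshold.

    profiles: dict mapping metabolite name to its levels across samples
    measured under one condition.
    """
    edges = set()
    for a, b in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            edges.add((a, b))
    return edges


def split_stable(edges_by_condition):
    """Split edges into those present in every condition and all others."""
    edge_sets = list(edges_by_condition.values())
    stable = set.intersection(*edge_sets)
    condition_dependent = set.union(*edge_sets) - stable
    return stable, condition_dependent
```

Under this simplification, an edge counts as stable only if it passes the cutoff in every condition. A hard threshold ignores sample size and sampling noise, which is the gap that the significance and homogeneity tests named in the abstract are meant to close.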
Maize is the cereal crop with the highest production worldwide, and its oil is a key energy resource. Improving the quantity and quality of maize oil requires a better understanding of lipid metabolism. To predict the function of maize genes involved in lipid biosynthesis, we assembled transcriptomic and lipidomic data sets from leaves of B73 and the high-oil line By804 in two distinct time-series experiments. The integrative analysis based on high-dimensional regularized regression yielded lipid-transcript associations indirectly validated by Gene Ontology and promoter motif enrichment analyses. The co-localization of lipid-transcript associations using the genetic mapping of lipid traits in leaves and seedlings of a B73 x By804 recombinant inbred line population uncovered 323 genes involved in the metabolism of phospholipids, galactolipids, sulfolipids and glycerolipids. The resulting association network further supported the involvement of 50 gene candidates in modulating levels of representatives from multiple acyl-lipid classes. Therefore, the proposed approach provides high-confidence candidates for experimental testing in maize and model plant species.