Interaction-based feature selection algorithm outperforms polygenic risk score in predicting Parkinson’s Disease status

Polygenic risk scores (PRS), which aggregate results from genome-wide association studies, are the state of the art in predicting susceptibility to complex traits or diseases. Their predictive performance is nevertheless limited for various reasons, not least their failure to incorporate the effects of gene-gene interactions. Novel machine learning algorithms that use large amounts of data promise to find gene-gene interactions and so build models with better predictive performance than PRS. Here, we present a data preprocessing step that uses data mining of contextual information to reduce the number of features, enabling machine learning algorithms to identify gene-gene interactions. We applied our approach to the Parkinson's Progression Markers Initiative (PPMI) dataset, an observational clinical study of 471 genotyped subjects (368 cases and 152 controls). With an AUC of 0.85 (95% CI = [0.72; 0.96]), the interaction-based prediction model outperforms the PRS (AUC of 0.58 (95% CI = [0.42; 0.81])). Furthermore, feature importance analysis of the model provided insights into the mechanism of Parkinson's disease. For instance, the model revealed an interaction of previously described drug target candidate genes TMEM175 and GAPDHP25. These results demonstrate that interaction-based machine learning models can improve genetic prediction models and might provide an answer to the missing heritability problem.

Metadata
Author details: Justin L. Cope, Hannes A. Baukmann, Jörn E. Klinger, Charles N. J. Ravarani, Erwin Böttinger, Stefan Konigorski, Marco F. Schmidt
DOI: https://doi.org/10.3389/fgene.2021.744557
ISSN: 1664-8021
Pubmed ID: https://pubmed.ncbi.nlm.nih.gov/34745218
Title of parent work (English): Frontiers in genetics
Publisher: Frontiers Media
Place of publishing: Lausanne
Publication type: Article
Language: English
Date of first publication: 2021/10/20
Publication year: 2021
Release date: 2023/04/03
Tag: PPMI (Parkinson's Progression Markers Initiative); epistasis; feature selection; machine learning; Parkinson's disease
Volume: 12
Article number: 744557
Number of pages: 9
Funding institution: Investitionsbank des Landes Brandenburg (ILB)
Organizational units: Affiliated Institutes / Hasso-Plattner-Institut für Digital Engineering gGmbH
DDC classification: 5 Natural sciences and mathematics / 57 Life sciences; biology / 570 Life sciences; biology
Peer review: Refereed
Publishing method: Open Access / Gold Open Access
Listed in DOAJ
License: CC-BY - Attribution 4.0 International