Phe2vec

Robust phenotyping of patients from electronic health records (EHRs) at scale is a challenge in clinical informatics. Here, we introduce Phe2vec, an automated framework for disease phenotyping from EHRs based on unsupervised learning, and assess its effectiveness against standard rule-based algorithms from the Phenotype KnowledgeBase (PheKB). Phe2vec is based on pre-computing embeddings of medical concepts and patients' clinical history. Disease phenotypes are then derived from a seed concept and its neighbors in the embedding space. Patients are linked to a disease if their embedded representation is close to the disease phenotype. Comparing Phe2vec and PheKB cohorts head-to-head using chart review, Phe2vec performed on par or better in nine out of ten diseases. Unlike other approaches, it can scale to any condition and was validated against widely adopted expert-based standards. Phe2vec aims to optimize clinical informatics research by augmenting current frameworks to characterize patients by condition and derive reliable disease cohorts.
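
The two steps described in the abstract (deriving a phenotype from a seed concept and its neighbors, then linking patients whose embedding lies close to it) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the embedding model, the similarity measure, how the phenotype is aggregated, or any cutoffs. Cosine similarity, mean aggregation, the neighbor count `k`, the `threshold` value, the function names, and all toy vectors below are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Element-wise mean of a list of vectors (assumed aggregation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_phenotype(seed, concept_vecs, k=2):
    """Return the seed concept plus its k nearest neighbors in embedding space."""
    others = [c for c in concept_vecs if c != seed]
    others.sort(
        key=lambda c: cosine(concept_vecs[seed], concept_vecs[c]),
        reverse=True,
    )
    return [seed] + others[:k]


def link_patients(phenotype, concept_vecs, patient_vecs, threshold=0.8):
    """Return patients whose embedding is close to the phenotype vector."""
    phenotype_vec = mean_vector([concept_vecs[c] for c in phenotype])
    return sorted(
        patient
        for patient, vec in patient_vecs.items()
        if cosine(vec, phenotype_vec) >= threshold
    )


# Toy, hand-made embeddings (hypothetical values, not taken from the paper).
concept_vecs = {
    "type_2_diabetes": [0.9, 0.1, 0.0],
    "metformin": [0.8, 0.2, 0.1],
    "hba1c_test": [0.7, 0.3, 0.0],
    "atrial_fibrillation": [0.0, 0.1, 0.9],
    "warfarin": [0.1, 0.0, 0.8],
}
patient_vecs = {
    "patient_a": [0.85, 0.2, 0.05],
    "patient_b": [0.05, 0.1, 0.9],
}

phenotype = build_phenotype("type_2_diabetes", concept_vecs, k=2)
cohort = link_patients(phenotype, concept_vecs, patient_vecs)
print(phenotype)
print(cohort)
```

In this toy setting the phenotype gathers the concepts nearest to the seed, and only the patient whose vector points in a similar direction joins the cohort. In practice the embeddings would be learned from EHR data, and the neighbor count and distance cutoff would need to be tuned and validated.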

Metadata
Author details: Jessica K. De Freitas, Kipp W. Johnson, Eddye Golden, Girish N. Nadkarni, Joel T. Dudley, Erwin Böttinger, Benjamin S. Glicksberg, Riccardo Miotto
DOI: https://doi.org/10.1016/j.patter.2021.100337
ISSN: 2666-3899
Pubmed ID: https://pubmed.ncbi.nlm.nih.gov/34553174
Title of parent work (English): Patterns
Subtitle (English): Automated disease phenotyping based on unsupervised embeddings from electronic health records
Publisher: Elsevier
Place of publishing: Amsterdam
Publication type: Article
Language: English
Date of first publication: 2021/09/10
Publication year: 2021
Release date: 2024/01/12
Volume: 2
Issue: 9
Article number: 100337
Number of pages: 9
Funding institution: Hasso Plattner Foundation; Alzheimer's Drug Discovery Foundation; National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), United States Department of Health & Human Services [U54 TR001433-05]
Organizational units: Affiliated institutes / Hasso-Plattner-Institut für Digital Engineering gGmbH
DDC classification: 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Peer review: Peer-reviewed
Publishing method: Open Access / Gold Open Access (listed in DOAJ)
License: CC-BY-NC-ND - Attribution, NonCommercial, NoDerivatives 4.0 International