To identify genetic variants associated with head circumference in infancy, we performed a meta-analysis of seven genome-wide association studies (GWAS) (N = 10,768 individuals of European ancestry enrolled in pregnancy and/or birth cohorts) and followed up three lead signals in six replication studies (combined N = 19,089). rs7980687 on chromosome 12q24 (P = 8.1 × 10⁻⁹) and rs1042725 on chromosome 12q15 (P = 2.8 × 10⁻¹⁰) were robustly associated with head circumference in infancy. Although these loci have previously been associated with adult height(1), their effects on infant head circumference were largely independent of height (P = 3.8 × 10⁻⁷ for rs7980687 and P = 1.3 × 10⁻⁷ for rs1042725 after adjustment for infant height). A third signal, rs11655470 on chromosome 17q21, showed suggestive evidence of association with head circumference (P = 3.9 × 10⁻⁶). SNPs correlated with the 17q21 signal have shown genome-wide association with adult intracranial volume(2), Parkinson's disease and other neurodegenerative diseases(3-5), indicating that a common genetic variant in this region might link early brain growth with neurological disease in later life.
Background:
COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking.
Objective:
The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points.
Methods:
We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19-positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients admitted to a single hospital on or before May 1 (n=1514), externally validated on patients admitted to the four other hospitals on or before May 1 (n=2201), and prospectively validated on all patients admitted after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions.
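The three-way cohort assignment described above (single-site training, external validation at the other sites, prospective validation after the cutoff date) can be sketched as follows. This is a minimal illustration, not the study's code: the `Admission` record, the site label `hospital_A`, and the function names are hypothetical, and the abstract does not say which hospital was used for training.

```python
from dataclasses import dataclass
from datetime import date

# Cutoff taken from the abstract: admissions on or before May 1, 2020
# feed training/external validation; later admissions are prospective.
CUTOFF = date(2020, 5, 1)
# Hypothetical label; the abstract does not name the training hospital.
TRAINING_SITE = "hospital_A"


@dataclass
class Admission:
    """Illustrative record; a real EHR extract would carry many more fields."""
    patient_id: str
    site: str
    admitted: date


def assign_cohort(adm, cutoff=CUTOFF, training_site=TRAINING_SITE):
    """Return 'train', 'external', or 'prospective' for one admission."""
    if adm.admitted > cutoff:
        return "prospective"  # every site, after the cutoff
    if adm.site == training_site:
        return "train"        # single training site, on or before the cutoff
    return "external"         # other sites, on or before the cutoff


def split_cohorts(admissions):
    """Group admissions into the three cohorts."""
    cohorts = {"train": [], "external": [], "prospective": []}
    for adm in admissions:
        cohorts[assign_cohort(adm)].append(adm)
    return cohorts
```

Assigning by admission date and site, rather than by random sampling, keeps the external and prospective sets disjoint from the training data in both place and time.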
Results:
Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction.
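For readers unfamiliar with the metric, AUC-ROC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition in plain Python. The function name and the toy labels and scores are illustrative assumptions, not data or code from the study, which would more likely have used a library routine.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = event, e.g. death within the window)
    scores: iterable of predicted risks, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(toy_labels, toy_scores))  # 0.75
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large cohorts.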
Conclusions:
We trained machine learning models to predict mortality and critical events for patients with COVID-19 at different time horizons and validated them both externally and prospectively. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.