Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: model development and validation

Vaid, Akhil; Somani, Sulaiman; Russak, Adam J.; De Freitas, Jessica K.; Chaudhry, Fayzan F.; Paranjpe, Ishan; Johnson, Kipp W.; Lee, Samuel J.; Miotto, Riccardo; Richter, Felix; Zhao, Shan; Beckmann, Noam D.; Naik, Nidhi; Kia, Arash; Timsina, Prem; Lala, Anuradha; Paranjpe, Manish; Golden, Eddye; Danieletto, Matteo; Singh, Manbir; Meyer, Dara; O'Reilly, Paul F.; Huckins, Laura; Kovatch, Patricia; Finkelstein, Joseph; Freeman, Robert M.; Argulian, Edgar; Kasarskis, Andrew; Percha, Bethany; Aberg, Judith A.; Bagiella, Emilia; Horowitz, Carol R.; Murphy, Barbara; Nestler, Eric J.; Schadt, Eric E.; Cho, Judy H.; Cordon-Cardo, Carlos; Fuster, Valentin; Charney, Dennis S.; Reich, David L.; Böttinger, Erwin; Levin, Matthew A.; Narula, Jagat; Fayad, Zahi A.; Just, Allan C.; Charney, Alexander W.; Nadkarni, Girish N.; Glicksberg, Benjamin S.

doi:10.2196/24018

Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking.
Objective: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points.
Methods: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19-positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions.
Results: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction.
Conclusions: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.
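The cohort design described in the abstract (one training hospital, four external hospitals, a May 1 cutoff for prospective validation, and outcome windows of 3, 5, 7, and 10 days) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Admission` record, the `site_A` label, the inclusive "within h days" rule, and the absence of any censoring logic are all assumptions made for illustration. The `auc_roc` helper is a generic pairwise definition of the metric, not the study's evaluation pipeline, and no XGBoost model is trained here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Sequence

HORIZONS_DAYS = (3, 5, 7, 10)   # time windows named in the abstract
CUTOFF = date(2020, 5, 1)       # "before or on May 1" versus "after May 1"
TRAINING_SITE = "site_A"        # hypothetical label for the single training hospital


@dataclass
class Admission:
    """Hypothetical minimal record; the real EHR schema is not given in the source."""
    hospital: str
    admitted: date
    days_to_event: Optional[int]  # days from admission to the outcome, None if it never occurred


def assign_split(adm: Admission) -> str:
    """Assign an admission to the train, external, or prospective set."""
    if adm.admitted > CUTOFF:
        return "prospective"      # all hospitals, admissions after the cutoff
    return "train" if adm.hospital == TRAINING_SITE else "external"


def horizon_labels(adm: Admission) -> Dict[int, int]:
    """Binary label per horizon: 1 if the event occurred within h days (assumed inclusive)."""
    return {
        h: int(adm.days_to_event is not None and adm.days_to_event <= h)
        for h in HORIZONS_DAYS
    }


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in cohort size, which is fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, one set of labels is produced per horizon, so a separate classifier could be fitted and scored for each window, which is one plausible reading of how the per-horizon AUC-ROC values in the Results would arise.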

Metadata
Author details:	Akhil Vaid, Sulaiman Somani, Adam J. Russak, Jessica K. De Freitas, Fayzan F. Chaudhry, Ishan Paranjpe, Kipp W. Johnson, Samuel J. Lee, Riccardo Miotto, Felix Richter, Shan Zhao, Noam D. Beckmann, Nidhi Naik, Arash Kia, Prem Timsina, Anuradha Lala, Manish Paranjpe, Eddye Golden, Matteo Danieletto, Manbir Singh, Dara Meyer, Paul F. O'Reilly, Laura Huckins, Patricia Kovatch, Joseph Finkelstein, Robert M. Freeman, Edgar Argulian, Andrew Kasarskis, Bethany Percha, Judith A. Aberg, Emilia Bagiella, Carol R. Horowitz, Barbara Murphy, Eric J. Nestler, Eric E. Schadt, Judy H. Cho, Carlos Cordon-Cardo, Valentin Fuster, Dennis S. Charney, David L. Reich, Erwin Böttinger, Matthew A. Levin, Jagat Narula, Zahi A. Fayad, Allan C. Just, Alexander W. Charney, Girish N. Nadkarni, Benjamin S. Glicksberg
DOI:	https://doi.org/10.2196/24018
ISSN:	1439-4456
ISSN:	1438-8871
Pubmed ID:	https://pubmed.ncbi.nlm.nih.gov/33027032
Title of parent work (English):	Journal of medical internet research : international scientific journal for medical research, information and communication on the internet ; JMIR
Publisher:	Healthcare World
Place of publication:	Richmond, Va.
Publication type:	Scientific article
Language:	English
Date of first publication:	2020-09-01
Year of publication:	2020
Release date:	2023-10-10
Keyword / Tag:	COVID-19; EHR; TRIPOD; clinical; cohort; electronic health record; hospital; informatics; machine learning; mortality; performance; prediction
Volume:	22
Issue:	11
Article number:	e24018
Number of pages:	19
Funding institution:	National Center for Advancing Translational Sciences, National Institutes of Health; United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; NIH National Center for Advancing Translational Sciences (NCATS) [U54 TR001433-05]
Organizational units:	Affiliated institutes / Hasso-Plattner-Institut für Digital Engineering gGmbH
DDC classification:	0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 000 Computer science, information science, general works
	6 Technology, medicine, applied sciences / 61 Medicine and health / 610 Medicine and health
Peer review:	Refereed
Publishing method:	Open Access / Gold Open-Access
	Listed in DOAJ
License:	CC-BY - Attribution 4.0 International