TY  - JOUR
A1  - Adnan, Hassan Sami
A1  - Srsic, Amanda
A1  - Venticich, Pete Milos
A1  - Townend, David M.R.
T1  - Using AI for mental health analysis and prediction in school surveys
JF  - European journal of public health
N2  - Background:
Childhood and adolescence are critical stages of life for mental health and well-being. Schools are a key setting for mental health promotion and illness prevention. One in five children and adolescents has a mental disorder, and about half of mental disorders begin before the age of 14. Beneficial and explainable artificial intelligence can replace current paper-based and online approaches to school mental health surveys. This can enhance data acquisition, interoperability, data-driven analysis, trust and compliance. This paper presents a model for using chatbots for non-obtrusive data collection and supervised machine learning models for data analysis, and discusses ethical considerations pertaining to the use of these models.

Methods:
For data acquisition, the proposed model uses chatbots which interact with students. The conversation log acts as the source of raw data for the machine learning. Pre-processing of the data is automated by filtering for keywords and phrases.
Existing survey results, obtained through current paper-based data collection methods, are evaluated by domain experts (health professionals). These can be used to create a test dataset to validate the machine learning models. Supervised learning can then be deployed to classify specific behaviour and mental health patterns.

Results:
We present a model that can be used to improve upon current paper-based data collection and manual data analysis methods. An open-source GitHub repository contains necessary tools and components of this model. Privacy is respected through rigorous observance of confidentiality and data protection requirements. Critical reflection on these ethics and law aspects is included in the project.

Conclusions:
This model strengthens mental health surveillance in schools. The same tools and components could be applied to other public health data. Future extensions of this model could also incorporate unsupervised learning to find clusters and patterns of unknown effects.
KW  - ethics
KW  - artificial intelligence
KW  - adolescent
KW  - child
KW  - confidentiality
KW  - health personnel
KW  - mental disorders
KW  - mental health
KW  - personal satisfaction
KW  - privacy
KW  - school (environment)
KW  - statutes and laws
KW  - public health medicine
KW  - surveillance
KW  - medical
KW  - prevention
KW  - datasets
KW  - machine learning
KW  - supervised machine learning
KW  - data analysis
Y1  - 2020
U6  - https://doi.org/10.1093/eurpub/ckaa165.336
SN  - 1101-1262
SN  - 1464-360X
VL  - 30
SP  - V125
EP  - V125
PB  - Oxford Univ. Press
CY  - Oxford [u.a.]
ER  - 
TY  - JOUR
A1  - Cope, Justin L.
A1  - Baukmann, Hannes A.
A1  - Klinger, Jörn E.
A1  - Ravarani, Charles N. J.
A1  - Böttinger, Erwin
A1  - Konigorski, Stefan
A1  - Schmidt, Marco F.
T1  - Interaction-based feature selection algorithm outperforms polygenic risk score in predicting Parkinson’s Disease status
JF  - Frontiers in genetics
N2  - Polygenic risk scores (PRS) aggregating results from genome-wide association studies are the state of the art in the prediction of susceptibility to complex traits or diseases, yet their predictive performance is limited for various reasons, not least of which is their failure to incorporate the effects of gene-gene interactions. Novel machine learning algorithms that use large amounts of data promise to find gene-gene interactions in order to build models with better predictive performance than PRS. Here, we present a data preprocessing step by using data-mining of contextual information to reduce the number of features, enabling machine learning algorithms to identify gene-gene interactions. We applied our approach to the Parkinson's Progression Markers Initiative (PPMI) dataset, an observational clinical study of 471 genotyped subjects (368 cases and 152 controls). With an AUC of 0.85 (95% CI = [0.72; 0.96]), the interaction-based prediction model outperforms the PRS (AUC of 0.58 (95% CI = [0.42; 0.81])). Furthermore, feature importance analysis of the model provided insights into the mechanism of Parkinson's disease. For instance, the model revealed an interaction of previously described drug target candidate genes TMEM175 and GAPDHP25. These results demonstrate that interaction-based machine learning models can improve genetic prediction models and might provide an answer to the missing heritability problem.
KW  - epistasis
KW  - machine learning
KW  - feature selection
KW  - parkinson's disease
KW  - PPMI (parkinson's progression markers initiative)
Y1  - 2021
U6  - https://doi.org/10.3389/fgene.2021.744557
SN  - 1664-8021
VL  - 12
PB  - Frontiers Media
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Döllner, Jürgen Roland Friedrich
T1  - Geospatial artificial intelligence
BT  - potentials of machine learning for 3D point clouds and geospatial digital twins
JF  - Journal of photogrammetry, remote sensing and geoinformation science : PFG : Photogrammetrie, Fernerkundung, Geoinformation
N2  - Artificial intelligence (AI) is fundamentally changing the way IT solutions are implemented and operated across all application domains, including the geospatial domain. This contribution outlines AI-based techniques for 3D point clouds and geospatial digital twins as generic components of geospatial AI. First, we briefly reflect on the term "AI" and outline technology developments needed to apply AI to IT solutions, seen from a software engineering perspective. Next, we characterize 3D point clouds as a key category of geodata and their role in creating the basis for geospatial digital twins; we explain the feasibility of machine learning (ML) and deep learning (DL) approaches for 3D point clouds. In particular, we argue that 3D point clouds can be seen as a corpus with properties similar to those of natural language corpora and formulate a "Naturalness Hypothesis" for 3D point clouds. In the main part, we introduce a workflow for interpreting 3D point clouds based on ML/DL approaches that derive domain-specific and application-specific semantics for 3D point clouds without having to create explicit spatial 3D models or explicit rule sets. Finally, examples show how ML/DL enables us to efficiently build and maintain base data for geospatial digital twins such as virtual 3D city models, indoor models, or building information models.
N2  - Geospatial artificial intelligence: potentials of machine learning for 3D point clouds and geospatial digital twins. Artificial intelligence (AI) is fundamentally changing the way IT solutions are implemented and operated in all application domains, including the geoinformation domain. In this contribution, we present AI-based techniques for 3D point clouds as one building block of geospatial AI. First, we briefly outline the term "AI" and the technological developments required to apply AI to IT solutions from a software engineering perspective. Next, we characterize 3D point clouds as a key category of geodata and their role in building spatial digital twins; we explain the feasibility of machine learning (ML) and deep learning (DL) approaches for 3D point clouds. In particular, we argue that 3D point clouds can be seen as a corpus with properties similar to those of natural language corpora and formulate a "Naturalness Hypothesis" for 3D point clouds. In the main part, we present a workflow for interpreting 3D point clouds based on ML/DL approaches that derive domain-specific and application-specific semantics for 3D point clouds without having to create explicit spatial 3D models or explicit rule sets. Finally, examples show how ML/DL makes it possible to efficiently build and maintain base data for spatial digital twins, such as virtual 3D city models, indoor models, or building information models.
KW  - geospatial artificial intelligence
KW  - machine learning
KW  - deep learning
KW  - 3D
KW  - point clouds
KW  - geospatial digital twins
KW  - 3D city models
Y1  - 2020
U6  - https://doi.org/10.1007/s41064-020-00102-3
SN  - 2512-2789
SN  - 2512-2819
VL  - 88
IS  - 1
SP  - 15
EP  - 24
PB  - Springer International Publishing
CY  - Cham
ER  - 
TY  - JOUR
A1  - Vaid, Akhil
A1  - Chan, Lili
A1  - Chaudhary, Kumardeep
A1  - Jaladanki, Suraj K.
A1  - Paranjpe, Ishan
A1  - Russak, Adam J.
A1  - Kia, Arash
A1  - Timsina, Prem
A1  - Levin, Matthew A.
A1  - He, John Cijiang
A1  - Böttinger, Erwin
A1  - Charney, Alexander W.
A1  - Fayad, Zahi A.
A1  - Coca, Steven G.
A1  - Glicksberg, Benjamin S.
A1  - Nadkarni, Girish N.
T1  - Predictive approaches for acute dialysis requirement and death in COVID-19
JF  - Clinical journal of the American Society of Nephrology : CJASN
N2  - Background and objectives
AKI treated with dialysis initiation is a common complication of coronavirus disease 2019 (COVID-19) among hospitalized patients. However, dialysis supplies and personnel are often limited. 

Design, setting, participants, & measurements
Using data from adult patients hospitalized with COVID-19 from five hospitals from the Mount Sinai Health System who were admitted between March 10 and December 26, 2020, we developed and validated several models (logistic regression, Least Absolute Shrinkage and Selection Operator (LASSO), random forest, and eXtreme Gradient Boosting [XGBoost; with and without imputation]) for predicting treatment with dialysis or death at various time horizons (1, 3, 5, and 7 days) after hospital admission. Patients admitted to the Mount Sinai Hospital were used for internal validation, whereas the other hospitals formed part of the external validation cohort. Features included demographics, comorbidities, and laboratory and vital signs within 12 hours of hospital admission.

Results
A total of 6093 patients (2442 in training and 3651 in external validation) were included in the final cohort. Of the different modeling approaches used, XGBoost without imputation had the highest area under the receiver operating characteristic (AUROC) curve on internal validation (range of 0.93-0.98) and area under the precision-recall curve (AUPRC; range of 0.78-0.82) for all time points. XGBoost without imputation also had the highest test parameters on external validation (AUROC range of 0.85-0.87, and AUPRC range of 0.27-0.54) across all time windows. XGBoost without imputation outperformed all models with higher precision and recall (mean difference in AUROC of 0.04; mean difference in AUPRC of 0.15). Features of creatinine, BUN, and red cell distribution width were major drivers of the model's prediction.

Conclusions
An XGBoost model without imputation for prediction of a composite outcome of either death or dialysis in patients positive for COVID-19 had the best performance, as compared with standard and other machine learning models.
KW  - COVID-19
KW  - dialysis
KW  - machine learning
KW  - prediction
KW  - AKI
Y1  - 2021
U6  - https://doi.org/10.2215/CJN.17311120
SN  - 1555-9041
SN  - 1555-905X
VL  - 16
IS  - 8
SP  - 1158
EP  - 1168
PB  - American Society of Nephrology
CY  - Washington
ER  - 
TY  - JOUR
A1  - Vaid, Akhil
A1  - Somani, Sulaiman
A1  - Russak, Adam J.
A1  - De Freitas, Jessica K.
A1  - Chaudhry, Fayzan F.
A1  - Paranjpe, Ishan
A1  - Johnson, Kipp W.
A1  - Lee, Samuel J.
A1  - Miotto, Riccardo
A1  - Richter, Felix
A1  - Zhao, Shan
A1  - Beckmann, Noam D.
A1  - Naik, Nidhi
A1  - Kia, Arash
A1  - Timsina, Prem
A1  - Lala, Anuradha
A1  - Paranjpe, Manish
A1  - Golden, Eddye
A1  - Danieletto, Matteo
A1  - Singh, Manbir
A1  - Meyer, Dara
A1  - O'Reilly, Paul F.
A1  - Huckins, Laura
A1  - Kovatch, Patricia
A1  - Finkelstein, Joseph
A1  - Freeman, Robert M.
A1  - Argulian, Edgar
A1  - Kasarskis, Andrew
A1  - Percha, Bethany
A1  - Aberg, Judith A.
A1  - Bagiella, Emilia
A1  - Horowitz, Carol R.
A1  - Murphy, Barbara
A1  - Nestler, Eric J.
A1  - Schadt, Eric E.
A1  - Cho, Judy H.
A1  - Cordon-Cardo, Carlos
A1  - Fuster, Valentin
A1  - Charney, Dennis S.
A1  - Reich, David L.
A1  - Böttinger, Erwin
A1  - Levin, Matthew A.
A1  - Narula, Jagat
A1  - Fayad, Zahi A.
A1  - Just, Allan C.
A1  - Charney, Alexander W.
A1  - Nadkarni, Girish N.
A1  - Glicksberg, Benjamin S.
T1  - Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: model development and validation
JF  - Journal of medical internet research : international scientific journal for medical research, information and communication on the internet ; JMIR
N2  - Background:
COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking.

Objective:
The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. 

Methods:
We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19-positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions.

Results:
Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction. 

Conclusions:
We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.
KW  - machine learning
KW  - COVID-19
KW  - electronic health record
KW  - TRIPOD
KW  - clinical
KW  - informatics
KW  - prediction
KW  - mortality
KW  - EHR
KW  - cohort
KW  - hospital
KW  - performance
Y1  - 2020
U6  - https://doi.org/10.2196/24018
SN  - 1439-4456
SN  - 1438-8871
VL  - 22
IS  - 11
PB  - Healthcare World
CY  - Richmond, Va.
ER  -