TY - BOOK A1 - Zhang, Shuhao A1 - Plauth, Max A1 - Eberhardt, Felix A1 - Polze, Andreas A1 - Lehmann, Jens A1 - Sejdiu, Gezim A1 - Jabeen, Hajira A1 - Servadei, Lorenzo A1 - Möstl, Christian A1 - Bär, Florian A1 - Netzeband, André A1 - Schmidt, Rainer A1 - Knigge, Marlene A1 - Hecht, Sonja A1 - Prifti, Loina A1 - Krcmar, Helmut A1 - Sapegin, Andrey A1 - Jaeger, David A1 - Cheng, Feng A1 - Meinel, Christoph A1 - Friedrich, Tobias A1 - Rothenberger, Ralf A1 - Sutton, Andrew M. A1 - Sidorova, Julia A. A1 - Lundberg, Lars A1 - Rosander, Oliver A1 - Sköld, Lars A1 - Di Varano, Igor A1 - van der Walt, Estée A1 - Eloff, Jan H. P. A1 - Fabian, Benjamin A1 - Baumann, Annika A1 - Ermakova, Tatiana A1 - Kelkel, Stefan A1 - Choudhary, Yash A1 - Cooray, Thilini A1 - Rodríguez, Jorge A1 - Medina-Pérez, Miguel Angel A1 - Trejo, Luis A. A1 - Barrera-Animas, Ari Yair A1 - Monroy-Borja, Raúl A1 - López-Cuevas, Armando A1 - Ramírez-Márquez, José Emmanuel A1 - Grohmann, Maria A1 - Niederleithinger, Ernst A1 - Podapati, Sasidhar A1 - Schmidt, Christopher A1 - Huegle, Johannes A1 - de Oliveira, Roberto C. L. A1 - Soares, Fábio Mendes A1 - van Hoorn, André A1 - Neumer, Tamas A1 - Willnecker, Felix A1 - Wilhelm, Mathias A1 - Kuster, Bernhard ED - Meinel, Christoph ED - Polze, Andreas ED - Beins, Karsten ED - Strotmann, Rolf ED - Seibold, Ulrich ED - Rödszus, Kurt ED - Müller, Jürgen T1 - HPI Future SOC Lab – Proceedings 2017 N2 - The “HPI Future SOC Lab” is a cooperation of the Hasso Plattner Institute (HPI) and industry partners. Its mission is to enable and promote exchange and interaction between the research community and the industry partners. The HPI Future SOC Lab provides researchers with free-of-charge access to a complete infrastructure of state-of-the-art hardware and software. 
This infrastructure includes components that might be too expensive for an ordinary research environment, such as servers with up to 64 cores and 2 TB main memory. The offerings address researchers particularly from, but not limited to, the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies. This technical report presents results of research projects executed in 2017. Selected projects presented their results on April 25 and November 15, 2017, at the Future SOC Lab Day events. N2 - Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie. Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien. In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2017 vorgestellt. Ausgewählte Projekte stellten ihre Ergebnisse am 25. April und 15. November 2017 im Rahmen der Future SOC Lab Tag Veranstaltungen vor. 
T3 - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 130 KW - Future SOC Lab KW - research projects KW - multicore architectures KW - In-Memory technology KW - cloud computing KW - machine learning KW - artificial intelligence KW - Future SOC Lab KW - Forschungsprojekte KW - Multicore Architekturen KW - In-Memory Technologie KW - Cloud Computing KW - maschinelles Lernen KW - Künstliche Intelligenz Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-433100 SN - 978-3-86956-475-3 SN - 1613-5652 SN - 2191-1665 IS - 130 PB - Universitätsverlag Potsdam CY - Potsdam ER - TY - THES A1 - Zali, Zahra T1 - Volcanic tremor analysis based on advanced signal processing concepts including music information retrieval (MIR) strategies N2 - Volcanoes are among the Earth’s most dynamic zones and are responsible for many changes on our planet. Volcano seismology aims to provide an understanding of the physical processes in volcanic systems and anticipate the style and timing of eruptions by analyzing the seismic records. Volcanic tremor signals are usually observed in the seismic records before or during volcanic eruptions. Their analysis helps to evaluate evolving volcanic activity and potentially to predict eruptions. Years of continuous seismic monitoring now provide useful information for operational eruption forecasting. The continuously growing amount of seismic recordings, however, poses a challenge for analysis, information extraction, and interpretation, to support timely decision making during volcanic crises. Furthermore, the complexity of eruption processes and precursory activities makes the analysis challenging. A challenge in studying seismic signals of volcanic origin is the coexistence of transient signal swarms and long-lasting volcanic tremor signals. 
Separating transient events from volcanic tremors can, therefore, contribute to improving our understanding of the underlying physical processes. Some similar issues (data reduction, source separation, extraction, and classification) are addressed in the context of music information retrieval (MIR). Acoustic and seismic recordings share a number of signal characteristics. This thesis goes beyond classical signal analysis techniques usually employed in seismology by exploiting similarities of seismic and acoustic signals and building the information retrieval strategy on the expertise developed in the field of MIR. First, inspired by the idea of harmonic–percussive separation (HPS) in musical signal processing, I have developed a method to extract harmonic volcanic tremor signals and to detect transient events from seismic recordings. This provides a clean tremor signal suitable for tremor investigation along with a characteristic function suitable for earthquake detection. Second, using HPS algorithms, I have developed a noise reduction technique for seismic signals. This method is especially useful for denoising recordings from ocean bottom seismometers, which are highly contaminated by noise. The advantage of this method compared to other denoising techniques is that it does not introduce distortion to the broadband earthquake waveforms, which makes it reliable for different applications in passive seismological analysis. Third, to address the challenge of extracting information from high-dimensional data and investigating the complex eruptive phases, I have developed an advanced machine learning model that results in a comprehensive signal processing scheme for volcanic tremors. Using this method, seismic signatures of major eruptive phases can be automatically detected. This helps to provide a chronology of the volcanic system. 
Also, this model is capable of detecting weak precursory volcanic tremors prior to the eruption, which could be used as an indicator of imminent eruptive activity. The extracted patterns of seismicity and their temporal variations finally provide an explanation for the transition mechanism between eruptive phases. N2 - Vulkane gehören zu den dynamischsten Zonen der Erde und sind für viele Veränderungen auf unserem Planeten verantwortlich. Die Vulkanseismologie zielt darauf ab, die physikalischen Prozesse in Vulkansystemen besser zu verstehen und die Art und den Zeitpunkt von Eruptionen durch die Analyse der seismischen Aufzeichnungen vorherzusagen. Die Signale vulkanischer Tremore werden normalerweise vor oder während Vulkanausbrüchen beobachtet und müssen überwacht werden, um die vulkanische Aktivität zu bewerten. Die Untersuchung vulkanischer Tremore ist ein wichtiger Teil der Vulkanüberwachung, die darauf abzielt, Anzeichen für das Erwachen oder Wiedererwachen von Vulkanen zu erkennen und möglicherweise Ausbrüche vorherzusagen. Mehrere Dekaden kontinuierlicher seismischer Überwachung liefern nützliche Informationen für die operative Eruptionsvorhersage. Die ständig wachsende Menge an seismischen Aufzeichnungen stellt jedoch eine Herausforderung für die Analyse, Informationsextraktion und Interpretation für die zeitnahe Entscheidungsfindung während Vulkankrisen dar. Darüber hinaus erschweren die Komplexität der Eruptionsprozesse und Vorläuferaktivitäten die Analyse. Eine Herausforderung bei der Untersuchung seismischer Signale vulkanischen Ursprungs ist die Koexistenz von transienten Signalschwärmen und lang anhaltenden vulkanischen Tremoren. Die Trennung dieser beiden Signaltypen kann daher dazu beitragen, unser Verständnis der zugrunde liegenden physikalischen Prozesse zu verbessern. Einige ähnliche Probleme (Datenreduktion, Quellentrennung, Extraktion und Klassifizierung) werden im Zusammenhang mit Music Information Retrieval (MIR, dt. 
etwa Musik-Informationsabruf) behandelt. Die Signaleigenschaften von akustischen und seismischen Aufzeichnungen weisen eine Reihe von Gemeinsamkeiten auf. Ich gehe über die klassischen Signalanalysetechniken hinaus, die normalerweise in der Seismologie verwendet werden, indem ich die Ähnlichkeiten von seismischen und akustischen Signalen und das Fachwissen aus dem Gebiet der MIR zur Informationsgewinnung nutze. Inspiriert von der Idee der harmonisch-perkussiven Trennung (HPS) in der musikalischen Signalverarbeitung habe ich eine Methode entwickelt, mit der harmonische vulkanische Erschütterungssignale extrahiert und transiente Ereignisse aus seismischen Aufzeichnungen erkannt werden können. Dies liefert ein sauberes Tremorsignal für die Tremoruntersuchung sowie eine charakteristische Funktion, die für die Erdbebenerkennung geeignet ist. Weiterhin habe ich unter Verwendung von HPS-Algorithmen eine Rauschunterdrückungstechnik für seismische Signale entwickelt. Diese kann zum Beispiel verwendet werden, um klarere Signale an Meeresbodenseismometern zu erhalten, die sonst durch zu starkes Rauschen überdeckt sind. Der Vorteil dieser Methode im Vergleich zu anderen Denoising-Techniken besteht darin, dass sie keine Verzerrung in der Breitbandantwort der Erdbebenwellen einführt, was sie für verschiedene Anwendungen in der passiven seismologischen Analyse zuverlässiger macht. Um Informationen aus hochdimensionalen Daten zu extrahieren und komplexe Eruptionsphasen zu untersuchen, habe ich ein fortschrittliches maschinelles Lernmodell entwickelt, aus dem ein umfassendes Signalverarbeitungsschema für vulkanische Erschütterungen abgeleitet werden kann. Mit dieser Methode können automatisch seismische Signaturen größerer Eruptionsphasen identifiziert werden. Dies ist nützlich, um die Chronologie eines Vulkansystems zu verstehen. 
Außerdem ist dieses Modell in der Lage, schwache vulkanische Vorläuferbeben zu erkennen, die als Indikator für bevorstehende Eruptionsaktivität verwendet werden könnten. Basierend auf den extrahierten Seismizitätsmustern und ihren zeitlichen Variationen liefere ich eine Erklärung für den Übergangsmechanismus zwischen verschiedenen Eruptionsphasen. KW - seismic signal processing KW - machine learning KW - volcano seismology KW - music information retrieval KW - noise reduction Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-610866 ER - TY - JOUR A1 - Wulff, Peter A1 - Mientus, Lukas A1 - Nowak, Anna A1 - Borowski, Andreas T1 - KI-basierte Auswertung von schriftlichen Unterrichtsreflexionen im Fach Physik und automatisierte Rückmeldung JF - PSI-Potsdam: Ergebnisbericht zu den Aktivitäten im Rahmen der Qualitätsoffensive Lehrerbildung (2019-2023) (Potsdamer Beiträge zur Lehrerbildung und Bildungsforschung ; 3) N2 - Für die Entwicklung professioneller Handlungskompetenzen angehender Lehrkräfte stellt die Unterrichtsreflexion ein wichtiges Instrument dar, um Theoriewissen und Praxiserfahrungen in Beziehung zu setzen. Die Auswertung von Unterrichtsreflexionen und eine entsprechende Rückmeldung stellt Forschende und Dozierende allerdings vor praktische wie theoretische Herausforderungen. Im Kontext der Forschung zu Künstlicher Intelligenz (KI) entwickelte Methoden bieten hier neue Potenziale. Der Beitrag stellt überblicksartig zwei Teilstudien vor, die mit Hilfe von KI-Methoden wie dem maschinellen Lernen untersuchen, inwieweit eine Auswertung von Unterrichtsreflexionen angehender Physiklehrkräfte auf Basis eines theoretisch abgeleiteten Reflexionsmodells und die automatisierte Rückmeldung hierzu möglich sind. Dabei wurden unterschiedliche Ansätze des maschinellen Lernens verwendet, um modellbasierte Klassifikation und Exploration von Themen in Unterrichtsreflexionen umzusetzen. Die Genauigkeit der Ergebnisse wurde vor allem durch sog. 
Große Sprachmodelle gesteigert, die auch den Transfer auf andere Standorte und Fächer ermöglichen. Für die fachdidaktische Forschung bedeuten sie jedoch wiederum neue Herausforderungen, wie etwa systematische Verzerrungen und Intransparenz von Entscheidungen. Dennoch empfehlen wir, die Potenziale der KI-basierten Methoden gründlicher zu erforschen und konsequent in der Praxis (etwa in Form von Webanwendungen) zu implementieren. N2 - For the development of professional competencies in pre-service teachers, reflection on teaching experiences is proposed as an important tool to link theoretical knowledge and practice. However, evaluating reflections and providing appropriate feedback poses challenges of both theoretical and practical nature to researchers and educators. Methods associated with artificial intelligence research offer new potentials to discover patterns in complex datasets like reflections, as well as to evaluate these automatically and create feedback. In this article, we provide an overview of two sub-studies that investigate, using artificial intelligence methods such as machine learning, to what extent an evaluation of reflections of pre-service physics teachers based on a theoretically derived reflection model and automated feedback are possible. Across the sub-studies, different machine learning approaches were used to implement model-based classification and exploration of topics in reflections. Large language models in particular increase the accuracy of the results and allow for transfer to other locations and disciplines. However, entirely new challenges arise for educational research in relation to large language models, such as systematic biases and lack of transparency in decisions. Despite these uncertainties, we recommend further exploring the potentials of artificial intelligence-based methods and implementing them consistently in practice (for example, in the form of web applications). 
KW - Künstliche Intelligenz KW - Maschinelles Lernen KW - Natural Language Processing KW - Reflexion KW - Professionalisierung KW - artificial intelligence KW - machine learning KW - natural language processing KW - reflexion KW - professionalization Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-616363 SN - 978-3-86956-568-2 SN - 2626-3556 SN - 2626-4722 IS - 3 SP - 103 EP - 115 PB - Universitätsverlag Potsdam CY - Potsdam ER - TY - JOUR A1 - Wulff, Peter A1 - Buschhüter, David A1 - Westphal, Andrea A1 - Nowak, Anna A1 - Becker, Lisa A1 - Robalino, Hugo A1 - Stede, Manfred A1 - Borowski, Andreas T1 - Computer-based classification of preservice physics teachers’ written reflections JF - Journal of science education and technology N2 - Reflecting in written form on one's teaching enactments has been considered a facilitator for teachers' professional growth in university-based preservice teacher education. Writing a structured reflection can be facilitated through external feedback. However, researchers noted that feedback in preservice teacher education often relies on holistic, rather than more content-based, analytic feedback because educators oftentimes lack resources (e.g., time) to provide more analytic feedback. To overcome this impediment to feedback for written reflection, advances in computer technology can be of use. Hence, this study sought to utilize techniques of natural language processing and machine learning to train a computer-based classifier that classifies preservice physics teachers' written reflections on their teaching enactments in a German university teacher education program. To do so, a reflection model was adapted to physics education. It was then tested to what extent the computer-based classifier could accurately classify the elements of the reflection model in segments of preservice physics teachers' written reflections. 
Multinomial logistic regression using word count as a predictor was found to yield acceptable average human-computer agreement (F1-score on held-out test dataset of 0.56) so that it might fuel further development towards an automated feedback tool that supplements existing holistic feedback for written reflections with data-based, analytic feedback. KW - reflection KW - teacher professional development KW - natural language processing KW - machine learning Y1 - 2020 U6 - https://doi.org/10.1007/s10956-020-09865-1 SN - 1059-0145 SN - 1573-1839 VL - 30 IS - 1 SP - 1 EP - 15 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Wilksch, Moritz A1 - Abramova, Olga T1 - PyFin-sentiment BT - towards a machine-learning-based model for deriving sentiment from financial tweets JF - International journal of information management data insights N2 - Responding to the poor performance of generic automated sentiment analysis solutions on domain-specific texts, we collect a dataset of 10,000 tweets discussing the topics of finance and investing. We manually assign each tweet its market sentiment, i.e., the investor’s anticipation of a stock’s future return. Using this data, we show that all existing sentiment models trained on adjacent domains struggle with accurate market sentiment analysis due to the task’s specialized vocabulary. Consequently, we design, train, and deploy our own sentiment model. It outperforms all previous models (VADER, NTUSD-Fin, FinBERT, TwitterRoBERTa) when evaluated on Twitter posts. On posts from a different platform, our model performs on par with BERT-based large language models. We achieve this result at a fraction of the training and inference costs due to the model’s simple design. We publish the artifact as a Python library to facilitate its use by future researchers and practitioners. 
KW - sentiment analysis KW - financial market sentiment KW - opinion mining KW - machine learning KW - deep learning Y1 - 2023 U6 - https://doi.org/10.1016/j.jjimei.2023.100171 SN - 2667-0968 VL - 3 IS - 1 PB - Elsevier CY - Amsterdam ER - TY - BOOK A1 - Weber, Benedikt T1 - Human pose estimation for decubitus prophylaxis T1 - Verwendung von Posenabschätzung zur Dekubitusprophylaxe N2 - Decubitus is one of the most relevant diseases in nursing and the most expensive to treat. It is caused by sustained pressure on tissue, so it particularly affects bed-bound patients. This work lays a foundation for pressure mattress-based decubitus prophylaxis by implementing a solution to the single-frame 2D Human Pose Estimation problem. For this, methods of Deep Learning are employed. Two approaches are examined, a coarse-to-fine Convolutional Neural Network for direct regression of joint coordinates and a U-Net for the derivation of probability distribution heatmaps. We conclude that training our models on a combined dataset of the publicly available Bodies at Rest and SLP data yields the best results. Furthermore, various preprocessing techniques are investigated, and a hyperparameter optimization is performed to discover an improved model architecture. Another finding indicates that the heatmap-based approach outperforms direct regression. This model achieves a mean per-joint position error of 9.11 cm for the Bodies at Rest data and 7.43 cm for the SLP data. We find that it generalizes well on data from mattresses other than those seen during training but has difficulties detecting the arms correctly. Additionally, we give a brief overview of the medical data annotation tool annoto we developed in the bachelor project and furthermore conclude that the Scrum framework and agile practices enhanced our development workflow. N2 - Dekubitus ist eine der relevantesten Krankheiten in der Krankenpflege und die kostspieligste in der Behandlung. 
Sie wird durch anhaltenden Druck auf Gewebe verursacht, betrifft also insbesondere bettlägerige Patienten. Diese Arbeit legt eine Grundlage für druckmatratzenbasierte Dekubitusprophylaxe, indem eine Lösung für das Einzelbild-2D-Posenabschätzungsproblem implementiert wird. Dafür werden Methoden des tiefen Lernens verwendet. Zwei Ansätze, basierend auf einem Gefalteten Neuronalen grob-zu-fein Netzwerk zur direkten Regression der Gelenkkoordinaten und auf einem U-Netzwerk zur Ableitung von Wahrscheinlichkeitsverteilungsbildern, werden untersucht. Wir schlussfolgern, dass das Training unserer Modelle auf einem kombinierten Datensatz, bestehend aus den frei verfügbaren Bodies at Rest und SLP Daten, die besten Ergebnisse liefert. Weiterhin werden diverse Vorverarbeitungsverfahren untersucht und eine Hyperparameteroptimierung zum Finden einer verbesserten Modellarchitektur durchgeführt. Der wahrscheinlichkeitsverteilungsbasierte Ansatz übertrifft die direkte Regression. Dieses Modell erreicht einen durchschnittlichen Pro-Gelenk-Positionsfehler von 9,11 cm auf den Bodies at Rest und von 7,43 cm auf den SLP Daten. Wir sehen, dass es gut auf Daten anderer als der im Training verwendeten Matratzen funktioniert, aber Schwierigkeiten mit der korrekten Erkennung der Arme hat. Weiterhin geben wir eine kurze Übersicht des medizinischen Datenannotationstools annoto, welches wir im Zusammenhang mit dem Bachelorprojekt entwickelt haben, und schlussfolgern außerdem, dass Scrum und agile Praktiken unseren Entwicklungsprozess verbessert haben. 
T3 - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 153 KW - machine learning KW - deep learning KW - convolutional neural networks KW - pose estimation KW - decubitus KW - telemedicine KW - maschinelles Lernen KW - tiefes Lernen KW - gefaltete neuronale Netze KW - Posenabschätzung KW - Dekubitus KW - Telemedizin Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-567196 SN - 978-3-86956-551-4 SN - 1613-5652 SN - 2191-1665 IS - 153 PB - Universitätsverlag Potsdam CY - Potsdam ER - TY - JOUR A1 - Vaid, Akhil A1 - Somani, Sulaiman A1 - Russak, Adam J. A1 - De Freitas, Jessica K. A1 - Chaudhry, Fayzan F. A1 - Paranjpe, Ishan A1 - Johnson, Kipp W. A1 - Lee, Samuel J. A1 - Miotto, Riccardo A1 - Richter, Felix A1 - Zhao, Shan A1 - Beckmann, Noam D. A1 - Naik, Nidhi A1 - Kia, Arash A1 - Timsina, Prem A1 - Lala, Anuradha A1 - Paranjpe, Manish A1 - Golden, Eddye A1 - Danieletto, Matteo A1 - Singh, Manbir A1 - Meyer, Dara A1 - O'Reilly, Paul F. A1 - Huckins, Laura A1 - Kovatch, Patricia A1 - Finkelstein, Joseph A1 - Freeman, Robert M. A1 - Argulian, Edgar A1 - Kasarskis, Andrew A1 - Percha, Bethany A1 - Aberg, Judith A. A1 - Bagiella, Emilia A1 - Horowitz, Carol R. A1 - Murphy, Barbara A1 - Nestler, Eric J. A1 - Schadt, Eric E. A1 - Cho, Judy H. A1 - Cordon-Cardo, Carlos A1 - Fuster, Valentin A1 - Charney, Dennis S. A1 - Reich, David L. A1 - Böttinger, Erwin A1 - Levin, Matthew A. A1 - Narula, Jagat A1 - Fayad, Zahi A. A1 - Just, Allan C. A1 - Charney, Alexander W. A1 - Nadkarni, Girish N. A1 - Glicksberg, Benjamin S. 
T1 - Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: model development and validation JF - Journal of medical internet research : international scientific journal for medical research, information and communication on the internet ; JMIR N2 - Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking. Objective: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. Methods: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19-positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions. 
Results: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction. Conclusions: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes. KW - machine learning KW - COVID-19 KW - electronic health record KW - TRIPOD KW - clinical KW - informatics KW - prediction KW - mortality KW - EHR KW - cohort KW - hospital KW - performance Y1 - 2020 U6 - https://doi.org/10.2196/24018 SN - 1439-4456 SN - 1438-8871 VL - 22 IS - 11 PB - Healthcare World CY - Richmond, Va. ER - TY - JOUR A1 - Vaid, Akhil A1 - Chan, Lili A1 - Chaudhary, Kumardeep A1 - Jaladanki, Suraj K. A1 - Paranjpe, Ishan A1 - Russak, Adam J. A1 - Kia, Arash A1 - Timsina, Prem A1 - Levin, Matthew A. A1 - He, John Cijiang A1 - Böttinger, Erwin A1 - Charney, Alexander W. A1 - Fayad, Zahi A. A1 - Coca, Steven G. A1 - Glicksberg, Benjamin S. A1 - Nadkarni, Girish N. 
T1 - Predictive approaches for acute dialysis requirement and death in COVID-19 JF - Clinical journal of the American Society of Nephrology : CJASN N2 - Background and objectives AKI treated with dialysis initiation is a common complication of coronavirus disease 2019 (COVID-19) among hospitalized patients. However, dialysis supplies and personnel are often limited. Design, setting, participants, & measurements Using data from adult patients hospitalized with COVID-19 from five hospitals from the Mount Sinai Health System who were admitted between March 10 and December 26, 2020, we developed and validated several models (logistic regression, Least Absolute Shrinkage and Selection Operator (LASSO), random forest, and eXtreme Gradient Boosting [XGBoost; with and without imputation]) for predicting treatment with dialysis or death at various time horizons (1, 3, 5, and 7 days) after hospital admission. Patients admitted to the Mount Sinai Hospital were used for internal validation, whereas the other hospitals formed part of the external validation cohort. Features included demographics, comorbidities, and laboratory and vital signs within 12 hours of hospital admission. Results A total of 6093 patients (2442 in training and 3651 in external validation) were included in the final cohort. Of the different modeling approaches used, XGBoost without imputation had the highest area under the receiver operating characteristic (AUROC) curve on internal validation (range of 0.93-0.98) and area under the precision-recall curve (AUPRC; range of 0.78-0.82) for all time points. XGBoost without imputation also had the highest test parameters on external validation (AUROC range of 0.85-0.87, and AUPRC range of 0.27-0.54) across all time windows. XGBoost without imputation outperformed all models with higher precision and recall (mean difference in AUROC of 0.04; mean difference in AUPRC of 0.15). 
Features of creatinine, BUN, and red cell distribution width were major drivers of the model's prediction. Conclusions An XGBoost model without imputation for prediction of a composite outcome of either death or dialysis in patients positive for COVID-19 had the best performance, as compared with standard and other machine learning models. KW - COVID-19 KW - dialysis KW - machine learning KW - prediction KW - AKI Y1 - 2021 U6 - https://doi.org/10.2215/CJN.17311120 SN - 1555-9041 SN - 1555-905X VL - 16 IS - 8 SP - 1158 EP - 1168 PB - American Society of Nephrology CY - Washington ER - TY - JOUR A1 - Tong, Hao A1 - Nikoloski, Zoran T1 - Machine learning approaches for crop improvement BT - leveraging phenotypic and genotypic big data JF - Journal of plant physiology : biochemistry, physiology, molecular biology and biotechnology of plants N2 - Highly efficient and accurate selection of elite genotypes can lead to dramatic shortening of the breeding cycle in major crops relevant for sustaining present demands for food, feed, and fuel. In contrast to classical approaches that emphasize the need for resource-intensive phenotyping at all stages of artificial selection, genomic selection dramatically reduces the need for phenotyping. Genomic selection relies on advances in machine learning and the availability of genotyping data to predict agronomically relevant phenotypic traits. Here we provide a systematic review of machine learning approaches applied for genomic selection of single and multiple traits in major crops in the past decade. We emphasize the need to gather data on intermediate phenotypes, e.g. metabolite, protein, and gene expression levels, along with developments of modeling techniques that can lead to further improvements of genomic selection. In addition, we provide a critical view of factors that affect genomic selection, with attention to transferability of models between different environments. 
Finally, we highlight the future aspects of integrating high-throughput molecular phenotypic data from omics technologies with biological networks for crop improvement. KW - genomic selection KW - genomic prediction KW - machine learning KW - multiple traits KW - multi-omics KW - GxE interaction Y1 - 2020 U6 - https://doi.org/10.1016/j.jplph.2020.153354 SN - 0176-1617 SN - 1618-1328 VL - 257 PB - Elsevier CY - München ER - TY - JOUR A1 - Steinberg, Andreas A1 - Vasyura-Bathke, Hannes A1 - Gaebler, Peter Jost A1 - Ohrnberger, Matthias A1 - Ceranna, Lars T1 - Estimation of seismic moment tensors using variational inference machine learning JF - Journal of geophysical research : Solid earth N2 - We present an approach for rapidly estimating full moment tensors of earthquakes and their parameter uncertainties based on short time windows of recorded seismic waveform data by considering deep learning of Bayesian Neural Networks (BNNs). The individual neural networks are trained on synthetic seismic waveform data and corresponding known earthquake moment-tensor parameters. A monitoring volume has been predefined to form a three-dimensional grid of locations and to train a BNN for each grid point. Variational inference on several of these networks allows us to consider several sources of error and how they affect the estimated full moment-tensor parameters and their uncertainties. In particular, we demonstrate how estimated parameter distributions are affected by uncertainties in the earthquake centroid location in space and time as well as in the assumed Earth structure model. We apply our approach as a proof of concept on seismic waveform recordings of aftershocks of the Ridgecrest 2019 earthquake with moment magnitudes ranging from Mw 2.7 to Mw 5.5. Overall, good agreement has been achieved between inferred parameter ensembles and independently estimated parameters using classical methods. 
Our approach is fast and robust and is therefore suitable for downstream analyses that need rapid estimates of the source mechanism for a large number of earthquakes. KW - seismology KW - machine learning KW - earthquake source KW - moment tensor KW - full waveform Y1 - 2021 U6 - https://doi.org/10.1029/2021JB022685 SN - 2169-9313 SN - 2169-9356 VL - 126 IS - 10 PB - American Geophysical Union CY - Washington ER -