TY - JOUR A1 - Ehrig, Lukas A1 - Wagner, Ann-Christin A1 - Wolter, Heike A1 - Correll, Christoph U. A1 - Geisel, Olga A1 - Konigorski, Stefan T1 - FASDetect as a machine learning-based screening app for FASD in youth with ADHD JF - npj Digital Medicine N2 - Fetal alcohol-spectrum disorder (FASD) is underdiagnosed and often misdiagnosed as attention-deficit/hyperactivity disorder (ADHD). Here, we develop a screening tool for FASD in youth with ADHD symptoms. To develop the prediction model, medical record data from a German University outpatient unit are assessed including 275 patients aged 0-19 years old with FASD with or without ADHD and 170 patients with ADHD without FASD aged 0-19 years old. We train 6 machine learning models based on 13 selected variables and evaluate their performance. Random forest models yield the best prediction models with a cross-validated AUC of 0.92 (95% confidence interval [0.84, 0.99]). Follow-up analyses indicate that a random forest model with 6 variables - body length and head circumference at birth, IQ, socially intrusive behaviour, poor memory and sleep disturbance - yields equivalent predictive accuracy. We implement the prediction model in a web-based app called FASDetect - a user-friendly, clinically scalable FASD risk calculator that is freely available at https://fasdetect.dhc-lab.hpi.de. 
KW - Medical research KW - Psychiatric disorders Y1 - 2023 U6 - https://doi.org/10.1038/s41746-023-00864-1 SN - 2398-6352 VL - 6 IS - 1 PB - Macmillan Publishers Limited CY - Basingstoke ER - TY - JOUR A1 - Slosarek, Tamara A1 - Ibing, Susanne A1 - Schormair, Barbara A1 - Heyne, Henrike A1 - Böttinger, Erwin A1 - Andlauer, Till A1 - Schurmann, Claudia T1 - Implementation and evaluation of personal genetic testing as part of genomics analysis courses in German universities JF - BMC Medical Genomics N2 - Purpose Due to the increasing application of genome analysis and interpretation in medical disciplines, professionals require adequate education. Here, we present the implementation of personal genotyping as an educational tool in two genomics courses targeting Digital Health students at the Hasso Plattner Institute (HPI) and medical students at the Technical University of Munich (TUM). Methods We compared and evaluated the courses and the students' perceptions of the course setup using questionnaires. Results During the course, students changed their attitudes towards genotyping (HPI: 79% [15 of 19], TUM: 47% [25 of 53]). Predominantly, students became more critical of personal genotyping (HPI: 73% [11 of 15], TUM: 72% [18 of 25]) and most students stated that genetic analyses should not be allowed without genetic counseling (HPI: 79% [15 of 19], TUM: 70% [37 of 53]). Students found the personal genotyping component useful (HPI: 89% [17 of 19], TUM: 92% [49 of 53]) and recommended its inclusion in future courses (HPI: 95% [18 of 19], TUM: 98% [52 of 53]). Conclusion Students perceived the personal genotyping component as valuable in the described genomics courses. The implementation described here can serve as an example for future courses in Europe. 
KW - Genomics education KW - Personal genotyping KW - Personalized medicine Y1 - 2023 U6 - https://doi.org/10.1186/s12920-023-01503-0 SN - 1755-8794 VL - 16 IS - 1 PB - BMC CY - London ER - TY - THES A1 - Taleb, Aiham T1 - Self-supervised deep learning methods for medical image analysis T1 - Selbstüberwachte Deep Learning Methoden für die medizinische Bildanalyse N2 - Deep learning has seen widespread application in many domains, mainly for its ability to learn data representations from raw input data. Nevertheless, its success has so far been coupled with the availability of large annotated (labelled) datasets. This is a requirement that is difficult to fulfil in several domains, such as in medical imaging. Annotation costs form a barrier in extending deep learning to clinically-relevant use cases. The labels associated with medical images are scarce, since the generation of expert annotations of multimodal patient data at scale is non-trivial, expensive, and time-consuming. This substantiates the need for algorithms that learn from the increasing amounts of unlabeled data. Self-supervised representation learning algorithms offer a pertinent solution, as they allow solving real-world (downstream) deep learning tasks with fewer annotations. Self-supervised approaches leverage unlabeled samples to acquire generic features about different concepts, enabling annotation-efficient downstream task solving subsequently. Nevertheless, medical images present multiple unique and inherent challenges for existing self-supervised learning approaches, which we seek to address in this thesis: (i) medical images are multimodal, and their multiple modalities are heterogeneous in nature and imbalanced in quantities, e.g. 
MRI and CT; (ii) medical scans are multi-dimensional, often in 3D instead of 2D; (iii) disease patterns in medical scans are numerous and their incidence exhibits a long-tail distribution, so it is oftentimes essential to fuse knowledge from different data modalities, e.g. genomics or clinical data, to capture disease traits more comprehensively; (iv) Medical scans usually exhibit more uniform color density distributions, e.g. in dental X-Rays, than natural images. Our proposed self-supervised methods meet these challenges, besides significantly reducing the amounts of required annotations. We evaluate our self-supervised methods on a wide array of medical imaging applications and tasks. Our experimental results demonstrate the obtained gains in both annotation-efficiency and performance; our proposed methods outperform many approaches from related literature. Additionally, in case of fusion with genetic modalities, our methods also allow for cross-modal interpretability. In this thesis, not only we show that self-supervised learning is capable of mitigating manual annotation costs, but also our proposed solutions demonstrate how to better utilize it in the medical imaging domain. Progress in self-supervised learning has the potential to extend deep learning algorithms application to clinical scenarios. N2 - Deep Learning findet in vielen Bereichen breite Anwendung, vor allem wegen seiner Fähigkeit, Datenrepräsentationen aus rohen Eingabedaten zu lernen. Dennoch war der Erfolg bisher an die Verfügbarkeit großer annotatierter Datensätze geknüpft. Dies ist eine Anforderung, die in verschiedenen Bereichen, z. B. in der medizinischen Bildgebung, schwer zu erfüllen ist. Die Kosten für die Annotation stellen ein Hindernis für die Ausweitung des Deep Learning auf klinisch relevante Anwendungsfälle dar. 
Die mit medizinischen Bildern verbundenen Annotationen sind rar, da die Erstellung von Experten Annotationen für multimodale Patientendaten in großem Umfang nicht trivial, teuer und zeitaufwändig ist. Dies unterstreicht den Bedarf an Algorithmen, die aus den wachsenden Mengen an unbeschrifteten Daten lernen. Selbstüberwachte Algorithmen für das Repräsentationslernen bieten eine mögliche Lösung, da sie die Lösung realer (nachgelagerter) Deep-Learning-Aufgaben mit weniger Annotationen ermöglichen. Selbstüberwachte Ansätze nutzen unannotierte Stichproben, um generisches Eigenschaften über verschiedene Konzepte zu erlangen und ermöglichen so eine annotationseffiziente Lösung nachgelagerter Aufgaben. Medizinische Bilder stellen mehrere einzigartige und inhärente Herausforderungen für existierende selbstüberwachte Lernansätze dar, die wir in dieser Arbeit angehen wollen: (i) medizinische Bilder sind multimodal, und ihre verschiedenen Modalitäten sind von Natur aus heterogen und in ihren Mengen unausgewogen, z.B. (ii) medizinische Scans sind mehrdimensional, oft in 3D statt in 2D; (iii) Krankheitsmuster in medizinischen Scans sind zahlreich und ihre Häufigkeit weist eine Long-Tail-Verteilung auf, so dass es oft unerlässlich ist, Wissen aus verschiedenen Datenmodalitäten, z. B. Genomik oder klinische Daten, zu verschmelzen, um Krankheitsmerkmale umfassender zu erfassen; (iv) medizinische Scans weisen in der Regel eine gleichmäßigere Farbdichteverteilung auf, z. B. in zahnmedizinischen Röntgenaufnahmen, als natürliche Bilder. Die von uns vorgeschlagenen selbstüberwachten Methoden adressieren diese Herausforderungen und reduzieren zudem die Menge der erforderlichen Annotationen erheblich. Wir evaluieren unsere selbstüberwachten Methoden in verschiedenen Anwendungen und Aufgaben der medizinischen Bildgebung. 
Unsere experimentellen Ergebnisse zeigen, dass die von uns vorgeschlagenen Methoden sowohl die Effizienz der Annotation als auch die Leistung steigern und viele Ansätze aus der verwandten Literatur übertreffen. Darüber hinaus ermöglichen unsere Methoden im Falle der Fusion mit genetischen Modalitäten auch eine modalübergreifende Interpretierbarkeit. In dieser Arbeit zeigen wir nicht nur, dass selbstüberwachtes Lernen in der Lage ist, die Kosten für manuelle Annotationen zu senken, sondern auch, wie man es in der medizinischen Bildgebung besser nutzen kann. Fortschritte beim selbstüberwachten Lernen haben das Potenzial, die Anwendung von Deep-Learning-Algorithmen auf klinische Szenarien auszuweiten. KW - Artificial Intelligence KW - machine learning KW - unsupervised learning KW - representation learning KW - Künstliche Intelligenz KW - maschinelles Lernen KW - Representationlernen KW - selbstüberwachtes Lernen Y1 - 2024 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-644089 ER - TY - JOUR A1 - Shams, Boshra A1 - Wang, Ziqian A1 - Roine, Timo A1 - Aydogan, Dogu Baran A1 - Vajkoczy, Peter A1 - Lippert, Christoph A1 - Picht, Thomas A1 - Fekonja, Lucius Samo T1 - Machine learning-based prediction of motor status in glioma patients using diffusion MRI metrics along the corticospinal tract JF - Brain communications N2 - Shams et al. report that glioma patients' motor status is predicted accurately by diffusion MRI metrics along the corticospinal tract based on support vector machine method, reaching an overall accuracy of 77%. They show that these metrics are more effective than demographic and clinical variables. Along tract statistics enables white matter characterization using various diffusion MRI metrics. These diffusion models reveal detailed insights into white matter microstructural changes with development, pathology and function. 
Here, we aim at assessing the clinical utility of diffusion MRI metrics along the corticospinal tract, investigating whether motor glioma patients can be classified with respect to their motor status. We retrospectively included 116 brain tumour patients suffering from either left or right supratentorial, unilateral World Health Organization Grades II, III and IV gliomas with a mean age of 53.51 +/- 16.32 years. Around 37% of patients presented with preoperative motor function deficits according to the Medical Research Council scale. At group level comparison, the highest non-overlapping diffusion MRI differences were detected in the superior portion of the tracts' profiles. Fractional anisotropy and fibre density decrease, while apparent diffusion coefficient, axial diffusivity and radial diffusivity increase. To predict motor deficits, we developed a method based on a support vector machine using histogram-based features of diffusion MRI tract profiles (e.g. mean, standard deviation, kurtosis and skewness), following a recursive feature elimination method. Our model achieved high performance (74% sensitivity, 75% specificity, 74% overall accuracy and 77% area under the curve). We found that apparent diffusion coefficient, fractional anisotropy and radial diffusivity contributed more than other features to the model. Incorporating the patient demographics and clinical features such as age, tumour World Health Organization grade, tumour location, gender and resting motor threshold did not affect the model's performance, revealing that these features were not as effective as microstructural measures. These results shed light on the potential patterns of tumour-related microstructural white matter changes in the prediction of functional deficits. 
KW - machine learning KW - support vector machine KW - tractography KW - diffusion MRI KW - corticospinal tract Y1 - 2022 U6 - https://doi.org/10.1093/braincomms/fcac141 SN - 2632-1297 VL - 4 IS - 3 PB - Oxford University Press CY - Oxford ER - TY - JOUR A1 - Ring, Raphaela M. A1 - Eisenmann, Clemens A1 - Kandil, Farid A1 - Steckhan, Nico A1 - Demmrich, Sarah A1 - Klatte, Caroline A1 - Kessler, Christian S. A1 - Jeitler, Michael A1 - Boschmann, Michael A1 - Michalsen, Andreas A1 - Blakeslee, Sarah B. A1 - Stöckigt, Barbara A1 - Stritter, Wiebke A1 - Koppold-Liebscher, Daniela A. T1 - Mental and behavioural responses to Bahá’í fasting: Looking behind the scenes of a religiously motivated intermittent fast using a mixed methods approach JF - Nutrients N2 - Background/Objective: Historically, fasting has been practiced not only for medical but also for religious reasons. Baha'is follow an annual religious intermittent dry fast of 19 days. We inquired into motivation behind and subjective health impacts of Baha'i fasting. Methods: A convergent parallel mixed methods design was embedded in a clinical single arm observational study. Semi-structured individual interviews were conducted before (n = 7), during (n = 8), and after fasting (n = 8). Three months after the fasting period, two focus group interviews were conducted (n = 5/n = 3). A total of 146 Baha'i volunteers answered an online survey at five time points before, during, and after fasting. Results: Fasting was found to play a central role for the religiosity of interviewees, implying changes in daily structures, spending time alone, engaging in religious practices, and experiencing social belonging. Results show an increase in mindfulness and well-being, which were accompanied by behavioural changes and experiences of self-efficacy and inner freedom. Survey scores point to an increase in mindfulness and well-being during fasting, while stress, anxiety, and fatigue decreased. 
Mindfulness remained elevated even three months after the fast. Conclusion: Baha'i fasting seems to enhance participants' mindfulness and well-being, lowering stress levels and reducing fatigue. Some of these effects lasted more than three months after fasting. KW - intermittent food restriction KW - mindfulness KW - self-efficacy KW - well-being KW - mixed methods KW - health behaviour KW - coping ability KW - religiously motivated KW - dry fasting Y1 - 2022 U6 - https://doi.org/10.3390/nu14051038 SN - 2072-6643 VL - 14 IS - 5 PB - MDPI CY - Basel ER - TY - BOOK A1 - Kuban, Robert A1 - Rotta, Randolf A1 - Nolte, Jörg A1 - Chromik, Jonas A1 - Beilharz, Jossekin Jakob A1 - Pirl, Lukas A1 - Friedrich, Tobias A1 - Lenzner, Pascal A1 - Weyand, Christopher A1 - Juiz, Carlos A1 - Bermejo, Belen A1 - Sauer, Joao A1 - Coelh, Leandro dos Santos A1 - Najafi, Pejman A1 - Pünter, Wenzel A1 - Cheng, Feng A1 - Meinel, Christoph A1 - Sidorova, Julia A1 - Lundberg, Lars A1 - Vogel, Thomas A1 - Tran, Chinh A1 - Moser, Irene A1 - Grunske, Lars A1 - Elsaid, Mohamed Esameldin Mohamed A1 - Abbas, Hazem M. A1 - Rula, Anisa A1 - Sejdiu, Gezim A1 - Maurino, Andrea A1 - Schmidt, Christopher A1 - Hügle, Johannes A1 - Uflacker, Matthias A1 - Nozza, Debora A1 - Messina, Enza A1 - Hoorn, André van A1 - Frank, Markus A1 - Schulz, Henning A1 - Alhosseini Almodarresi Yasin, Seyed Ali A1 - Nowicki, Marek A1 - Muite, Benson K. A1 - Boysan, Mehmet Can A1 - Bianchi, Federico A1 - Cremaschi, Marco A1 - Moussa, Rim A1 - Abdel-Karim, Benjamin M. 
A1 - Pfeuffer, Nicolas A1 - Hinz, Oliver A1 - Plauth, Max A1 - Polze, Andreas A1 - Huo, Da A1 - Melo, Gerard de A1 - Mendes Soares, Fábio A1 - Oliveira, Roberto Célio Limão de A1 - Benson, Lawrence A1 - Paul, Fabian A1 - Werling, Christian A1 - Windheuser, Fabian A1 - Stojanovic, Dragan A1 - Djordjevic, Igor A1 - Stojanovic, Natalija A1 - Stojnev Ilic, Aleksandra A1 - Weidmann, Vera A1 - Lowitzki, Leon A1 - Wagner, Markus A1 - Ifa, Abdessatar Ben A1 - Arlos, Patrik A1 - Megia, Ana A1 - Vendrell, Joan A1 - Pfitzner, Bjarne A1 - Redondo, Alberto A1 - Ríos Insua, David A1 - Albert, Justin Amadeus A1 - Zhou, Lin A1 - Arnrich, Bert A1 - Szabó, Ildikó A1 - Fodor, Szabina A1 - Ternai, Katalin A1 - Bhowmik, Rajarshi A1 - Campero Durand, Gabriel A1 - Shevchenko, Pavlo A1 - Malysheva, Milena A1 - Prymak, Ivan A1 - Saake, Gunter ED - Meinel, Christoph ED - Polze, Andreas ED - Beins, Karsten ED - Strotmann, Rolf ED - Seibold, Ulrich ED - Rödszus, Kurt ED - Müller, Jürgen T1 - HPI Future SOC Lab – Proceedings 2019 N2 - The “HPI Future SOC Lab” is a cooperation of the Hasso Plattner Institute (HPI) and industry partners. Its mission is to enable and promote exchange and interaction between the research community and the industry partners. The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores and 2 TB main memory. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies. This technical report presents results of research projects executed in 2019. Selected projects have presented their results on April 9th and November 12th 2019 at the Future SOC Lab Day events. 
N2 - Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie. Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien. In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2019 vorgestellt. Ausgewählte Projekte stellten ihre Ergebnisse am 09. April und 12. November 2019 im Rahmen des Future SOC Lab Tags vor. 
T3 - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 158 KW - Future SOC Lab KW - research projects KW - multicore architectures KW - in-memory technology KW - cloud computing KW - machine learning KW - artificial intelligence KW - Future SOC Lab KW - Forschungsprojekte KW - Multicore Architekturen KW - In-Memory Technologie KW - Cloud Computing KW - maschinelles Lernen KW - künstliche Intelligenz Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-597915 SN - 978-3-86956-564-4 SN - 1613-5652 SN - 2191-1665 IS - 158 PB - Universitätsverlag Potsdam CY - Potsdam ER - TY - THES A1 - Richly, Keven T1 - Memory-efficient data management for spatio-temporal applications BT - workload-driven fine-grained configuration optimization for storing spatio-temporal data in columnar In-memory databases N2 - The wide distribution of location-acquisition technologies means that large volumes of spatio-temporal data are continuously being accumulated. Positioning systems such as GPS enable the tracking of various moving objects' trajectories, which are usually represented by a chronologically ordered sequence of observed locations. The analysis of movement patterns based on detailed positional information creates opportunities for applications that can improve business decisions and processes in a broad spectrum of industries (e.g., transportation, traffic control, or medicine). Due to the large data volumes generated in these applications, the cost-efficient storage of spatio-temporal data is desirable, especially when in-memory database systems are used to achieve interactive performance requirements. To efficiently utilize the available DRAM capacities, modern database systems support various tuning possibilities to reduce the memory footprint (e.g., data compression) or increase performance (e.g., additional index structures). 
By considering horizontal data partitioning, we can independently apply different tuning options on a fine-grained level. However, the selection of cost and performance-balancing configurations is challenging, due to the vast number of possible setups consisting of mutually dependent individual decisions. In this thesis, we introduce multiple approaches to improve spatio-temporal data management by automatically optimizing diverse tuning options for the application-specific access patterns and data characteristics. Our contributions are as follows: (1) We introduce a novel approach to determine fine-grained table configurations for spatio-temporal workloads. Our linear programming (LP) approach jointly optimizes the (i) data compression, (ii) ordering, (iii) indexing, and (iv) tiering. We propose different models which address cost dependencies at different levels of accuracy to compute optimized tuning configurations for a given workload, memory budgets, and data characteristics. To yield maintainable and robust configurations, we further extend our LP-based approach to incorporate reconfiguration costs as well as optimizations for multiple potential workload scenarios. (2) To optimize the storage layout of timestamps in columnar databases, we present a heuristic approach for the workload-driven combined selection of a data layout and compression scheme. By considering attribute decomposition strategies, we are able to apply application-specific optimizations that reduce the memory footprint and improve performance. (3) We introduce an approach that leverages past trajectory data to improve the dispatch processes of transportation network companies. Based on location probabilities, we developed risk-averse dispatch strategies that reduce critical delays. (4) Finally, we used the use case of a transportation network company to evaluate our database optimizations on a real-world dataset. 
We demonstrate that workload-driven fine-grained optimizations allow us to reduce the memory footprint (up to 71% by equal performance) or increase the performance (up to 90% by equal memory size) compared to established rule-based heuristics. Individually, our contributions provide novel approaches to the current challenges in spatio-temporal data mining and database research. Combining them allows in-memory databases to store and process spatio-temporal data more cost-efficiently. N2 - Durch die starke Verbreitung von Systemen zur Positionsbestimmung werden fortlaufend große Mengen an Bewegungsdaten mit einem räumlichen und zeitlichen Bezug gesammelt. Ortungssysteme wie GPS ermöglichen, die Bewegungen verschiedener Objekte (z. B. Personen oder Fahrzeuge) nachzuverfolgen. Diese werden in der Regel durch eine chronologisch geordnete Abfolge beobachteter Aufenthaltsorte repräsentiert. Die Analyse von Bewegungsmustern auf der Grundlage detaillierter Positionsinformationen schafft in unterschiedlichsten Branchen (z. B. Transportwesen, Verkehrssteuerung oder Medizin) die Möglichkeit Geschäftsentscheidungen und -prozesse zu verbessern. Aufgrund der großen Datenmengen, die bei diesen Anwendungen auftreten, stellt die kosteneffiziente Speicherung von Bewegungsdaten eine Herausforderung dar. Dies ist insbesondere der Fall, wenn Hauptspeicherdatenbanken zur Speicherung eingesetzt werden, um die Anforderungen bezüglich interaktiver Antwortzeiten zu erfüllen. Um die verfügbaren Speicherkapazitäten effizient zu nutzen, unterstützen moderne Datenbanksysteme verschiedene Optimierungsmöglichkeiten, um den Speicherbedarf zu reduzieren (z. B. durch Datenkomprimierung) oder die Performance zu erhöhen (z. B. durch Indexstrukturen). Dabei ermöglicht eine horizontale Partitionierung der Daten, dass unabhängig voneinander verschiedene Optimierungen feingranular auf einzelnen Bereichen der Daten angewendet werden können. 
Die Auswahl von Konfigurationen, die sowohl die Kosten als auch Leistungsanforderungen berücksichtigen, ist jedoch aufgrund der großen Anzahl möglicher Kombinationen -- die aus voneinander abhängigen Einzelentscheidungen bestehen -- komplex. In dieser Dissertation präsentieren wir mehrere Ansätze zur Verbesserung der Datenverwaltung, indem wir die Auswahl verschiedener Datenbankoptimierungen automatisch für die anwendungsspezifischen Zugriffsmuster und Dateneigenschaften anpassen. Diesbezüglich leistet die vorliegende Dissertation die folgenden Beiträge: (1) Wir stellen einen neuen Ansatz vor, um feingranulare Tabellenkonfigurationen für räumlich-zeitliche Workloads zu bestimmen. In diesem Zusammenhang optimiert unser Linear Programming (LP) Ansatz gemeinsam (i) die Datenkompression, (ii) die Sortierung, (iii) die Indizierung und (iv) die Datenplatzierung. Hierzu schlagen wir verschiedene Modelle mit unterschiedlichen Kostenabhängigkeiten vor, um optimierte Konfigurationen für einen gegebenen Workload, ein Speicherbudget und die vorliegenden Dateneigenschaften zu berechnen. Durch die Erweiterung des LP-basierten Ansatzes zur Berücksichtigung von Modifikationskosten und verschiedener potentieller Workloads ist es möglich, die Wartbarkeit und Robustheit der bestimmten Tabellenkonfiguration zu erhöhen. (2) Um die Speicherung von Timestamps in spalten-orientierten Datenbanken zu optimieren, stellen wir einen heuristischen Ansatz für die kombinierte Auswahl eines Speicherlayouts und eines Kompressionsschemas vor. Zudem sind wir durch die Berücksichtigung von Strategien zur Aufteilung von Attributen in der Lage, anwendungsspezifische Optimierungen anzuwenden, die den Speicherbedarf reduzieren und die Performance verbessern. (3) Wir stellen einen Ansatz vor, der in der Vergangenheit beobachtete Bewegungsmuster nutzt, um die Zuweisungsprozesse von Vermittlungsdiensten zur Personenbeförderung zu verbessern. 
Auf der Grundlage von Standortwahrscheinlichkeiten haben wir verschiedene Strategien für die Vergabe von Fahraufträgen an Fahrer entwickelt, die kritische Verspätungen reduzieren. (4) Abschließend haben wir unsere Datenbankoptimierungen anhand eines realen Datensatzes eines Transportdienstleisters evaluiert. In diesem Zusammenhang zeigen wir, dass wir durch feingranulare workload-basierte Optimierungen den Speicherbedarf (um bis zu 71% bei vergleichbarer Performance) reduzieren oder die Performance (um bis zu 90% bei gleichem Speicherverbrauch) im Vergleich zu regelbasierten Heuristiken verbessern können. Die einzelnen Beiträge stellen neuartige Ansätze für aktuelle Herausforderungen im Bereich des Data Mining und der Datenbankforschung dar. In Kombination ermöglichen sie eine kosteneffizientere Speicherung und Verarbeitung von Bewegungsdaten in Hauptspeicherdatenbanken. KW - spatio-temporal data management KW - trajectory data KW - columnar databases KW - in-memory data management KW - database tuning KW - spaltenorientierte Datenbanken KW - Datenbankoptimierung KW - Hauptspeicher Datenmanagement KW - Datenverwaltung für Daten mit räumlich-zeitlichem Bezug KW - Trajektoriendaten Y1 - 2024 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-635473 ER - TY - JOUR A1 - Rosin, Paul L. A1 - Lai, Yu-Kun A1 - Mould, David A1 - Yi, Ran A1 - Berger, Itamar A1 - Doyle, Lars A1 - Lee, Seungyong A1 - Li, Chuan A1 - Liu, Yong-Jin A1 - Semmo, Amir A1 - Shamir, Ariel A1 - Son, Minjung A1 - Winnemöller, Holger T1 - NPRportrait 1.0: A three-level benchmark for non-photorealistic rendering of portraits JF - Computational visual media N2 - Recently, there has been an upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer (NST). 
However, the state of performance evaluation in this field is poor, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual, and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three-level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and NST) using the new benchmark dataset. KW - non-photorealistic rendering (NPR) KW - image stylization KW - style transfer KW - portrait KW - evaluation KW - benchmark Y1 - 2022 U6 - https://doi.org/10.1007/s41095-021-0255-3 SN - 2096-0433 SN - 2096-0662 VL - 8 IS - 3 SP - 445 EP - 465 PB - Springer Nature CY - London ER - TY - JOUR A1 - Vitagliano, Gerardo A1 - Hameed, Mazhar A1 - Jiang, Lan A1 - Reisener, Lucas A1 - Wu, Eugene A1 - Naumann, Felix T1 - Pollock: a data loading benchmark JF - Proceedings of the VLDB Endowment N2 - Any system at play in a data-driven project has a fundamental requirement: the ability to load data. The de-facto standard format to distribute and consume raw data is CSV. Yet, the plain text and flexible nature of this format make such files often difficult to parse and correctly load their content, requiring cumbersome data preparation steps. We propose a benchmark to assess the robustness of systems in loading data from non-standard CSV formats and with structural inconsistencies. 
First, we formalize a model to describe the issues that affect real-world files and use it to derive a systematic "pollution" process to generate dialects for any given grammar. Our benchmark leverages the pollution framework for the CSV format. To guide pollution, we have surveyed thousands of real-world, publicly available CSV files, recording the problems we encountered. We demonstrate the applicability of our benchmark by testing and scoring 16 different systems: popular CSV parsing frameworks, relational database tools, spreadsheet systems, and a data visualization tool. Y1 - 2023 U6 - https://doi.org/10.14778/3594512.3594518 SN - 2150-8097 VL - 16 IS - 8 SP - 1870 EP - 1882 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Wiemker, Veronika A1 - Bunova, Anna A1 - Neufeld, Maria A1 - Gornyi, Boris A1 - Yurasova, Elena A1 - Konigorski, Stefan A1 - Kalinina, Anna A1 - Kontsevaya, Anna A1 - Ferreira-Borges, Carina A1 - Probst, Charlotte T1 - Pilot study to evaluate usability and acceptability of the 'Animated Alcohol Assessment Tool' in Russian primary healthcare JF - Digital health N2 - Background and aims: Accurate and user-friendly assessment tools quantifying alcohol consumption are a prerequisite to effective prevention and treatment programmes, including Screening and Brief Intervention. Digital tools offer new potential in this field. We developed the ‘Animated Alcohol Assessment Tool’ (AAA-Tool), a mobile app providing an interactive version of the World Health Organization's Alcohol Use Disorders Identification Test (AUDIT) that facilitates the description of individual alcohol consumption via culturally informed animation features. 
This pilot study evaluated the Russia-specific version of the Animated Alcohol Assessment Tool with regard to (1) its usability and acceptability in a primary healthcare setting, (2) the plausibility of its alcohol consumption assessment results and (3) the adequacy of its Russia-specific vessel and beverage selection. Methods: Convenience samples of 55 patients (47% female) and 15 healthcare practitioners (80% female) in 2 Russian primary healthcare facilities self-administered the Animated Alcohol Assessment Tool and rated their experience on the Mobile Application Rating Scale – User Version. Usage data was automatically collected during app usage, and additional feedback on regional content was elicited in semi-structured interviews. Results: On average, patients completed the Animated Alcohol Assessment Tool in 6:38 min (SD = 2.49, range = 3.00–17.16). User satisfaction was good, with all subscale Mobile Application Rating Scale – User Version scores averaging >3 out of 5 points. A majority of patients (53%) and practitioners (93%) would recommend the tool to ‘many people’ or ‘everyone’. Assessed alcohol consumption was plausible, with a low number (14%) of logically impossible entries. Most patients reported the Animated Alcohol Assessment Tool to reflect all vessels (78%) and all beverages (71%) they typically used. Conclusion: High acceptability ratings by patients and healthcare practitioners, acceptable completion time, plausible alcohol usage assessment results and perceived adequacy of region-specific content underline the Animated Alcohol Assessment Tool's potential to provide a novel approach to alcohol assessment in primary healthcare. After its validation, the Animated Alcohol Assessment Tool might contribute to reducing alcohol-related harm by facilitating Screening and Brief Intervention implementation in Russia and beyond. 
KW - Alcohol use assessment KW - Alcohol Use Disorders Identification Test KW - screening tools KW - digital health KW - mobile applications KW - Russia KW - primary healthcare KW - usability KW - acceptability Y1 - 2022 U6 - https://doi.org/10.1177/20552076211074491 SN - 2055-2076 VL - 8 PB - Sage Publications CY - London ER -