TY  - JOUR
A1  - Bläsius, Thomas
A1  - Friedrich, Tobias
A1  - Krejca, Martin S.
A1  - Molitor, Louise
T1  - The impact of geometry on monochrome regions in the flip Schelling process
JF  - Computational geometry
N2  - Schelling's classical segregation model gives a coherent explanation for the widespread phenomenon of residential segregation. We introduce an agent-based saturated open-city variant, the Flip Schelling Process (FSP), in which agents, placed on a graph, have one of two types and, based on the predominant type in their neighborhood, decide whether to change their type, similar to a new agent arriving as soon as another agent leaves the vertex. We investigate the probability that an edge {u,v} is monochrome, i.e., that both vertices u and v have the same type in the FSP, and we provide a general framework for analyzing the influence of the underlying graph topology on residential segregation. In particular, for two adjacent vertices, we show that a highly decisive common neighborhood, i.e., a common neighborhood where the absolute value of the difference between the number of vertices with different types is high, supports segregation and, moreover, that large common neighborhoods are more decisive. As an application, we study the expected behavior of the FSP on two common random graph models with and without geometry: (1) For random geometric graphs, we show that the existence of an edge {u,v} makes a highly decisive common neighborhood for u and v more likely. Based on this, we prove the existence of a constant c>0 such that the expected fraction of monochrome edges after the FSP is at least 1/2+c. (2) For Erdős–Rényi graphs we show that large common neighborhoods are unlikely and that the expected fraction of monochrome edges after the FSP is at most 1/2+o(1). Our results indicate that the cluster structure of the underlying graph has a significant impact on the obtained segregation strength.
KW  - Agent-based model
KW  - Schelling segregation
KW  - Spin system
Y1  - 2022
U6  - https://doi.org/10.1016/j.comgeo.2022.101902
SN  - 0925-7721
SN  - 1879-081X
VL  - 108
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Ruipérez-Valiente, José A.
A1  - Staubitz, Thomas
A1  - Jenner, Matt
A1  - Halawa, Sherif
A1  - Zhang, Jiayin
A1  - Despujol, Ignacio
A1  - Maldonado-Mahauad, Jorge
A1  - Montoro, German
A1  - Peffer, Melanie
A1  - Rohloff, Tobias
A1  - Lane, Jenny
A1  - Turro, Carlos
A1  - Li, Xitong
A1  - Pérez-Sanagustín, Mar
A1  - Reich, Justin
T1  - Large scale analytics of global and regional MOOC providers: Differences in learners' demographics, preferences, and perceptions
JF  - Computers & education
N2  - Massive Open Online Courses (MOOCs) remarkably attracted global media attention, but the spotlight has been concentrated on a handful of English-language providers. While Coursera, edX, Udacity, and FutureLearn received most of the attention and scrutiny, an entirely new ecosystem of local MOOC providers was growing in parallel. This ecosystem is harder to study than the major players: they are spread around the world, have less staff devoted to maintaining research data, and operate in multiple languages with university and corporate regional partners. To better understand how online learning opportunities are expanding through this regional MOOC ecosystem, we created a research partnership among 15 different MOOC providers from nine countries. We gathered data from over eight million learners in six thousand MOOCs, and we conducted a large-scale survey with more than 10 thousand participants. From our analysis, we argue that these regional providers may be better positioned to meet the goals of expanding access to higher education in their regions than the better-known global providers. To make this claim we highlight three trends: first, regional providers attract a larger local population with more inclusive demographic profiles; second, students predominantly choose their courses based on topical interest, and regional providers do a better job at catering to those needs; and third, many students feel more at ease learning from institutions they already know and have references from. Our work raises the importance of local education in the global MOOC ecosystem, while calling for additional research and conversations across the diversity of MOOC providers.
KW  - Learning analytics
KW  - Educational data mining
KW  - Massive open online courses
KW  - Large scale analytics
KW  - Cultural factors
KW  - Equity
KW  - Distance learning
Y1  - 2022
U6  - https://doi.org/10.1016/j.compedu.2021.104426
SN  - 0360-1315
SN  - 1873-782X
VL  - 180
PB  - Elsevier
CY  - Oxford
ER  - 
TY  - THES
A1  - Haskamp, Thomas
T1  - Products design organizations
T1  - Produkte designen Organisationen
BT  - how industrial-aged companies accomplish digital product innovation
BT  - wie etablierte Industrieunternehmen digitale Produktinnovationen erreichen
N2  - The automotive industry is a prime example of digital technologies reshaping mobility. Connected, autonomous, shared, and electric (CASE) trends lead to newly emerging players that threaten existing industrial-aged companies. To respond, incumbents need to bridge the gap between contrasting product architecture and organizational principles in the physical and digital realms. Over-the-air (OTA) technology, which enables seamless software updates and on-demand feature additions for customers, is an example of CASE-driven digital product innovation. Through an extensive longitudinal case study of an OTA initiative by an industrial-aged automaker, this dissertation explores how incumbents accomplish digital product innovation. Building on modularity, liminality, and the mirroring hypothesis, it presents a process model that explains the triggers, mechanisms, and outcomes of this process. In contrast to the literature, the findings emphasize the primacy of addressing product architecture challenges over organizational ones and highlight the managerial implications for success.
N2  - Die Entwicklung neuer digitaler Produktinnovation erfordert in etablierten Industrieunternehmen die Integration von digitalen und physischen Elementen. Dies ist besonders in der Automobilindustrie sichtbar, wo der Trend zu vernetzter, autonomer, gemeinsam genutzter und elektrischer Mobilität zu einem neuen Wettbewerb führt, welcher etablierte Marktteilnehmer bedroht. Diese müssen lernen, wie die Integration von gegensätzlichen Produktarchitekturen und Organisationsprinzipien aus der digitalen und physischen Produktentwicklung funktioniert.
Die vorliegende Dissertation widmet sich diesem Problem. Basierend auf einer Fallstudie einer digitalen Produktinnovationsinitiative eines Premiummobilitätsanbieters rund um die Integration von Over-the-Air-Technologie für Software-Updates liefert sie wichtige Erkenntnisse. Erstens, etablierte Organisationen müssen ihre Produktarchitektur befähigen, um verschiedene Produktarchitekturprinzipien in Einklang zu bringen. Zweitens, verschiedene Produktentwicklungsprozesse pro Produktebene müssen aufeinander abgestimmt werden. Drittens, die Organisationsstruktur muss erweitert werden, um die verschiedenen Produktebenen abzubilden. Darüber hinaus müssen auch Ressourcenallokationsprozesse auf die Entwicklungsprozesse abgestimmt werden.
Basierend auf diesen Erkenntnissen und mit der bestehenden Fachliteratur wird in der Dissertation ein Prozessmodell entwickelt, welches erklären soll, wie etablierte Industrieunternehmen digitale Produktinnovation erreichen. Kernauslöser sind externer Marktdruck sowie existierende Architekturprinzipien. Wechselseitige Mechanismen wie die Befähigung der Produktarchitektur, die Erweiterung der Organisationsstruktur, die Anpassung der Produktentwicklungsprozesse und die Anpassung der Ressourcenallokationsprozesse erklären den Prozess, welcher in einer neuen Produktarchitektur sowie einer erweiterten Organisationsstruktur mündet. Der Forschungsbeitrag der Arbeit liegt im Bereich der digitalen Produktinnovation. Sie verlagert den Forschungsfokus auf Fragen der Produktarchitektur und verbindet diese durch Konzepte der Modularität mit organisatorischen Fragestellungen. Für die Praxis ergeben sich vier Hebel, die Entscheidungsträger/innen nutzen können, um die Fähigkeiten zur digitalen Produktinnovation zu stärken.
KW  - digital product innovation
KW  - digital transformation
KW  - digital innovation
KW  - digitale Produktinnovation
KW  - digitale Transformation
KW  - digitale Innovation
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-646954
ER  - 
TY  - THES
A1  - Lagodzinski, Julius Albert Gregor
T1  - Counting homomorphisms over fields of prime order
T1  - Zählen von Homomorphismen über Körper mit Primzahlordnung
N2  - Homomorphisms are a fundamental concept in mathematics expressing the similarity of structures. They provide a framework that captures many of the central problems of computer science with close ties to various other fields of science. Thus, many studies over the last four decades have been devoted to the algorithmic complexity of homomorphism problems. Despite their generality, it has been found that non-uniform homomorphism problems, where the target structure is fixed, frequently feature complexity dichotomies. Exploring the limits of these dichotomies represents the common goal of this line of research.

We investigate the problem of counting homomorphisms to a fixed structure over a finite field of prime order and its algorithmic complexity. Our emphasis is on graph homomorphisms and the resulting problem #_{p}Hom[H] for a graph H and a prime p. The main research question is how counting over a finite field of prime order affects the complexity. 

In the first part of this thesis, we tackle the research question in its generality and develop a framework for studying the complexity of counting problems based on category theory. In the absence of problem-specific details, results in the language of category theory provide a clear picture of the properties needed and highlight common ground between different branches of science. The proposed problem #Mor^{C}[B] of counting the number of morphisms to a fixed object B of C is abstract in nature and encompasses important problems like constraint satisfaction problems, which serve as a leading example for all our results. We find explanations and generalizations for a plethora of results in counting complexity. Our main technical result is that specific matrices of morphism counts are non-singular. The strength of this result lies in its algebraic nature. First, our proofs rely on carefully constructed systems of linear equations, which we know to be uniquely solvable. Second, by replacing the field over which the matrix is defined with a finite field of order p, we obtain analogous results for modular counting. For the latter, cancellations are implied by automorphisms of order p, but intriguingly we find that these present the only obstacle to translating our results from exact counting to modular counting. If we restrict our attention to reduced objects without automorphisms of order p, we obtain results analogous to those for exact counting. This is underscored by a confluent reduction that allows this restriction by constructing a reduced object for any given object. We emphasize the strength of the categorial perspective by applying the duality principle, which yields immediate consequences for the dual problem of counting the number of morphisms from a fixed object.

In the second part of this thesis, we focus on graphs and the problem #_{p}Hom[H]. We conjecture that automorphisms of order p capture all possible cancellations and that, for a reduced graph H, the problem #_{p}Hom[H] features a complexity dichotomy analogous to the one given for exact counting by Dyer and Greenhill. This serves as a generalization of the conjecture by Faben and Jerrum for the modulus 2. The criterion for tractability is that H is a collection of complete bipartite and reflexive complete graphs. From the findings of part one, we show that the conjectured dichotomy implies dichotomies for all quantum homomorphism problems, in particular counting vertex surjective homomorphisms and compactions modulo p. Since the tractable cases in the dichotomy are solved by trivial computations, the study of the intractable cases remains. As an initial problem in a series of reductions capable of implying hardness, we employ the problem of counting weighted independent sets in a bipartite graph modulo a prime p. A dichotomy for this problem is shown, stating that the trivial cases occurring when a weight is congruent modulo p to 0 are the only tractable cases. We reduce the possible structure of H to the bipartite case by a reduction to the restricted homomorphism problem #_{p}Hom^{bip}[H] of counting modulo p the number of homomorphisms between bipartite graphs that maintain a given order of bipartition. This reduction does not have an impact on the accessibility of the technical results, thanks to the generality of the findings of part one. In order to prove the conjecture, it suffices to show that for a connected bipartite graph that is not complete, #_{p}Hom^{bip}[H] is #_{p}P-hard. Through a rigorous structural study of bipartite graphs, we establish this result for the rich class of bipartite graphs that are (K_{3,3}\{e}, domino)-free. 
This overcomes in particular the substantial hurdle imposed by squares, which leads us to explore the global structure of H and prove the existence of explicit structures that imply hardness.
N2  - Homomorphismen sind ein grundlegendes Konzept der Mathematik, das die Ähnlichkeit von Strukturen ausdrückt. Sie bieten einen Rahmen, der viele der zentralen Probleme der Informatik umfasst und enge Verbindungen zu verschiedenen Wissenschaftsbereichen aufweist. Aus diesem Grund haben sich in den letzten vier Jahrzehnten viele Studien mit der algorithmischen Komplexität von Homomorphismusproblemen beschäftigt. Trotz ihrer Allgemeingültigkeit wurden Komplexitätsdichotomien häufig für nicht-uniforme Homomorphismusprobleme nachgewiesen, bei denen die Zielstruktur fixiert ist. Die Grenzen dieser Dichotomien zu erforschen, ist das gemeinsame Ziel dieses Forschungskalküls.

Wir untersuchen das Problem und seine algorithmische Komplexität, Homomorphismen zu einer festen Struktur über einem endlichen Körper mit Primzahlordnung zu zählen. Wir konzentrieren uns auf Graphenhomomorphismen und das daraus resultierende Problem #_{p}Hom[H] für einen Graphen H und eine Primzahl p. Die Hauptforschungsfrage ist, wie das Zählen über einem endlichen Körper mit Primzahlordnung die Komplexität beeinflusst. 

Im ersten Teil wird die Forschungsfrage in ihrer Allgemeinheit behandelt und ein Rahmen für die Untersuchung der Komplexität von Zählproblemen auf der Grundlage der Kategorientheorie entwickelt. Losgelöst von problemspezifischen Details liefern die Ergebnisse in der Sprache der Kategorientheorie ein klares Bild der benötigten Eigenschaften und zeigen Gemeinsamkeiten zwischen verschiedenen Wissenschaftsgebieten auf. Das vorgeschlagene Problem #Mor^{C}[B] des Zählens der Anzahl von Morphismen zu einem festen Objekt B von C ist abstrakter Natur und umfasst wichtige Probleme wie Constraint Satisfaction Problems, die als leitendes Beispiel für alle unsere Ergebnisse dienen. Wir finden Erklärungen und Verallgemeinerungen für eine Vielzahl von Ergebnissen in der Komplexitätstheorie von Zählproblemen. Unser wichtigstes technisches Ergebnis ist, dass bestimmte Matrizen von Morphismenzahlen nicht singulär sind. Die Stärke dieses Ergebnisses liegt in seiner algebraischen Natur. Erstens basieren unsere Beweise auf sorgfältig konstruierten linearen Gleichungssystemen, von denen wir wissen, dass sie eindeutig lösbar sind. Zweitens, indem wir den Körper, über dem die Matrix definiert ist, durch einen endlichen Körper der Ordnung p ersetzen, erhalten wir analoge Ergebnisse für das modulare Zählen. Für letztere sind Annullierungen durch Automorphismen der Ordnung p impliziert, aber faszinierenderweise stellen diese das einzige Hindernis für die Übertragung unserer Ergebnisse von der exakten auf die modulare Zählung dar. Wenn wir unsere Aufmerksamkeit auf reduzierte Objekte ohne Automorphismen der Ordnung p beschränken, erhalten wir Ergebnisse, die zu denen des exakten Zählens analog sind. Dies wird durch eine konfluente Reduktion unterstrichen, die für jedes beliebige Objekt ein reduziertes Objekt konstruiert. 
Wir heben die Stärke der kategorialen Perspektive durch die Anwendung des Dualitätsprinzips hervor, das direkte Konsequenzen für das duale Problem des Zählens der Anzahl der Morphismen von einem fixen Objekt aus liefert.

Im zweiten Teil konzentrieren wir uns auf Graphen und das Problem #_{p}Hom[H]. Wir stellen die Vermutung auf, dass Automorphismen der Ordnung p alle möglichen Annullierungen erklären und dass das Problem #_{p}Hom[H] für einen reduzierten Graphen H eine Komplexitätsdichotomie analog zu der aufweist, die von Dyer und Greenhill für das exakte Zählen bewiesen wurde. Dies stellt eine Verallgemeinerung der Vermutung von Faben und Jerrum für den Modulus 2 dar. Das Kriterium für die effiziente Lösbarkeit ist, dass H lediglich aus vollständigen bipartiten und reflexiven vollständigen Graphen besteht. Basierend auf den Ergebnissen des ersten Teils zeigen wir, dass die Vermutung Dichotomien für alle Quantenhomomorphismenprobleme impliziert, insbesondere für das Zählen modulo p von Homomorphismen surjektiv auf Knoten und von Verdichtungen. Da die effizient lösbaren Fälle in der Dichotomie durch triviale Berechnungen gelöst werden, bleibt es, die unlösbaren Fälle zu untersuchen. Als erstes Problem in einer Reihe von Reduktionen, deren Ziel es ist, Härte zu implizieren, verwenden wir das Problem des Zählens gewichteter unabhängiger Mengen in einem bipartiten Graphen modulo p. Für dieses Problem beweisen wir eine Dichotomie, die besagt, dass nur die trivialen Fälle effizient lösbar sind. Diese treten auf, wenn ein Gewicht kongruent modulo p zu 0 ist. Durch eine Reduktion auf das eingeschränkte Homomorphismusproblem #_{p}Hom^{bip}[H] reduzieren wir die mögliche Struktur von H auf den bipartiten Fall. Hierbei handelt es sich um das Problem des Zählens modulo p der Homomorphismen zwischen bipartiten Graphen, die eine gegebene Ordnung der Bipartition erhalten. Dank der Allgemeingültigkeit der Ergebnisse des ersten Teils hat diese Reduktion keinen Einfluss auf die Verfügbarkeit der technischen Ergebnisse. Für einen Beweis der Vermutung genügt es zu zeigen, dass #_{p}Hom^{bip}[H] für einen zusammenhängenden und nicht vollständigen bipartiten Graphen #_{p}P-schwer ist. 
Durch eine rigorose Untersuchung der Struktur von bipartiten Graphen beweisen wir dieses Ergebnis für die umfangreiche Klasse von bipartiten Graphen, die (K_{3,3}\{e}, domino)-frei sind. Dies überwindet insbesondere die substantielle Hürde, die durch Quadrate gegeben ist und uns dazu veranlasst, die globale Struktur von H zu untersuchen und die Existenz expliziter Strukturen zu beweisen, die Härte implizieren.
KW  - complexity theory
KW  - (modular) counting
KW  - relational structures
KW  - categories
KW  - homomorphisms
KW  - Zählen
KW  - Kategorien
KW  - Komplexitätstheorie
KW  - Homomorphismen
KW  - relationale Strukturen
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-646037
ER  - 
TY  - JOUR
A1  - Essen, Anna
A1  - Stern, Ariel Dora
A1  - Haase, Christoffer Bjerre
A1  - Car, Josip
A1  - Greaves, Felix
A1  - Paparova, Dragana
A1  - Vandeput, Steven
A1  - Wehrens, Rik
A1  - Bates, David W.
T1  - Health app policy
BT  - international comparison of nine countries' approaches
JF  - npj digital medicine
N2  - An abundant and growing supply of digital health applications (apps) exists in the commercial tech-sector, which can be bewildering for clinicians, patients, and payers. A growing challenge for the health care system is therefore to facilitate the identification of safe and effective apps for health care practitioners and patients to generate the most health benefit as well as guide payer coverage decisions. Nearly all developed countries are attempting to define policy frameworks to improve decision-making, patient care, and health outcomes in this context. This study compares the national policy approaches currently in development/use for health apps in nine countries. We used secondary data, combined with a detailed review of policy and regulatory documents, and interviews with key individuals and experts in the field of digital health policy to collect data about implemented and planned policies and initiatives. We found that most approaches aim for centralized pipelines for health app approvals, although some countries are adding decentralized elements. While the countries studied are taking diverse paths, there is nevertheless broad, international convergence in terms of requirements in the areas of transparency, health content, interoperability, and privacy and security. The sheer number of apps on the market in most countries represents a challenge for clinicians and patients. Our analyses of the relevant policies identified challenges in areas such as reimbursement, safety, and privacy and suggest that more regulatory work is needed in the areas of operationalization, implementation and international transferability of approvals. Cross-national efforts are needed around regulation and for countries to realize the benefits of these technologies.
Y1  - 2022
U6  - https://doi.org/10.1038/s41746-022-00573-1
SN  - 2398-6352
VL  - 5
IS  - 1
PB  - Macmillan Publishers Limited
CY  - Basingstoke
ER  - 
TY  - JOUR
A1  - Kühne, Katharina
A1  - Herbold, Erika
A1  - Bendel, Oliver
A1  - Zhou, Yuefang
A1  - Fischer, Martin H.
T1  - “Ick bin een Berlina”
BT  - dialect proficiency impacts a robot’s trustworthiness and competence evaluation
JF  - Frontiers in robotics and AI
N2  - Background: Robots are increasingly used as interaction partners with humans. Social robots are designed to follow expected behavioral norms when engaging with humans and are available with different voices and even accents. Some studies suggest that people prefer robots to speak in the user’s dialect, while others indicate a preference for different dialects.

Methods: Our study examined the impact of the Berlin dialect on the perceived trustworthiness and competence of a robot. One hundred and twenty German native speakers (mean age = 32 years, SD = 12 years) watched an online video featuring a NAO robot speaking either in the Berlin dialect or standard German and assessed its trustworthiness and competence.

Results: We found a positive relationship between participants’ self-reported Berlin dialect proficiency and the perceived trustworthiness of the dialect-speaking robot. Only when controlling for demographic factors was there a positive association between participants’ dialect proficiency, dialect performance, and their assessment of the robot’s competence for the standard German-speaking robot. Participants’ age, gender, length of residency in Berlin, and device used to respond also influenced assessments. Finally, the robot’s competence positively predicted its trustworthiness.

Discussion: Our results inform the design of social robots and emphasize the importance of device control in online experiments.
KW  - competence
KW  - dialect
KW  - human-robot interaction
KW  - robot voice
KW  - social robot
KW  - trust
Y1  - 2024
U6  - https://doi.org/10.3389/frobt.2023.1241519
SN  - 2296-9144
VL  - 10
PB  - Frontiers Media S.A.
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Dressler, Falko
A1  - Chiasserini, Carla Fabiana
A1  - Fitzek, Frank H. P.
A1  - Karl, Holger
A1  - Lo Cigno, Renato
A1  - Capone, Antonio
A1  - Casetti, Claudio
A1  - Malandrino, Francesco
A1  - Mancuso, Vincenzo
A1  - Klingler, Florian
A1  - Rizzo, Gianluca
T1  - V-Edge
BT  - virtual edge computing as an enabler for novel microservices and cooperative computing
JF  - IEEE network
N2  - As we move from 5G to 6G, edge computing is one of the concepts that needs revisiting. Its core idea is still intriguing: Instead of sending all data and tasks from an end user's device to the cloud, possibly covering thousands of kilometers and introducing delays lower-bounded by propagation speed, edge servers deployed in close proximity to the user (e.g., at some base station) serve as a proxy for the cloud. This is particularly interesting for upcoming machine-learning-based intelligent services, which require substantial computational and networking performance for continuous model training. However, this promising idea is hampered by the limited number of such edge servers. In this article, we discuss a way forward, namely the V-Edge concept. V-Edge helps bridge the gap between cloud, edge, and fog by virtualizing all available resources including the end users' devices and making these resources widely available. Thus, V-Edge acts as an enabler for novel microservices as well as cooperative computing solutions in next-generation networks. We introduce the general V-Edge architecture, and we characterize some of the key research challenges to overcome in order to enable widespread and intelligent edge services.
KW  - Training
KW  - Performance evaluation
KW  - Cloud computing
KW  - Microservice architectures
KW  - Computer architecture
KW  - Delays
KW  - Servers
Y1  - 2022
U6  - https://doi.org/10.1109/MNET.001.2100491
SN  - 0890-8044
SN  - 1558-156X
VL  - 36
IS  - 3
SP  - 24
EP  - 31
PB  - Inst. of Electr. and Electronics Engineers
CY  - Piscataway
ER  - 
TY  - JOUR
A1  - Ehrig, Lukas
A1  - Wagner, Ann-Christin
A1  - Wolter, Heike
A1  - Correll, Christoph U.
A1  - Geisel, Olga
A1  - Konigorski, Stefan
T1  - FASDetect as a machine learning-based screening app for FASD in youth with ADHD
JF  - npj Digital Medicine
N2  - Fetal alcohol spectrum disorder (FASD) is underdiagnosed and often misdiagnosed as attention-deficit/hyperactivity disorder (ADHD). Here, we develop a screening tool for FASD in youth with ADHD symptoms. To develop the prediction model, we assess medical record data from a German university outpatient unit, including 275 patients aged 0-19 years with FASD, with or without ADHD, and 170 patients aged 0-19 years with ADHD but without FASD. We train 6 machine learning models based on 13 selected variables and evaluate their performance. Random forest models yield the best prediction models with a cross-validated AUC of 0.92 (95% confidence interval [0.84, 0.99]). Follow-up analyses indicate that a random forest model with 6 variables - body length and head circumference at birth, IQ, socially intrusive behaviour, poor memory and sleep disturbance - yields equivalent predictive accuracy. We implement the prediction model in a web-based app called FASDetect - a user-friendly, clinically scalable FASD risk calculator that is freely available at https://fasdetect.dhc-lab.hpi.de.
KW  - Medical research
KW  - Psychiatric disorders
Y1  - 2023
U6  - https://doi.org/10.1038/s41746-023-00864-1
SN  - 2398-6352
VL  - 6
IS  - 1
PB  - Macmillan Publishers Limited
CY  - Basingstoke
ER  - 
TY  - JOUR
A1  - Slosarek, Tamara
A1  - Ibing, Susanne
A1  - Schormair, Barbara
A1  - Heyne, Henrike
A1  - Böttinger, Erwin
A1  - Andlauer, Till
A1  - Schurmann, Claudia
T1  - Implementation and evaluation of personal genetic testing as part of genomics analysis courses in German universities
JF  - BMC Medical Genomics
N2  - Purpose
Due to the increasing application of genome analysis and interpretation in medical disciplines, professionals require adequate education. Here, we present the implementation of personal genotyping as an educational tool in two genomics courses targeting Digital Health students at the Hasso Plattner Institute (HPI) and medical students at the Technical University of Munich (TUM).

Methods
We compared and evaluated the courses and the students' perceptions of the course setup using questionnaires.

Results
During the course, students changed their attitudes towards genotyping (HPI: 79% [15 of 19], TUM: 47% [25 of 53]). Predominantly, students became more critical of personal genotyping (HPI: 73% [11 of 15], TUM: 72% [18 of 25]) and most students stated that genetic analyses should not be allowed without genetic counseling (HPI: 79% [15 of 19], TUM: 70% [37 of 53]). Students found the personal genotyping component useful (HPI: 89% [17 of 19], TUM: 92% [49 of 53]) and recommended its inclusion in future courses (HPI: 95% [18 of 19], TUM: 98% [52 of 53]).

Conclusion
Students perceived the personal genotyping component as valuable in the described genomics courses. The implementation described here can serve as an example for future courses in Europe.
KW  - Genomics education
KW  - Personal genotyping
KW  - Personalized medicine
Y1  - 2023
U6  - https://doi.org/10.1186/s12920-023-01503-0
SN  - 1755-8794
VL  - 16
IS  - 1
PB  - BMC
CY  - London
ER  - 
TY  - THES
A1  - Taleb, Aiham
T1  - Self-supervised deep learning methods for medical image analysis
T1  - Selbstüberwachte Deep Learning Methoden für die medizinische Bildanalyse
N2  - Deep learning has seen widespread application in many domains, mainly for its ability to learn data representations from raw input data. Nevertheless, its success has so far been coupled with the availability of large annotated (labelled) datasets. This is a requirement that is difficult to fulfil in several domains, such as in medical imaging. Annotation costs form a barrier in extending deep learning to clinically-relevant use cases. The labels associated with medical images are scarce, since the generation of expert annotations of multimodal patient data at scale is non-trivial, expensive, and time-consuming. This substantiates the need for algorithms that learn from the increasing amounts of unlabeled data. Self-supervised representation learning algorithms offer a pertinent solution, as they allow solving real-world (downstream) deep learning tasks with fewer annotations. Self-supervised approaches leverage unlabeled samples to acquire generic features about different concepts, enabling annotation-efficient downstream task solving subsequently.
Nevertheless, medical images present multiple unique and inherent challenges for existing self-supervised learning approaches, which we seek to address in this thesis: (i) medical images are multimodal, and their multiple modalities are heterogeneous in nature and imbalanced in quantities, e.g. MRI and CT; (ii) medical scans are multi-dimensional, often in 3D instead of 2D; (iii) disease patterns in medical scans are numerous and their incidence exhibits a long-tail distribution, so it is often essential to fuse knowledge from different data modalities, e.g. genomics or clinical data, to capture disease traits more comprehensively; (iv) medical scans usually exhibit more uniform color density distributions, e.g. in dental X-rays, than natural images. Our proposed self-supervised methods meet these challenges while also significantly reducing the amount of required annotations.
We evaluate our self-supervised methods on a wide array of medical imaging applications and tasks. Our experimental results demonstrate the obtained gains in both annotation efficiency and performance; our proposed methods outperform many approaches from related literature. Additionally, in the case of fusion with genetic modalities, our methods also allow for cross-modal interpretability. In this thesis, we not only show that self-supervised learning is capable of mitigating manual annotation costs, but our proposed solutions also demonstrate how to better utilize it in the medical imaging domain. Progress in self-supervised learning has the potential to extend the application of deep learning algorithms to clinical scenarios.
N2  - Deep Learning findet in vielen Bereichen breite Anwendung, vor allem wegen seiner Fähigkeit, Datenrepräsentationen aus rohen Eingabedaten zu lernen. Dennoch war der Erfolg bisher an die Verfügbarkeit großer annotierter Datensätze geknüpft. Dies ist eine Anforderung, die in verschiedenen Bereichen, z. B. in der medizinischen Bildgebung, schwer zu erfüllen ist. Die Kosten für die Annotation stellen ein Hindernis für die Ausweitung des Deep Learning auf klinisch relevante Anwendungsfälle dar. Die mit medizinischen Bildern verbundenen Annotationen sind rar, da die Erstellung von Expertenannotationen für multimodale Patientendaten in großem Umfang nicht trivial, teuer und zeitaufwändig ist. Dies unterstreicht den Bedarf an Algorithmen, die aus den wachsenden Mengen an unbeschrifteten Daten lernen. Selbstüberwachte Algorithmen für das Repräsentationslernen bieten eine mögliche Lösung, da sie die Lösung realer (nachgelagerter) Deep-Learning-Aufgaben mit weniger Annotationen ermöglichen. Selbstüberwachte Ansätze nutzen unannotierte Stichproben, um generische Eigenschaften über verschiedene Konzepte zu erlangen und ermöglichen so eine annotationseffiziente Lösung nachgelagerter Aufgaben.
Medizinische Bilder stellen mehrere einzigartige und inhärente Herausforderungen für existierende selbstüberwachte Lernansätze dar, die wir in dieser Arbeit angehen wollen: (i) medizinische Bilder sind multimodal, und ihre verschiedenen Modalitäten sind von Natur aus heterogen und in ihren Mengen unausgewogen, z. B. MRT und CT; (ii) medizinische Scans sind mehrdimensional, oft in 3D statt in 2D; (iii) Krankheitsmuster in medizinischen Scans sind zahlreich und ihre Häufigkeit weist eine Long-Tail-Verteilung auf, so dass es oft unerlässlich ist, Wissen aus verschiedenen Datenmodalitäten, z. B. Genomik oder klinische Daten, zu verschmelzen, um Krankheitsmerkmale umfassender zu erfassen; (iv) medizinische Scans weisen in der Regel eine gleichmäßigere Farbdichteverteilung auf, z. B. in zahnmedizinischen Röntgenaufnahmen, als natürliche Bilder. Die von uns vorgeschlagenen selbstüberwachten Methoden adressieren diese Herausforderungen und reduzieren zudem die Menge der erforderlichen Annotationen erheblich.
Wir evaluieren unsere selbstüberwachten Methoden in verschiedenen Anwendungen und Aufgaben der medizinischen Bildgebung. Unsere experimentellen Ergebnisse zeigen, dass die von uns vorgeschlagenen Methoden sowohl die Effizienz der Annotation als auch die Leistung steigern und viele Ansätze aus der verwandten Literatur übertreffen. Darüber hinaus ermöglichen unsere Methoden im Falle der Fusion mit genetischen Modalitäten auch eine modalübergreifende Interpretierbarkeit. In dieser Arbeit zeigen wir nicht nur, dass selbstüberwachtes Lernen in der Lage ist, die Kosten für manuelle Annotationen zu senken, sondern auch, wie man es in der medizinischen Bildgebung besser nutzen kann. Fortschritte beim selbstüberwachten Lernen haben das Potenzial, die Anwendung von Deep-Learning-Algorithmen auf klinische Szenarien auszuweiten.
KW  - Artificial Intelligence
KW  - machine learning
KW  - unsupervised learning
KW  - representation learning
KW  - Künstliche Intelligenz
KW  - maschinelles Lernen
KW  - Repräsentationslernen
KW  - selbstüberwachtes Lernen
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-644089
ER  - 
TY  - JOUR
A1  - Shams, Boshra
A1  - Wang, Ziqian
A1  - Roine, Timo
A1  - Aydogan, Dogu Baran
A1  - Vajkoczy, Peter
A1  - Lippert, Christoph
A1  - Picht, Thomas
A1  - Fekonja, Lucius Samo
T1  - Machine learning-based prediction of motor status in glioma patients using diffusion MRI metrics along the corticospinal tract
JF  - Brain communications
N2  - Shams et al. report that diffusion MRI metrics along the corticospinal tract, used with a support vector machine, accurately predict motor status in glioma patients, reaching an overall accuracy of 77%. They show that these metrics are more effective than demographic and clinical variables.

Along-tract statistics enable white matter characterization using various diffusion MRI metrics. These diffusion models reveal detailed insights into white matter microstructural changes with development, pathology and function. Here, we aim at assessing the clinical utility of diffusion MRI metrics along the corticospinal tract, investigating whether motor glioma patients can be classified with respect to their motor status. We retrospectively included 116 brain tumour patients suffering from either left or right supratentorial, unilateral World Health Organization Grades II, III and IV gliomas with a mean age of 53.51 +/- 16.32 years. Around 37% of patients presented with preoperative motor function deficits according to the Medical Research Council scale. In the group-level comparison, the highest non-overlapping diffusion MRI differences were detected in the superior portion of the tracts' profiles. Fractional anisotropy and fibre density decrease, while apparent diffusion coefficient, axial diffusivity and radial diffusivity increase. To predict motor deficits, we developed a method based on a support vector machine using histogram-based features of diffusion MRI tract profiles (e.g. mean, standard deviation, kurtosis and skewness), following a recursive feature elimination method. Our model achieved high performance (74% sensitivity, 75% specificity, 74% overall accuracy and 77% area under the curve). We found that apparent diffusion coefficient, fractional anisotropy and radial diffusivity contributed more than other features to the model. Incorporating the patient demographics and clinical features such as age, tumour World Health Organization grade, tumour location, gender and resting motor threshold did not affect the model's performance, revealing that these features were not as effective as microstructural measures. These results shed light on the potential patterns of tumour-related microstructural white matter changes in the prediction of functional deficits.
KW  - machine learning
KW  - support vector machine
KW  - tractography
KW  - diffusion MRI
KW  - corticospinal tract
Y1  - 2022
U6  - https://doi.org/10.1093/braincomms/fcac141
SN  - 2632-1297
VL  - 4
IS  - 3
PB  - Oxford University Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Ring, Raphaela M.
A1  - Eisenmann, Clemens
A1  - Kandil, Farid
A1  - Steckhan, Nico
A1  - Demmrich, Sarah
A1  - Klatte, Caroline
A1  - Kessler, Christian S.
A1  - Jeitler, Michael
A1  - Boschmann, Michael
A1  - Michalsen, Andreas
A1  - Blakeslee, Sarah B.
A1  - Stöckigt, Barbara
A1  - Stritter, Wiebke
A1  - Koppold-Liebscher, Daniela A.
T1  - Mental and behavioural responses to Bahá’í fasting: Looking behind the scenes of a religiously motivated intermittent fast using a mixed methods approach
JF  - Nutrients
N2  - Background/Objective: Historically, fasting has been practiced not only for medical but also for religious reasons. Baha'is follow an annual religious intermittent dry fast of 19 days. We inquired into motivation behind and subjective health impacts of Baha'i fasting. Methods: A convergent parallel mixed methods design was embedded in a clinical single arm observational study. Semi-structured individual interviews were conducted before (n = 7), during (n = 8), and after fasting (n = 8). Three months after the fasting period, two focus group interviews were conducted (n = 5/n = 3). A total of 146 Baha'i volunteers answered an online survey at five time points before, during, and after fasting. Results: Fasting was found to play a central role for the religiosity of interviewees, implying changes in daily structures, spending time alone, engaging in religious practices, and experiencing social belonging. Results show an increase in mindfulness and well-being, which were accompanied by behavioural changes and experiences of self-efficacy and inner freedom. Survey scores point to an increase in mindfulness and well-being during fasting, while stress, anxiety, and fatigue decreased. Mindfulness remained elevated even three months after the fast. Conclusion: Baha'i fasting seems to enhance participants' mindfulness and well-being, lowering stress levels and reducing fatigue. Some of these effects lasted more than three months after fasting.
KW  - intermittent food restriction
KW  - mindfulness
KW  - self-efficacy
KW  - well-being
KW  - mixed methods
KW  - health behaviour
KW  - coping ability
KW  - religiously motivated
KW  - dry fasting
Y1  - 2022
U6  - https://doi.org/10.3390/nu14051038
SN  - 2072-6643
VL  - 14
IS  - 5
PB  - MDPI
CY  - Basel
ER  - 
TY  - BOOK
A1  - Kuban, Robert
A1  - Rotta, Randolf
A1  - Nolte, Jörg
A1  - Chromik, Jonas
A1  - Beilharz, Jossekin Jakob
A1  - Pirl, Lukas
A1  - Friedrich, Tobias
A1  - Lenzner, Pascal
A1  - Weyand, Christopher
A1  - Juiz, Carlos
A1  - Bermejo, Belen
A1  - Sauer, Joao
A1  - Coelh, Leandro dos Santos
A1  - Najafi, Pejman
A1  - Pünter, Wenzel
A1  - Cheng, Feng
A1  - Meinel, Christoph
A1  - Sidorova, Julia
A1  - Lundberg, Lars
A1  - Vogel, Thomas
A1  - Tran, Chinh
A1  - Moser, Irene
A1  - Grunske, Lars
A1  - Elsaid, Mohamed Esameldin Mohamed
A1  - Abbas, Hazem M.
A1  - Rula, Anisa
A1  - Sejdiu, Gezim
A1  - Maurino, Andrea
A1  - Schmidt, Christopher
A1  - Hügle, Johannes
A1  - Uflacker, Matthias
A1  - Nozza, Debora
A1  - Messina, Enza
A1  - Hoorn, André van
A1  - Frank, Markus
A1  - Schulz, Henning
A1  - Alhosseini Almodarresi Yasin, Seyed Ali
A1  - Nowicki, Marek
A1  - Muite, Benson K.
A1  - Boysan, Mehmet Can
A1  - Bianchi, Federico
A1  - Cremaschi, Marco
A1  - Moussa, Rim
A1  - Abdel-Karim, Benjamin M.
A1  - Pfeuffer, Nicolas
A1  - Hinz, Oliver
A1  - Plauth, Max
A1  - Polze, Andreas
A1  - Huo, Da
A1  - Melo, Gerard de
A1  - Mendes Soares, Fábio
A1  - Oliveira, Roberto Célio Limão de
A1  - Benson, Lawrence
A1  - Paul, Fabian
A1  - Werling, Christian
A1  - Windheuser, Fabian
A1  - Stojanovic, Dragan
A1  - Djordjevic, Igor
A1  - Stojanovic, Natalija
A1  - Stojnev Ilic, Aleksandra
A1  - Weidmann, Vera
A1  - Lowitzki, Leon
A1  - Wagner, Markus
A1  - Ifa, Abdessatar Ben
A1  - Arlos, Patrik
A1  - Megia, Ana
A1  - Vendrell, Joan
A1  - Pfitzner, Bjarne
A1  - Redondo, Alberto
A1  - Ríos Insua, David
A1  - Albert, Justin Amadeus
A1  - Zhou, Lin
A1  - Arnrich, Bert
A1  - Szabó, Ildikó
A1  - Fodor, Szabina
A1  - Ternai, Katalin
A1  - Bhowmik, Rajarshi
A1  - Campero Durand, Gabriel
A1  - Shevchenko, Pavlo
A1  - Malysheva, Milena
A1  - Prymak, Ivan
A1  - Saake, Gunter
ED  - Meinel, Christoph
ED  - Polze, Andreas
ED  - Beins, Karsten
ED  - Strotmann, Rolf
ED  - Seibold, Ulrich
ED  - Rödszus, Kurt
ED  - Müller, Jürgen
T1  - HPI Future SOC Lab – Proceedings 2019
N2  - The “HPI Future SOC Lab” is a cooperation of the Hasso Plattner Institute (HPI) and industry partners. Its mission is to enable and promote exchange and interaction between the research community and the industry partners.
  The HPI Future SOC Lab provides researchers with free-of-charge access to a complete infrastructure of state-of-the-art hardware and software. This infrastructure includes components that might be too expensive for an ordinary research environment, such as servers with up to 64 cores and 2 TB main memory. The offerings address researchers particularly from, but not limited to, the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and in-memory technologies.
  This technical report presents results of research projects executed in 2019. Selected projects have presented their results on April 9th and November 12th 2019 at the Future SOC Lab Day events.
N2  - Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie.
  Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien. 
  In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2019 vorgestellt.  Ausgewählte Projekte stellten ihre Ergebnisse am 09. April und 12. November 2019 im Rahmen des Future SOC Lab Tags vor.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 158 
KW  - Future SOC Lab
KW  - research projects
KW  - multicore architectures
KW  - in-memory technology
KW  - cloud computing
KW  - machine learning
KW  - artificial intelligence
KW  - Future SOC Lab
KW  - Forschungsprojekte
KW  - Multicore Architekturen
KW  - In-Memory Technologie
KW  - Cloud Computing
KW  - maschinelles Lernen
KW  - künstliche Intelligenz
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-597915
SN  - 978-3-86956-564-4
SN  - 1613-5652
SN  - 2191-1665
IS  - 158
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - THES
A1  - Richly, Keven
T1  - Memory-efficient data management for spatio-temporal applications
BT  - workload-driven fine-grained configuration optimization for storing spatio-temporal data in columnar in-memory databases
N2  - The wide distribution of location-acquisition technologies means that large volumes of spatio-temporal data are continuously being accumulated. Positioning systems such as GPS enable the tracking of various moving objects' trajectories, which are usually represented by a chronologically ordered sequence of observed locations. The analysis of movement patterns based on detailed positional information creates opportunities for applications that can improve business decisions and processes in a broad spectrum of industries (e.g., transportation, traffic control, or medicine). Due to the large data volumes generated in these applications, the cost-efficient storage of spatio-temporal data is desirable, especially when in-memory database systems are used to achieve interactive performance requirements. 

To efficiently utilize the available DRAM capacities, modern database systems support various tuning possibilities to reduce the memory footprint (e.g., data compression) or increase performance (e.g., additional index structures). By considering horizontal data partitioning, we can independently apply different tuning options on a fine-grained level. However, the selection of cost- and performance-balancing configurations is challenging, due to the vast number of possible setups consisting of mutually dependent individual decisions.

In this thesis, we introduce multiple approaches to improve spatio-temporal data management by automatically optimizing diverse tuning options for the application-specific access patterns and data characteristics. Our contributions are as follows:
(1) We introduce a novel approach to determine fine-grained table configurations for spatio-temporal workloads. Our linear programming (LP) approach jointly optimizes the (i) data compression, (ii) ordering, (iii) indexing, and (iv) tiering. We propose different models which address cost dependencies at different levels of accuracy to compute optimized tuning configurations for a given workload, memory budgets, and data characteristics. To yield maintainable and robust configurations, we further extend our LP-based approach to incorporate reconfiguration costs as well as optimizations for multiple potential workload scenarios. 
(2) To optimize the storage layout of timestamps in columnar databases, we present a heuristic approach for the workload-driven combined selection of a data layout and compression scheme. By considering attribute decomposition strategies, we are able to apply application-specific optimizations that reduce the memory footprint and improve performance.   
(3) We introduce an approach that leverages past trajectory data to improve the dispatch processes of transportation network companies. Based on location probabilities, we developed risk-averse dispatch strategies that reduce critical delays.
(4) Finally, we used the use case of a transportation network company to evaluate our database optimizations on a real-world dataset. We demonstrate that workload-driven fine-grained optimizations allow us to reduce the memory footprint (by up to 71% at equal performance) or increase the performance (by up to 90% at equal memory size) compared to established rule-based heuristics.

Individually, our contributions provide novel approaches to the current challenges in spatio-temporal data mining and database research. Combining them allows in-memory databases to store and process spatio-temporal data more cost-efficiently.
N2  - Durch die starke Verbreitung von Systemen zur Positionsbestimmung werden fortlaufend große Mengen an Bewegungsdaten mit einem räumlichen und zeitlichen Bezug gesammelt. Ortungssysteme wie GPS ermöglichen, die Bewegungen verschiedener Objekte (z. B. Personen oder Fahrzeuge) nachzuverfolgen. Diese werden in der Regel durch eine chronologisch geordnete Abfolge beobachteter Aufenthaltsorte repräsentiert. Die Analyse von Bewegungsmustern auf der Grundlage detaillierter Positionsinformationen schafft in unterschiedlichsten Branchen (z. B. Transportwesen, Verkehrssteuerung oder Medizin) die Möglichkeit Geschäftsentscheidungen und -prozesse zu verbessern. Aufgrund der großen Datenmengen, die bei diesen Anwendungen auftreten, stellt die kosteneffiziente Speicherung von Bewegungsdaten eine Herausforderung dar. Dies ist insbesondere der Fall, wenn Hauptspeicherdatenbanken zur Speicherung eingesetzt werden, um die Anforderungen bezüglich interaktiver Antwortzeiten zu erfüllen.

Um die verfügbaren Speicherkapazitäten effizient zu nutzen, unterstützen moderne Datenbanksysteme verschiedene Optimierungsmöglichkeiten, um den Speicherbedarf zu reduzieren (z. B. durch Datenkomprimierung) oder die Performance zu erhöhen (z. B. durch Indexstrukturen). Dabei ermöglicht eine horizontale Partitionierung der Daten, dass unabhängig voneinander verschiedene Optimierungen feingranular auf einzelnen Bereichen der Daten angewendet werden können. Die Auswahl von Konfigurationen, die sowohl die Kosten als auch Leistungsanforderungen berücksichtigen, ist jedoch aufgrund der großen Anzahl möglicher Kombinationen -- die aus voneinander abhängigen Einzelentscheidungen bestehen -- komplex.

In dieser Dissertation präsentieren wir mehrere Ansätze zur Verbesserung der Datenverwaltung, indem wir die Auswahl verschiedener Datenbankoptimierungen automatisch für die anwendungsspezifischen Zugriffsmuster und Dateneigenschaften anpassen. Diesbezüglich leistet die vorliegende Dissertation die folgenden Beiträge: (1) Wir stellen einen neuen Ansatz vor, um feingranulare Tabellenkonfigurationen für räumlich-zeitliche Workloads zu bestimmen. In diesem Zusammenhang optimiert unser Linear Programming (LP) Ansatz gemeinsam (i) die Datenkompression, (ii) die Sortierung, (iii) die Indizierung und (iv) die Datenplatzierung. Hierzu schlagen wir verschiedene Modelle mit unterschiedlichen Kostenabhängigkeiten vor, um optimierte Konfigurationen für einen gegebenen Workload, ein Speicherbudget und die vorliegenden Dateneigenschaften zu berechnen. Durch die Erweiterung des LP-basierten Ansatzes zur Berücksichtigung von Modifikationskosten und verschiedener potentieller Workloads ist es möglich, die Wartbarkeit und Robustheit der bestimmten Tabellenkonfiguration zu erhöhen.
(2) Um die Speicherung von Timestamps in spalten-orientierten Datenbanken zu optimieren, stellen wir einen heuristischen Ansatz für die kombinierte Auswahl eines Speicherlayouts und eines Kompressionsschemas vor. Zudem sind wir durch die Berücksichtigung von Strategien zur Aufteilung von Attributen in der Lage, anwendungsspezifische Optimierungen anzuwenden, die den Speicherbedarf reduzieren und die Performance verbessern.
(3) Wir stellen einen Ansatz vor, der in der Vergangenheit beobachtete Bewegungsmuster nutzt, um die Zuweisungsprozesse von Vermittlungsdiensten zur Personenbeförderung zu verbessern. Auf der Grundlage von Standortwahrscheinlichkeiten haben wir verschiedene Strategien für die Vergabe von Fahraufträgen an Fahrer entwickelt, die kritische Verspätungen reduzieren.
(4) Abschließend haben wir unsere Datenbankoptimierungen anhand eines realen Datensatzes eines Transportdienstleisters evaluiert. In diesem Zusammenhang zeigen wir, dass wir durch feingranulare workload-basierte Optimierungen den Speicherbedarf (um bis zu 71% bei vergleichbarer Performance) reduzieren oder die Performance (um bis zu 90% bei gleichem Speicherverbrauch) im Vergleich zu regelbasierten Heuristiken verbessern können.

Die einzelnen Beiträge stellen neuartige Ansätze für aktuelle Herausforderungen im Bereich des Data Mining und der Datenbankforschung dar. In Kombination ermöglichen sie eine kosteneffizientere Speicherung und Verarbeitung von Bewegungsdaten in Hauptspeicherdatenbanken.
KW  - spatio-temporal data management
KW  - trajectory data
KW  - columnar databases
KW  - in-memory data management
KW  - database tuning
KW  - spaltenorientierte Datenbanken
KW  - Datenbankoptimierung
KW  - Hauptspeicher Datenmanagement
KW  - Datenverwaltung für Daten mit räumlich-zeitlichem Bezug
KW  - Trajektoriendaten
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-635473
ER  - 
TY  - JOUR
A1  - Rosin, Paul L.
A1  - Lai, Yu-Kun
A1  - Mould, David
A1  - Yi, Ran
A1  - Berger, Itamar
A1  - Doyle, Lars
A1  - Lee, Seungyong
A1  - Li, Chuan
A1  - Liu, Yong-Jin
A1  - Semmo, Amir
A1  - Shamir, Ariel
A1  - Son, Minjung
A1  - Winnemöller, Holger
T1  - NPRportrait 1.0: A three-level benchmark for non-photorealistic rendering of portraits
JF  - Computational visual media
N2  - Recently, there has been an upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer (NST). However, the state of performance evaluation in this field is poor, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual, and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three-level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and NST) using the new benchmark dataset.
KW  - non-photorealistic rendering (NPR)
KW  - image stylization
KW  - style transfer
KW  - portrait
KW  - evaluation
KW  - benchmark
Y1  - 2022
U6  - https://doi.org/10.1007/s41095-021-0255-3
SN  - 2096-0433
SN  - 2096-0662
VL  - 8
IS  - 3
SP  - 445
EP  - 465
PB  - Springer Nature
CY  - London
ER  - 
TY  - JOUR
A1  - Vitagliano, Gerardo
A1  - Hameed, Mazhar
A1  - Jiang, Lan
A1  - Reisener, Lucas
A1  - Wu, Eugene
A1  - Naumann, Felix
T1  - Pollock: a data loading benchmark
JF  - Proceedings of the VLDB Endowment
N2  - Any system at play in a data-driven project has a fundamental requirement: the ability to load data. The de-facto standard format to distribute and consume raw data is CSV. Yet the plain-text and flexible nature of this format often makes such files difficult to parse and their content difficult to load correctly, requiring cumbersome data preparation steps. We propose a benchmark to assess the robustness of systems in loading data from non-standard CSV formats and with structural inconsistencies. First, we formalize a model to describe the issues that affect real-world files and use it to derive a systematic "pollution" process to generate dialects for any given grammar. Our benchmark leverages the pollution framework for the CSV format. To guide pollution, we have surveyed thousands of real-world, publicly available CSV files, recording the problems we encountered. We demonstrate the applicability of our benchmark by testing and scoring 16 different systems: popular CSV parsing frameworks, relational database tools, spreadsheet systems, and a data visualization tool.
Y1  - 2023
U6  - https://doi.org/10.14778/3594512.3594518
SN  - 2150-8097
VL  - 16
IS  - 8
SP  - 1870
EP  - 1882
PB  - Association for Computing Machinery
CY  - New York
ER  - 
TY  - JOUR
A1  - Wiemker, Veronika
A1  - Bunova, Anna
A1  - Neufeld, Maria
A1  - Gornyi, Boris
A1  - Yurasova, Elena
A1  - Konigorski, Stefan
A1  - Kalinina, Anna
A1  - Kontsevaya, Anna
A1  - Ferreira-Borges, Carina
A1  - Probst, Charlotte
T1  - Pilot study to evaluate usability and acceptability of the 'Animated Alcohol Assessment Tool' in Russian primary healthcare
JF  - Digital health
N2  - Background and aims: Accurate and user-friendly assessment tools quantifying alcohol consumption are a prerequisite to effective prevention and treatment programmes, including Screening and Brief Intervention. Digital tools offer new potential in this field. We developed the ‘Animated Alcohol Assessment Tool’ (AAA-Tool), a mobile app providing an interactive version of the World Health Organization's Alcohol Use Disorders Identification Test (AUDIT) that facilitates the description of individual alcohol consumption via culturally informed animation features. This pilot study evaluated the Russia-specific version of the Animated Alcohol Assessment Tool with regard to (1) its usability and acceptability in a primary healthcare setting, (2) the plausibility of its alcohol consumption assessment results and (3) the adequacy of its Russia-specific vessel and beverage selection. Methods: Convenience samples of 55 patients (47% female) and 15 healthcare practitioners (80% female) in 2 Russian primary healthcare facilities self-administered the Animated Alcohol Assessment Tool and rated their experience on the Mobile Application Rating Scale – User Version. Usage data was automatically collected during app usage, and additional feedback on regional content was elicited in semi-structured interviews. Results: On average, patients completed the Animated Alcohol Assessment Tool in 6:38 min (SD = 2.49, range = 3.00–17.16). User satisfaction was good, with all subscale Mobile Application Rating Scale – User Version scores averaging >3 out of 5 points. A majority of patients (53%) and practitioners (93%) would recommend the tool to ‘many people’ or ‘everyone’. Assessed alcohol consumption was plausible, with a low number (14%) of logically impossible entries. Most patients reported the Animated Alcohol Assessment Tool to reflect all vessels (78%) and all beverages (71%) they typically used. 
Conclusion: High acceptability ratings by patients and healthcare practitioners, acceptable completion time, plausible alcohol usage assessment results and perceived adequacy of region-specific content underline the Animated Alcohol Assessment Tool's potential to provide a novel approach to alcohol assessment in primary healthcare. After its validation, the Animated Alcohol Assessment Tool might contribute to reducing alcohol-related harm by facilitating Screening and Brief Intervention implementation in Russia and beyond.
KW  - Alcohol use assessment
KW  - Alcohol Use Disorders Identification Test
KW  - screening tools
KW  - digital health
KW  - mobile applications
KW  - Russia
KW  - primary healthcare
KW  - usability
KW  - acceptability
Y1  - 2022
U6  - https://doi.org/10.1177/20552076211074491
SN  - 2055-2076
VL  - 8
PB  - Sage Publications
CY  - London
ER  - 
TY  - JOUR
A1  - Fehr, Jana
A1  - Piccininni, Marco
A1  - Kurth, Tobias
A1  - Konigorski, Stefan
T1  - Assessing the transportability of clinical prediction models for cognitive impairment using causal models
JF  - BMC medical research methodology
N2  - Background
Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability by calibration and discrimination of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics.

Methods
We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We measured transportability to external settings under guided interventions on age, APOE ε4, and tau-protein, using performance differences between internal and external settings measured by calibration metrics and area under the receiver operating curve (AUC).

Results
Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences indicated inconsistent trends of transportability between the different external settings. Models predicting with consequences tended to show higher AUC in the external settings compared to internal settings, while models predicting with parents or all variables showed similar AUC.

Conclusions
We demonstrated with a practical prediction task example that predicting with causes of the outcome results in better transportability compared to anti-causal predictions when considering calibration differences. We conclude that calibration performance is crucial when assessing model transportability to external settings.
KW  - Alzheimer's Disease
KW  - Clinical risk prediction
KW  - DAG
KW  - Causality
KW  - Transportability
Y1  - 2023
U6  - https://doi.org/10.1186/s12874-023-02003-6
SN  - 1471-2288
VL  - 23
IS  - 1
PB  - BMC
CY  - London
ER  - 
TY  - JOUR
A1  - Garrels, Tim
A1  - Khodabakhsh, Athar
A1  - Renard, Bernhard Y.
A1  - Baum, Katharina
T1  - LazyFox: fast and parallelized overlapping community detection in large graphs
JF  - PEERJ Computer Science
N2  - The detection of communities in graph datasets provides insight about a graph's underlying structure and is an important tool for various domains such as social sciences, marketing, traffic forecast, and drug discovery. While most existing algorithms provide fast approaches for community detection, their results usually contain strictly separated communities. However, most datasets would semantically allow for or even require overlapping communities that can only be determined at much higher computational cost. We build on an efficient algorithm, FOX, that detects such overlapping communities. FOX measures the closeness of a node to a community by approximating the count of triangles which that node forms with that community. We propose LAZYFOX, a multi-threaded adaptation of the FOX algorithm, which provides even faster detection without an impact on community quality. This allows for the analyses of significantly larger and more complex datasets. LAZYFOX enables overlapping community detection on complex graph datasets with millions of nodes and billions of edges in days instead of weeks. As part of this work, LAZYFOX's implementation was published and is available as a tool under an MIT licence at https://github.com/TimGarrels/LazyFox.
KW  - Overlapping community detection
KW  - Large networks
KW  - Weighted clustering coefficient
KW  - Heuristic triangle estimation
KW  - Parallelized algorithm
KW  - C++ tool
KW  - Runtime improvement
KW  - Open source
KW  - Graph algorithm
KW  - Community analysis
Y1  - 2023
U6  - https://doi.org/10.7717/peerj-cs.1291
SN  - 2376-5992
VL  - 9
PB  - PeerJ Inc.
CY  - London
ER  - 
TY  - JOUR
A1  - Kappattanavar, Arpita Mallikarjuna
A1  - Hecker, Pascal
A1  - Moontaha, Sidratul
A1  - Steckhan, Nico
A1  - Arnrich, Bert
T1  - Food choices after cognitive load
BT  - an affective computing approach
JF  - Sensors
N2  - Psychology and nutritional science research has highlighted the impact of negative emotions and cognitive load on calorie consumption behaviour using subjective questionnaires. Isolated studies in other domains objectively assess cognitive load without considering its effects on eating behaviour. This study aims to explore the potential for developing an integrated eating behaviour assistant system that incorporates cognitive load factors. Two experimental sessions were conducted using custom-developed experimentation software to induce different stimuli. During these sessions, we collected 30 h of physiological, food consumption, and affective states questionnaires data to automatically detect cognitive load and analyse its effect on food choice. Utilising grid search optimisation and leave-one-subject-out cross-validation, a support vector machine model achieved a mean classification accuracy of 85.12% for the two cognitive load tasks using eight relevant features. Statistical analysis was performed on calorie consumption and questionnaire data. Furthermore, 75% of the subjects with higher negative affect significantly increased consumption of specific foods after high-cognitive-load tasks. These findings offer insights into the intricate relationship between cognitive load, affective states, and food choice, paving the way for an eating behaviour assistant system to manage food choices during cognitive load. Future research should enhance system capabilities and explore real-world applications.
KW  - cognitive load
KW  - eating behaviour
KW  - machine learning
KW  - physiological signals
KW  - photoplethysmography
KW  - electrodermal activity
KW  - sensors
Y1  - 2023
U6  - https://doi.org/10.3390/s23146597
SN  - 1424-8220
VL  - 23
IS  - 14
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Cohen, Sarel
A1  - Hershcovitch, Moshik
A1  - Taraz, Martin
A1  - Kissig, Otto
A1  - Issac, Davis
A1  - Wood, Andrew
A1  - Waddington, Daniel
A1  - Chin, Peter
A1  - Friedrich, Tobias
T1  - Improved and optimized drug repurposing for the SARS-CoV-2 pandemic
JF  - PLoS one
N2  - The active global SARS-CoV-2 pandemic caused more than 426 million cases and 5.8 million deaths worldwide. The development of completely new drugs for such a novel disease is a challenging, time-intensive process. Despite researchers around the world working on this task, no effective treatments have been developed yet. This emphasizes the importance of drug repurposing, where treatments are found among existing drugs that are meant for different diseases. A common approach to this is based on knowledge graphs that condense relationships between entities like drugs, diseases and genes. Graph neural networks (GNNs) can then be used for the task at hand by predicting links in such knowledge graphs. Expanding on state-of-the-art GNN research, Doshi et al. recently developed the Dr-COVID model. We further extend their work using additional output interpretation strategies. The best aggregation strategy derives a top-100 ranking of 8,070 candidate drugs, 32 of which are currently being tested in COVID-19-related clinical trials. Moreover, we present an alternative application for the model, the generation of additional candidates based on a given pre-selection of drug candidates using collaborative filtering. In addition, we improved the implementation of the Dr-COVID model by significantly shortening the inference and pre-processing time by exploiting data-parallelism. As drug repurposing is a task that requires high computation and memory resources, we further accelerate the post-processing phase using new emerging hardware: we propose a new approach to leverage high-capacity Non-Volatile Memory for aggregate drug ranking.
Y1  - 2023
U6  - https://doi.org/10.1371/journal.pone.0266572
SN  - 1932-6203
VL  - 18
IS  - 3
PB  - PLoS
CY  - San Francisco
ER  - 
TY  - JOUR
A1  - Piro, Vitor C.
A1  - Renard, Bernhard Y.
T1  - Contamination detection and microbiome exploration with GRIMER
JF  - GigaScience
N2  - Background:
Contamination detection is an important step that should be carefully considered in early stages when designing and performing microbiome studies to avoid biased outcomes. Detecting and removing true contaminants is challenging, especially in low-biomass samples or in studies lacking proper controls. Interactive visualizations and analysis platforms are crucial to better guide this step and to help identify and detect noisy patterns that could potentially be contamination. Additionally, external evidence, like aggregation of several contamination detection methods and the use of common contaminants reported in the literature, could help to discover and mitigate contamination.

Results: 
We propose GRIMER, a tool that performs automated analyses and generates a portable and interactive dashboard integrating annotation, taxonomy, and metadata. It unifies several sources of evidence to help detect contamination. GRIMER is independent of quantification methods and directly analyzes contingency tables to create an interactive and offline report. Reports can be created in seconds and are accessible for nonspecialists, providing an intuitive set of charts to explore data distribution among observations and samples and its connections with external sources. Further, we compiled and used an extensive list of possible external contaminant taxa and common contaminants with 210 genera and 627 species reported in 22 published articles. 

Conclusion:
GRIMER enables visual data exploration and analysis, supporting contamination detection in microbiome studies. The tool and data presented are open source and available at https://gitlab.com/dacs-hpi/grimer.
KW  - Contamination
KW  - Microbiome
KW  - Visualization
KW  - Taxonomy
Y1  - 2023
U6  - https://doi.org/10.1093/gigascience/giad017
SN  - 2047-217X
VL  - 12
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Gärtner, Thomas
A1  - Schneider, Juliana
A1  - Arnrich, Bert
A1  - Konigorski, Stefan
T1  - Comparison of Bayesian Networks, G-estimation and linear models to estimate causal treatment effects in aggregated N-of-1 trials with carry-over effects
JF  - BMC Medical Research Methodology
N2  - Background
The aggregation of a series of N-of-1 trials presents an innovative and efficient study design, as an alternative to traditional randomized clinical trials. Challenges for the statistical analysis arise when there is carry-over or complex dependencies of the treatment effect of interest.

Methods
In this study, we evaluate and compare methods for the analysis of aggregated N-of-1 trials in different scenarios with carry-over and complex dependencies of treatment effects on covariates. For this, we simulate data of a series of N-of-1 trials for Chronic Nonspecific Low Back Pain based on assumed causal relationships parameterized by directed acyclic graphs. In addition to existing statistical methods such as regression models, Bayesian Networks, and G-estimation, we introduce a carry-over adjusted parametric model (COAPM).

Results
The results show that all evaluated existing models perform well when there is no carry-over and no treatment dependence. When there is carry-over, COAPM yields unbiased and more efficient estimates while all other methods show some bias in the estimation. When there is known treatment dependence, all approaches that are capable of modeling it yield unbiased estimates. Finally, the efficiency of all methods decreases slightly when there are missing values, and the bias in the estimates can also increase.

Conclusions
This study presents a systematic evaluation of existing and novel approaches for the statistical analysis of a series of N-of-1 trials. We derive practical recommendations on which methods may be best in which scenarios.
KW  - N-of-1 trials
KW  - Randomized clinical trials
KW  - Bayesian Networks
KW  - G-estimation
KW  - Linear model
KW  - Simulation study
KW  - Chronic Nonspecific Low Back Pain
Y1  - 2023
U6  - https://doi.org/10.1186/s12874-023-02012-5
SN  - 1471-2288
VL  - 23
IS  - 1
PB  - BMC
CY  - London
ER  - 
TY  - JOUR
A1  - Lewkowicz, Daniel
A1  - Böttinger, Erwin
A1  - Siegel, Martin
T1  - Economic evaluation of digital therapeutic care apps for unsupervised treatment of low back pain
BT  - Monte Carlo Simulation
JF  - JMIR mhealth and uhealth
N2  - Background: 
Digital therapeutic care (DTC) programs are unsupervised app-based treatments that provide video exercises and educational material to patients with nonspecific low back pain during episodes of pain and functional disability. German statutory health insurance has been able to reimburse DTC programs since 2019, but evidence on efficacy and reasonable pricing remains scarce. This paper presents a probabilistic sensitivity analysis (PSA) to evaluate the efficacy and cost-utility of a DTC app against treatment as usual (TAU) in Germany.

Objective: 
The aim of this study was to perform a PSA in the form of a Monte Carlo simulation based on the deterministic base case analysis to account for model assumptions and parameter uncertainty. We also intend to explore to what extent the results in this probabilistic analysis differ from the results in the base case analysis and to what extent a shortage of outcome data concerning quality-of-life (QoL) metrics impacts the overall results. 

Methods: 
The PSA builds upon a state-transition Markov chain with a 4-week cycle length over a model time horizon of 3 years from a recently published deterministic cost-utility analysis. A Monte Carlo simulation with 10,000 iterations and a cohort size of 10,000 was employed to evaluate the cost-utility from a societal perspective. Quality-adjusted life years (QALYs) were derived from Veterans RAND 6-Dimension (VR-6D) and Short-Form 6-Dimension (SF-6D) single utility scores. Finally, we also simulated reducing the price for a 3-month app prescription to analyze at which price threshold DTC would result in being the dominant strategy over TAU in Germany. 

Results: 
The Monte Carlo simulation yielded on average an incremental cost of €135.97 (at a currency exchange rate of €1 = US $1.069) and 0.004 incremental QALYs per person and year for the unsupervised DTC app strategy compared to in-person physiotherapy in Germany. The corresponding incremental cost-utility ratio (ICUR) amounts to an additional €34,315.19 per additional QALY. DTC yielded more QALYs in 54.96% of the iterations. DTC dominates TAU in 24.04% of the iterations for QALYs. Reducing the app price in the simulation from the current €239.96 to €164.61 for a 3-month prescription could yield a negative ICUR and thus make DTC the dominant strategy, even though the estimated probability of DTC being more effective than TAU is only 54.96%.

Conclusions:
Decision-makers should be cautious when considering the reimbursement of DTC apps since no significant treatment effect was found, and the probability of cost-effectiveness remains below 60% even for an infinite willingness-to-pay threshold. More app-based studies involving the utilization of QoL outcome parameters are urgently needed to account for the low and limited precision of the available QoL input parameters, which are crucial to making profound recommendations concerning the cost-utility of novel apps.
KW  - cost-utility analysis
KW  - cost
KW  - probabilistic sensitivity analysis
KW  - Monte Carlo simulation
KW  - low back pain
KW  - pain
KW  - economic
KW  - cost-effectiveness
KW  - Markov model
KW  - digital therapy
KW  - digital health app
KW  - mHealth
KW  - mobile health
KW  - health app
KW  - mobile app
KW  - orthopedic
KW  - QALY
KW  - DALY
KW  - quality-adjusted life years
KW  - disability-adjusted life years
KW  - time horizon
KW  - veteran
KW  - statistics
Y1  - 2023
U6  - https://doi.org/10.2196/44585
SN  - 2291-5222
VL  - 11
PB  - JMIR Publications
CY  - Toronto
ER  - 
TY  - JOUR
A1  - Moontaha, Sidratul
A1  - Schumann, Franziska Elisabeth Friederike
A1  - Arnrich, Bert
T1  - Online learning for wearable EEG-Based emotion classification
JF  - Sensors
N2  - Giving emotional intelligence to machines can facilitate the early detection and prediction of mental diseases and symptoms. Electroencephalography (EEG)-based emotion recognition is widely applied because it measures electrical correlates directly from the brain rather than indirectly measuring other physiological responses initiated by the brain. Therefore, we used non-invasive and portable EEG sensors to develop a real-time emotion classification pipeline. The pipeline trains different binary classifiers for Valence and Arousal dimensions from an incoming EEG data stream, achieving a 23.9% (Arousal) and 25.8% (Valence) higher F1-Score on the state-of-the-art AMIGOS dataset than previous work. Afterward, the pipeline was applied to the curated dataset from 15 participants using two consumer-grade EEG devices while watching 16 short emotional videos in a controlled environment. Mean F1-Scores of 87% (Arousal) and 82% (Valence) were achieved for an immediate label setting. Additionally, the pipeline proved to be fast enough to achieve predictions in real-time in a live scenario with delayed labels while continuously being updated. The significant discrepancy from the readily available labels on the classification scores leads to future work to include more data. Thereafter, the pipeline is ready to be used for real-time applications of emotion classification.
KW  - online learning
KW  - real-time
KW  - emotion classification
KW  - AMIGOS dataset
KW  - wearable EEG (muse and neurosity crown)
KW  - psychopy experiments
Y1  - 2023
U6  - https://doi.org/10.3390/s23052387
SN  - 1424-8220
VL  - 23
IS  - 5
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Kirchler, Matthias
A1  - Konigorski, Stefan
A1  - Norden, Matthias
A1  - Meltendorf, Christian
A1  - Kloft, Marius
A1  - Schurmann, Claudia
A1  - Lippert, Christoph
T1  - transferGWAS
BT  - GWAS of images using deep transfer learning
JF  - Bioinformatics
N2  - Motivation: 
Medical images can provide rich information about diseases and their biology. However, investigating their association with genetic variation requires non-standard methods. We propose transferGWAS, a novel approach to perform genome-wide association studies directly on full medical images. First, we learn semantically meaningful representations of the images based on a transfer learning task, during which a deep neural network is trained on independent but similar data. Then, we perform genetic association tests with these representations. 

Results: 
We validate the type I error rates and power of transferGWAS in simulation studies of synthetic images. Then we apply transferGWAS in a genome-wide association study of retinal fundus images from the UK Biobank. This first-of-a-kind GWAS of full imaging data yielded 60 genomic regions associated with retinal fundus images, of which 7 are novel candidate loci for eye-related traits and diseases.
Y1  - 2022
U6  - https://doi.org/10.1093/bioinformatics/btac369
SN  - 1367-4803
SN  - 1460-2059
VL  - 38
IS  - 14
SP  - 3621
EP  - 3628
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - THES
A1  - Lorson, Annalena
T1  - Understanding early stage evolution of digital innovation units in manufacturing companies
T1  - Verständnis der frühphasigen Entwicklung digitaler Innovationseinheiten in Fertigungsunternehmen
N2  - The dynamic landscape of digital transformation entails an impact on industrial-age manufacturing companies that goes beyond product offerings, changing operational paradigms, and requiring an organization-wide metamorphosis. An initiative to address the given challenges is the creation of Digital Innovation Units (DIUs) – departments or distinct legal entities that use new structures and practices to develop digital products, services, and business models and support or drive incumbents’ digital transformation. With more than 300 units in German-speaking countries alone and an increasing number of scientific publications, DIUs have become a widespread phenomenon in both research and practice.

This dissertation examines the evolution process of DIUs in the manufacturing industry during their first three years of operation, through an extensive longitudinal single-case study and several cross-case syntheses of seven DIUs. Building on the lenses of organizational change and development, time, and socio-technical systems, this research provides insights into the fundamentals, temporal dynamics, socio-technical interactions, and relational dynamics of a DIU’s evolution process. Thus, the dissertation promotes a dynamic understanding of DIUs and adds a two-dimensional perspective to the often one-dimensional view of these units and their interactions with the main organization throughout the startup and growth phases of a DIU.

Furthermore, the dissertation constructs a phase model that depicts the early stages of DIU evolution based on these findings and by incorporating literature from information systems research. As a result, it illustrates the progressive intensification of collaboration between the DIU and the main organization. After being implemented, the DIU sparks initial collaboration and instigates change within (parts of) the main organization. Over time, it adapts to the corporate environment to some extent, responding to changing circumstances in order to contribute to long-term transformation. Temporally, the DIU drives the early phases of cooperation and adaptation in particular, while the main organization triggers the first major evolutionary step and realignment of the DIU.

Overall, the thesis identifies DIUs as malleable organizational structures that are crucial for digital transformation. Moreover, it provides guidance for practitioners on the process of building a new DIU from scratch or optimizing an existing one.
N2  - Die digitale Transformation produzierender Unternehmen geht über die bloße Veränderung des Produktangebots hinaus; sie durchdringt operative Paradigmen und erfordert eine umfassende, unternehmensweite Metamorphose. Eine Initiative, den damit verbundenen Herausforderungen zu begegnen, ist der Aufbau einer Digital Innovation Unit (DIU) (zu deutsch: digitale Innovationseinheit) – eine Abteilung oder separate rechtliche Einheit, die neue organisationale Strukturen und Arbeitspraktiken nutzt, um digitale Produkte, Dienstleistungen und Geschäftsmodelle zu entwickeln und die digitale Transformation von etablierten Unternehmen zu unterstützen oder voranzutreiben. Mit mehr als 300 Einheiten allein im deutschsprachigen Raum und einer wachsenden Zahl wissenschaftlicher Publikationen sind DIUs sowohl in der Forschung als auch in der Praxis ein weit verbreitetes Phänomen.

Auf Basis einer umfassenden Längsschnittstudie und mehrerer Querschnittsanalysen von sieben Fertigungsunternehmen und ihren DIUs untersucht diese Dissertation den Entwicklungsprozess von DIUs in den ersten drei Betriebsjahren. Gestützt auf theoretische Perspektiven zu organisatorischem Wandel, Zeit und sozio-technischen Systemen bietet sie Einblicke in die Grundlagen, die zeitlichen Dynamiken, die sozio-technischen Interaktionen und die Beziehungsdynamiken des Entwicklungsprozesses von DIUs. Die Dissertation erweitert somit das dynamische Verständnis von DIUs und fügt der oft eindimensionalen Sichtweise auf diese Einheiten und ihre Interaktionen mit der Hauptorganisation eine zweidimensionale Perspektive entlang der Gründungs- und Wachstumsphasen einer DIU hinzu.

Darüber hinaus konstruiert die Dissertation ein Phasenmodell, das die frühen Phasen der DIU-Entwicklung auf der Grundlage dieser Erkenntnisse und unter Einbeziehung von Literatur aus der Wirtschaftsinformatikforschung abbildet. Es veranschaulicht die schrittweise Intensivierung der Zusammenarbeit zwischen der DIU und der Hauptorganisation. Nach ihrer Implementierung initiiert die DIU die anfängliche Zusammenarbeit und stößt Veränderungen innerhalb (von Teilen) der Hauptorganisation an. Im Laufe der Zeit passt sich die DIU bis zu einem gewissen Grad dem Unternehmensumfeld an und reagiert auf sich verändernde Umstände, um zu einer langfristigen Veränderung beizutragen. Zeitlich gesehen treibt die DIU vor allem die frühen Phasen der Zusammenarbeit und Anpassung voran, während die Hauptorganisation den ersten großen Entwicklungsschritt und die Neuausrichtung der DIU auslöst.

Insgesamt identifiziert die Dissertation DIUs als anpassungsfähige Organisationsstrukturen, die für die digitale Transformation entscheidend sind. Darüber hinaus bietet sie Praktikern einen Leitfaden für den Aufbau einer neuen oder die Optimierung einer bestehenden DIU.
KW  - digital transformation
KW  - digital innovation units
KW  - evolution of digital innovation units
KW  - manufacturing companies
KW  - digitale Transformation
KW  - digitale Innovationseinheit
KW  - Entwicklung digitaler Innovationseinheiten
KW  - Fertigungsunternehmen
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-639141
ER  - 
TY  - JOUR
A1  - Cseh, Ágnes
A1  - Faenza, Yuri
A1  - Kavitha, Telikepalli
A1  - Powers, Vladlena
T1  - Understanding popular matchings via stable matchings
JF  - SIAM journal on discrete mathematics
N2  - An instance of the marriage problem is given by a graph G = (A ∪ B, E), together with, for each vertex of G, a strict preference order over its neighbors. A matching M of G is popular in the marriage instance if M does not lose a head-to-head election against any matching where vertices are voters. Every stable matching is a min-size popular matching; another subclass of popular matchings that always exists and can be easily computed is the set of dominant matchings. A popular matching M is dominant if M wins the head-to-head election against any larger matching. Thus, every dominant matching is a max-size popular matching, and it is known that the set of dominant matchings is the linear image of the set of stable matchings in an auxiliary graph. Results from the literature seem to suggest that stable and dominant matchings behave, from a complexity theory point of view, in a very similar manner within the class of popular matchings. The goal of this paper is to show that there are instead differences in the tractability of stable and dominant matchings and to investigate further their importance for popular matchings. First, we show that it is easy to check if all popular matchings are also stable; however, it is co-NP hard to check if all popular matchings are also dominant. Second, we show how some new and recent hardness results on popular matching problems can be deduced from the NP-hardness of certain problems on stable matchings, also studied in this paper, thus showing that stable matchings can be employed to show not only positive results on popular matchings (as is known) but also most negative ones. Problems for which we show new hardness results include finding a min-size (resp., max-size) popular matching that is not stable (resp., dominant). A known result for which we give a new and simple proof is the NP-hardness of finding a popular matching when G is nonbipartite.
KW  - popular matching
KW  - stable matching
KW  - complexity
KW  - dominant matching
Y1  - 2022
U6  - https://doi.org/10.1137/19M124770X
SN  - 0895-4801
SN  - 1095-7146
VL  - 36
IS  - 1
SP  - 188
EP  - 213
PB  - Society for Industrial and Applied Mathematics
CY  - Philadelphia
ER  - 
TY  - THES
A1  - Huegle, Johannes
T1  - Causal discovery in practice: Non-parametric conditional independence testing and tooling for causal discovery
T1  - Kausale Entdeckung in der Praxis: Nichtparametrische bedingte Unabhängigkeitstests und Werkzeuge für die Kausalentdeckung
N2  - Knowledge about causal structures is crucial for decision support in various domains. For example, in discrete manufacturing, identifying the root causes of failures and quality deviations that interrupt the highly automated production process requires causal structural knowledge. However, in practice, root cause analysis is usually built upon individual expert knowledge about associative relationships. But, "correlation does not imply causation", and misinterpreting associations often leads to incorrect conclusions. Recent developments in methods for causal discovery from observational data have opened the opportunity for a data-driven examination. Despite its potential for data-driven decision support, omnipresent challenges impede causal discovery in real-world scenarios. In this thesis, we make a threefold contribution to improving causal discovery in practice.

(1) The growing interest in causal discovery has led to a broad spectrum of methods with specific assumptions on the data and various implementations. Hence, application in practice requires careful consideration of existing methods, which becomes laborious when dealing with various parameters, assumptions, and implementations in different programming languages. Additionally, evaluation is challenging due to the lack of ground truth in practice and limited benchmark data that reflect real-world data characteristics.
To address these issues, we present a platform-independent modular pipeline for causal discovery and a ground truth framework for synthetic data generation that provides comprehensive evaluation opportunities, e.g., to examine the accuracy of causal discovery methods in case of inappropriate assumptions.

(2) Applying constraint-based methods for causal discovery requires selecting a conditional independence (CI) test, which is particularly challenging in mixed discrete-continuous data omnipresent in many real-world scenarios. In this context, inappropriate assumptions on the data or the commonly applied discretization of continuous variables reduce the accuracy of CI decisions, leading to incorrect causal structures. 
Therefore, we contribute a non-parametric CI test leveraging k-nearest neighbors methods and prove its statistical validity and power in mixed discrete-continuous data, as well as the asymptotic consistency when used in constraint-based causal discovery. An extensive evaluation of synthetic and real-world data shows that the proposed CI test outperforms state-of-the-art approaches in the accuracy of CI testing and causal discovery, particularly in settings with low sample sizes. 

(3) To show the applicability and opportunities of causal discovery in practice, we examine our contributions in real-world discrete manufacturing use cases. For example, we showcase how causal structural knowledge helps to understand unforeseen production downtimes or adds decision support in case of failures and quality deviations in automotive body shop assembly lines.
N2  - Kenntnisse über die Strukturen zugrundeliegender kausaler Mechanismen sind eine Voraussetzung für die Entscheidungsunterstützung in verschiedenen Bereichen. In der Fertigungsindustrie beispielsweise erfordert die Fehler-Ursachen-Analyse von Störungen und Qualitätsabweichungen, die den hochautomatisierten Produktionsprozess unterbrechen, kausales Strukturwissen. In Praxis stützt sich die Fehler-Ursachen-Analyse in der Regel jedoch auf individuellem Expertenwissen über assoziative Zusammenhänge. Aber "Korrelation impliziert nicht Kausalität", und die Fehlinterpretation assoziativer Zusammenhänge führt häufig zu falschen Schlussfolgerungen. Neueste Entwicklungen von Methoden des kausalen Strukturlernens haben die Möglichkeit einer datenbasierten Betrachtung eröffnet. Trotz seines Potenzials zur datenbasierten Entscheidungsunterstützung wird das kausale Strukturlernen in der Praxis jedoch durch allgegenwärtige Herausforderungen erschwert. In dieser Dissertation leisten wir einen dreifachen Beitrag zur Verbesserung des kausalen Strukturlernens in der Praxis.

(1) Das wachsende Interesse an kausalem Strukturlernen hat zu einer Vielzahl von Methoden mit spezifischen statistischen Annahmen über die Daten und verschiedenen Implementierungen geführt. Daher erfordert die Anwendung in der Praxis eine sorgfältige Prüfung der vorhandenen Methoden, was eine Herausforderung darstellt, wenn verschiedene Parameter, Annahmen und Implementierungen in unterschiedlichen Programmiersprachen betrachtet werden. Hierbei wird die Evaluierung von Methoden des kausalen Strukturlernens zusätzlich durch das Fehlen von "Ground Truth" in der Praxis und begrenzten Benchmark-Daten, welche die Eigenschaften realer Datencharakteristiken widerspiegeln, erschwert.
Um diese Probleme zu adressieren, stellen wir eine plattformunabhängige modulare Pipeline für kausales Strukturlernen und ein Tool zur Generierung synthetischer Daten vor, die umfassende Evaluierungsmöglichkeiten bieten, z.B. um Ungenauigkeiten von Methoden des Lernens kausaler Strukturen bei falschen Annahmen an die Daten aufzuzeigen.

(2) Die Anwendung von constraint-basierten Methoden des kausalen Strukturlernens erfordert die Wahl eines bedingten Unabhängigkeitstests (CI-Test), was insbesondere bei gemischten diskreten und kontinuierlichen Daten, die in vielen realen Szenarien allgegenwärtig sind, die Anwendung erschwert. Beispielsweise führen falsche Annahmen der CI-Tests oder die Diskretisierung kontinuierlicher Variablen zu einer Verschlechterung der Korrektheit der Testentscheidungen, was in fehlerhaften kausalen Strukturen resultiert. 
Um diese Probleme zu adressieren, stellen wir einen nicht-parametrischen CI-Test vor, der auf Nächste-Nachbar-Methoden basiert, und beweisen dessen statistische Validität und Trennschärfe bei gemischten diskreten und kontinuierlichen Daten, sowie dessen asymptotische Konsistenz in constraint-basiertem kausalem Strukturlernen. Eine umfangreiche Evaluation auf synthetischen und realen Daten zeigt, dass der vorgeschlagene CI-Test bestehende Verfahren hinsichtlich der Korrektheit der Testentscheidung und gelernter kausaler Strukturen übertrifft, insbesondere bei geringen Stichprobengrößen. 

(3) Um die Anwendbarkeit und Möglichkeiten kausalen Strukturlernens in der Praxis aufzuzeigen, untersuchen wir unsere Beiträge in realen Anwendungsfällen aus der Fertigungsindustrie. Wir zeigen an mehreren Beispielen aus der automobilen Karosseriefertigungen wie kausales Strukturwissen helfen kann, unvorhergesehene Produktionsausfälle zu verstehen oder eine Entscheidungsunterstützung bei Störungen und Qualitätsabweichungen zu geben.
KW  - causal discovery
KW  - causal structure learning
KW  - causal AI
KW  - non-parametric conditional independence testing
KW  - manufacturing
KW  - causal reasoning
KW  - mixed data
KW  - kausale KI
KW  - kausale Entdeckung
KW  - kausale Schlussfolgerung
KW  - kausales Strukturlernen
KW  - Fertigung
KW  - gemischte Daten
KW  - nicht-parametrische bedingte Unabhängigkeitstests
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-635820
ER  - 
TY  - JOUR
A1  - Casel, Katrin
A1  - Fernau, Henning
A1  - Ghadikolaei, Mehdi Khosravian
A1  - Monnot, Jerome
A1  - Sikora, Florian
T1  - On the complexity of solution extension of optimization problems
JF  - Theoretical computer science : the journal of the EATCS
N2  - The question of whether a given partial solution to a problem can be extended reasonably occurs in many algorithmic approaches for optimization problems. 
For instance, when enumerating minimal vertex covers of a graph G = (V, E), one usually arrives at the problem of deciding, for a vertex set U ⊆ V (pre-solution), whether there exists a minimal vertex cover S (i.e., a vertex cover S ⊆ V such that no proper subset of S is a vertex cover) with U ⊆ S (minimal extension of U). 
We propose a general, partial-order based formulation of such extension problems which allows modeling parameterization and approximation aspects of extension, and also highlights relationships between extension tasks for different specific problems. 
As examples, we study a number of specific problems which can be expressed and related in this framework. In particular, we discuss extension variants of the problems dominating set and feedback vertex/edge set. 
All these problems are shown to be NP-complete even when restricted to bipartite graphs of bounded degree, with the exception of our extension version of feedback edge set on undirected graphs which is shown to be solvable in polynomial time. 
For the extension variants of dominating and feedback vertex set, we also show NP-completeness for the restriction to planar graphs of bounded degree. 
As a non-graph problem, we also study an extension version of the bin packing problem. We further consider the parameterized complexity of all these extension variants, where the parameter is a measure of the pre-solution as defined by our framework.
KW  - extension problems
KW  - NP-hardness
KW  - parameterized complexity
Y1  - 2022
U6  - https://doi.org/10.1016/j.tcs.2021.10.017
SN  - 0304-3975
SN  - 1879-2294
VL  - 904
SP  - 48
EP  - 65
PB  - Elsevier
CY  - Amsterdam [u.a.]
ER  - 
TY  - JOUR
A1  - Coupette, Corinna
A1  - Hartung, Dirk
A1  - Beckedorf, Janis
A1  - Böther, Maximilian
A1  - Katz, Daniel Martin
T1  - Law smells
BT  - defining and detecting problematic patterns in legal drafting
JF  - Artificial intelligence and law
N2  - Building on the computer science concept of code smells, we initiate the study of law smells, i.e., patterns in legal texts that pose threats to the comprehensibility and maintainability of the law. With five intuitive law smells as running examples (namely, duplicated phrase, long element, large reference tree, ambiguous syntax, and natural language obsession), we develop a comprehensive law smell taxonomy. This taxonomy classifies law smells by when they can be detected, which aspects of law they relate to, and how they can be discovered. We introduce text-based and graph-based methods to identify instances of law smells, confirming their utility in practice using the United States Code as a test case. Our work demonstrates how ideas from software engineering can be leveraged to assess and improve the quality of legal code, thus drawing attention to an understudied area in the intersection of law and computer science and highlighting the potential of computational legal drafting.
KW  - Refactoring
KW  - Software engineering
KW  - Law
KW  - Natural language processing
KW  - Network analysis
Y1  - 2022
U6  - https://doi.org/10.1007/s10506-022-09315-w
SN  - 0924-8463
SN  - 1572-8382
VL  - 31
SP  - 335
EP  - 368
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - JOUR
A1  - Tang, Mitchell
A1  - Nakamoto, Carter H.
A1  - Stern, Ariel Dora
A1  - Mehrotra, Ateev
T1  - Trends in remote patient monitoring use in traditional Medicare
JF  - JAMA Internal Medicine
N2  - This cross-sectional study uses traditional Medicare claims data to assess trends in general remote patient monitoring from January 2018 through September 2021.
Y1  - 2022
U6  - https://doi.org/10.1001/jamainternmed.2022.3043
SN  - 2168-6106
SN  - 2168-6114
VL  - 182
IS  - 9
SP  - 1005
EP  - 1006
PB  - American Medical Association
CY  - Chicago
ER  - 
TY  - JOUR
A1  - Cseh, Ágnes
A1  - Juhos, Attila
T1  - Pairwise preferences in the stable marriage problem
JF  - ACM Transactions on Economics and Computation / Association for Computing Machinery
N2  - We study the classical, two-sided stable marriage problem under pairwise preferences. In the most general setting, agents are allowed to express their preferences as comparisons of any two of their edges, and they also have the right to declare a draw or even withdraw from such a comparison. This freedom is then gradually restricted as we specify six stages of orderedness in the preferences, ending with the classical case of strictly ordered lists. We study all cases occurring when combining the three known notions of stability (weak, strong, and super-stability) under the assumption that each side of the bipartite market obtains one of the six degrees of orderedness. By designing three polynomial algorithms and two NP-completeness proofs, we determine the complexity of all cases not yet known and thus give an exact boundary in terms of preference structure between tractable and intractable cases.
KW  - Stable marriage
KW  - intransitivity
KW  - acyclic preferences
KW  - poset
KW  - weakly stable matching
KW  - strongly stable matching
KW  - super stable matching
Y1  - 2021
U6  - https://doi.org/10.1145/3434427
SN  - 2167-8375
SN  - 2167-8383
VL  - 9
IS  - 1
PB  - Association for Computing Machinery
CY  - New York
ER  - 
TY  - JOUR
A1  - Cseh, Ágnes
A1  - Kavitha, Telikepalli
T1  - Popular matchings in complete graphs
JF  - Algorithmica : an international journal in computer science
N2  - Our input is a complete graph G on n vertices where each vertex has a strict ranking of all other vertices in G. The goal is to construct a matching in G that is popular. A matching M is popular if M does not lose a head-to-head election against any matching M': here each vertex casts a vote for the matching in {M, M'} in which it gets a better assignment. Popular matchings need not exist in the given instance G and the popular matching problem is to decide whether one exists or not. The popular matching problem in G is easy to solve for odd n. Surprisingly, the problem becomes NP-complete for even n, as we show here. This is one of the few graph theoretic problems efficiently solvable when n has one parity and NP-complete when n has the other parity.
KW  - Popular matching
KW  - Complexity
KW  - Stable matching
Y1  - 2021
U6  - https://doi.org/10.1007/s00453-020-00791-7
SN  - 0178-4617
SN  - 1432-0541
VL  - 83
IS  - 5
SP  - 1493
EP  - 1523
PB  - Springer
CY  - New York
ER  - 
TY  - JOUR
A1  - Genske, Ulrich
A1  - Jahnke, Paul
T1  - Human observer net
BT  - a platform tool for human observer studies of image data
JF  - Radiology
N2  - Background: 
Current software applications for human observer studies of images lack flexibility in study design, platform independence, multicenter use, and assessment methods and are not open source, limiting accessibility and expandability.

Purpose: 
To develop a user-friendly software platform that enables efficient human observer studies in medical imaging with flexibility of study design. 

Materials and Methods: 
Software for human observer imaging studies was designed as an open-source web application to facilitate access, platform-independent usability, and multicenter studies. Different interfaces for study creation, participation, and management of results were implemented. The software was evaluated in human observer experiments between May 2019 and March 2021, in which duration of observer responses was tracked. Fourteen radiologists evaluated and graded software usability using the 100-point system usability scale. The application was tested in Chrome, Firefox, Safari, and Edge browsers. 

Results: 
Software function was designed to allow visual grading analysis (VGA), multiple-alternative forced-choice (m-AFC), receiver operating characteristic (ROC), localization ROC, free-response ROC, and customized designs. The mean duration of reader responses per image or per image set was 6.2 seconds ± 4.8 (standard deviation), 5.8 seconds ± 4.7, 8.7 seconds ± 5.7, and 6.0 seconds ± 4.5 in four-AFC with 160 image quartets per reader, four-AFC with 640 image quartets per reader, localization ROC, and experimental studies, respectively. The mean system usability scale score was 83 ± 11 (out of 100). The documented code and a demonstration of the application are available online (https://github.com/genskeu/HON, https://hondemo.pythonanywhere.com/).

Conclusion: 
A user-friendly and efficient open-source application was developed for human reader experiments that enables study design versatility, as well as platform-independent and multicenter usability.
Y1  - 2022
U6  - https://doi.org/10.1148/radiol.211832
SN  - 0033-8419
SN  - 1527-1315
VL  - 303
IS  - 3
SP  - 524
EP  - 530
PB  - Radiological Society of North America
CY  - Oak Brook, Ill.
ER  - 
TY  - JOUR
A1  - Puri, Manish
A1  - Varde, Aparna S.
A1  - Melo, Gerard de
T1  - Commonsense based text mining on urban policy
JF  - Language resources and evaluation
N2  - Local laws on urban policy, i.e., ordinances, directly affect our daily life in various ways (health, business, etc.), yet in practice, for many citizens they remain impervious and complex. This article focuses on an approach to make urban policy more accessible and comprehensible to the general public and to government officials, while also addressing pertinent social media postings. Due to the intricacies of natural language, ranging from complex legalese in ordinances to informal lingo in tweets, it is practical to harness human judgment here. To this end, we mine ordinances and tweets via reasoning based on commonsense knowledge so as to better account for pragmatics and semantics in the text. Ours is pioneering work in ordinance mining, and thus there is no prior labeled training data available for learning. This gap is filled by commonsense knowledge, a prudent choice in situations involving a lack of adequate training data. The ordinance mining can be beneficial to the public in fathoming policies and to officials in assessing policy effectiveness based on public reactions. This work contributes to smart governance, leveraging transparency in governing processes via public involvement. We focus significantly on ordinances contributing to smart cities, hence an important goal is to assess how well an urban region heads towards a smart city as per its policies mapping with smart city characteristics, and the corresponding public satisfaction.
KW  - Commonsense reasoning
KW  - Opinion mining
KW  - Ordinances
KW  - Smart cities
KW  - Social media
KW  - Text mining
Y1  - 2022
U6  - https://doi.org/10.1007/s10579-022-09584-6
SN  - 1574-020X
SN  - 1574-0218
VL  - 57
SP  - 733
EP  - 763
PB  - Springer
CY  - Dordrecht [u.a.]
ER  - 
TY  - JOUR
A1  - Bonnet, Philippe
A1  - Dong, Xin Luna
A1  - Naumann, Felix
A1  - Tözün, Pınar
T1  - VLDB 2021
BT  - Designing a hybrid conference
JF  - SIGMOD record
N2  - The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned.
Y1  - 2021
U6  - https://doi.org/10.1145/3516431.3516447
SN  - 0163-5808
SN  - 1943-5835
VL  - 50
IS  - 4
SP  - 50
EP  - 53
PB  - Association for Computing Machinery
CY  - New York
ER  - 
TY  - JOUR
A1  - Hagedorn, Christiane
A1  - Serth, Sebastian
A1  - Meinel, Christoph
T1  - The mysterious adventures of Detective Duke
BT  - how storified programming MOOCs support learners in achieving their learning goals
JF  - Frontiers in education
N2  - About 15 years ago, the first Massive Open Online Courses (MOOCs) appeared and revolutionized online education with more interactive and engaging course designs. Yet, keeping learners motivated and ensuring high satisfaction is one of the challenges today's course designers face. Therefore, many MOOC providers employed gamification elements that only boost extrinsic motivation briefly and are limited to platform support. In this article, we introduce and evaluate a gameful learning design we used in several iterations on computer science education courses. For each of the courses on the fundamentals of the Java programming language, we developed a self-contained, continuous story that accompanies learners through their learning journey and helps visualize key concepts. Furthermore, we share our approach to creating the surrounding story in our MOOCs and provide a guideline for educators to develop their own stories. Our data and the long-term evaluation spanning four Java courses between 2017 and 2021 indicate the openness of learners toward storified programming courses in general and highlight those elements that had the highest impact. While only a few learners did not like the story at all, most learners consumed the additional story elements we provided. However, learners' interest in influencing the story through majority voting was negligible and did not show a considerable positive impact, so we continued with a fixed story instead. We did not find evidence that learners just participated in the narrative because they worked on all materials. Instead, for 10-16% of learners, the story was their main course motivation. We also investigated differences in the presentation format and concluded that several longer audio-book style videos were most preferred by learners in comparison to animated videos or different textual formats.
Surprisingly, the availability of a coherent story embedding examples and providing a context for the practical programming exercises also led to a slightly higher ranking in the perceived quality of the learning material (by 4%). With our research in the context of storified MOOCs, we advance gameful learning designs, foster learner engagement and satisfaction in online courses, and help educators ease knowledge transfer for their learners.
KW  - gameful learning
KW  - storytelling
KW  - programming
KW  - learner engagement
KW  - course design
KW  - MOOCs
KW  - content gamification
KW  - narrative
Y1  - 2023
U6  - https://doi.org/10.3389/feduc.2022.1016401
SN  - 2504-284X
VL  - 7
PB  - Frontiers Media
CY  - Lausanne
ER  - 
TY  - THES
A1  - Halfpap, Stefan
T1  - Integer linear programming-based heuristics for partially replicated database clusters and selecting indexes
T1  - Auf ganzzahliger linearer Optimierung basierende Heuristiken für partiell-replizierte Datenbankcluster und das Auswählen von Indizes
N2  - Column-oriented database systems can efficiently process transactional and analytical queries on a single node. However, increasing or peak analytical loads can quickly saturate single-node database systems. Then, a common scale-out option is using a database cluster with a single primary node for transaction processing and read-only replicas. Using (the naive) full replication, queries are distributed among nodes independently of the accessed data. This approach is relatively expensive because all nodes must store all data and apply all data modifications caused by inserts, deletes, or updates.
In contrast to full replication, partial replication is a more cost-efficient implementation: Instead of duplicating all data to all replica nodes, partial replicas store only a subset of the data while being able to process a large workload share. Besides lower storage costs, partial replicas enable (i) better scaling because replicas must potentially synchronize only subsets of the data modifications and thus have more capacity for read-only queries and (ii) better elasticity because replicas have to load less data and can be set up faster. However, splitting the overall workload evenly among the replica nodes while optimizing the data allocation is a challenging assignment problem.
The calculation of optimized data allocations in a partially replicated database cluster can be modeled using integer linear programming (ILP). ILP is a common approach for solving assignment problems, also in the context of database systems. Because ILP is not scalable, existing approaches (also for calculating partial allocations) often fall back to simple (e.g., greedy) heuristics for larger problem instances. Simple heuristics may work well but can lose optimization potential.
In this thesis, we present optimal and ILP-based heuristic programming models for calculating data fragment allocations for partially replicated database clusters. Using ILP, we can flexibly extend our models to (i) consider data modifications and reallocations and (ii) increase the robustness of allocations to compensate for node failures and workload uncertainty. We evaluate our approaches for TPC-H, TPC-DS, and a real-world accounting workload and compare the results to state-of-the-art allocation approaches. Our evaluations show significant improvements for various allocation properties: Compared to existing approaches, we can, for example, (i) almost halve the amount of allocated data, (ii) improve the throughput in case of node failures and workload uncertainty while using even less memory, (iii) halve the costs of data modifications, and (iv) reallocate less than 90% of data when adding a node to the cluster. Importantly, we can calculate the corresponding ILP-based heuristic solutions within a few seconds. Finally, we demonstrate that the ideas of our ILP-based heuristics are also applicable to the index selection problem.
N2  - Spaltenorientierte Datenbanksysteme können transaktionale und analytische Abfragen effizient auf einem einzigen Rechenknoten verarbeiten. Steigende Lasten oder Lastspitzen können Datenbanksysteme mit nur einem Rechenknoten jedoch schnell überlasten. Dann besteht eine gängige Skalierungsmöglichkeit darin, einen Datenbankcluster mit einem einzigen Rechenknoten für die Transaktionsverarbeitung und Replikatknoten für lesende Datenbankanfragen zu verwenden. Bei der (naiven) vollständigen Replikation werden Anfragen unabhängig von den Daten, auf die zugegriffen wird, auf die Knoten verteilt. Dieser Ansatz ist relativ teuer, da alle Knoten alle Daten speichern und alle Datenänderungen anwenden müssen, die durch das Einfügen, Löschen oder Aktualisieren von Datenbankeinträgen verursacht werden.
Im Gegensatz zur vollständigen Replikation ist die partielle Replikation eine kostengünstige Alternative: Anstatt alle Daten auf alle Replikationsknoten zu duplizieren, speichern partielle Replikate nur eine Teilmenge der Daten und können gleichzeitig einen großen Anteil der Anfragelast verarbeiten. Neben niedrigeren Speicherkosten ermöglichen partielle Replikate (i) eine bessere Skalierung, da Replikate potenziell nur Teilmengen der Datenänderungen synchronisieren müssen und somit mehr Kapazität für lesende Anfragen haben, und (ii) eine bessere Elastizität, da Replikate weniger Daten laden müssen und daher schneller eingesetzt werden können. Die gleichmäßige Lastbalancierung auf die Replikatknoten bei gleichzeitiger Optimierung der Datenzuweisung ist jedoch ein schwieriges Zuordnungsproblem.
Die Berechnung einer optimierten Datenverteilung in einem Datenbankcluster mit partiellen Replikaten kann mithilfe der ganzzahligen linearen Optimierung (engl. integer linear programming, ILP) durchgeführt werden. ILP ist ein gängiger Ansatz zur Lösung von Zuordnungsproblemen, auch im Kontext von Datenbanksystemen. Da ILP nicht skalierbar ist, greifen bestehende Ansätze (auch zur Berechnung von partiellen Replikationen) für größere Probleminstanzen oft auf einfache Heuristiken (z.B. Greedy-Algorithmen) zurück. Einfache Heuristiken können gut funktionieren, aber auch Optimierungspotenzial einbüßen.
In dieser Arbeit stellen wir optimale und ILP-basierte heuristische Ansätze zur Berechnung von Datenzuweisungen für partiell-replizierte Datenbankcluster vor. Mithilfe von ILP können wir unsere Ansätze flexibel erweitern, um (i) Datenänderungen und -umverteilungen zu berücksichtigen und (ii) die Robustheit von Zuweisungen zu erhöhen, um Knotenausfälle und Unsicherheiten bezüglich der Anfragelast zu kompensieren. Wir evaluieren unsere Ansätze für TPC-H, TPC-DS und eine reale Buchhaltungsanfragelast und vergleichen die Ergebnisse mit herkömmlichen Verteilungsansätzen. Unsere Auswertungen zeigen signifikante Verbesserungen für verschiedene Eigenschaften der berechneten Datenzuordnungen: Im Vergleich zu bestehenden Ansätzen können wir beispielsweise (i) die Menge der gespeicherten Daten in Cluster fast halbieren, (ii) den Anfragedurchsatz bei Knotenausfällen und unsicherer Anfragelast verbessern und benötigen dafür auch noch weniger Speicher, (iii) die Kosten von Datenänderungen halbieren, und (iv) weniger als 90 % der Daten umverteilen, wenn ein Rechenknoten zum Cluster hinzugefügt wird. Wichtig ist, dass wir die entsprechenden ILP-basierten heuristischen Lösungen innerhalb weniger Sekunden berechnen können. Schließlich demonstrieren wir, dass die Ideen von unseren ILP-basierten Heuristiken auch auf das Indexauswahlproblem anwendbar sind.
KW  - database systems
KW  - integer linear programming
KW  - partial replication
KW  - index selection
KW  - load balancing
KW  - Datenbanksysteme
KW  - Indexauswahl
KW  - ganzzahlige lineare Optimierung
KW  - Lastverteilung
KW  - partielle Replikation
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-633615
ER  - 
TY  - JOUR
A1  - Bläsius, Thomas
A1  - Friedrich, Tobias
A1  - Lischeid, Julius
A1  - Meeks, Kitty
A1  - Schirneck, Friedrich Martin
T1  - Efficiently enumerating hitting sets of hypergraphs arising in data profiling
JF  - Journal of computer and system sciences : JCSS
N2  - The transversal hypergraph problem asks to enumerate the minimal hitting sets of a hypergraph. If the solutions have bounded size, Eiter and Gottlob [SICOMP'95] gave an algorithm running in output-polynomial time, but whose space requirement also scales with the output. We improve this to polynomial delay and space. Central to our approach is the extension problem, deciding for a set X of vertices whether it is contained in any minimal hitting set. We show that this is one of the first natural problems to be W[3]-complete. We give an algorithm for the extension problem running in time O(m^{|X|+1} n) and prove a SETH-lower bound showing that this is close to optimal. We apply our enumeration method to the discovery problem of minimal unique column combinations from data profiling. Our empirical evaluation suggests that the algorithm outperforms its worst-case guarantees on hypergraphs stemming from real-world databases.
KW  - Data profiling
KW  - Enumeration algorithm
KW  - Minimal hitting set
KW  - Transversal hypergraph
KW  - Unique column combination
KW  - W[3]-Completeness
Y1  - 2022
U6  - https://doi.org/10.1016/j.jcss.2021.10.002
SN  - 0022-0000
SN  - 1090-2724
VL  - 124
SP  - 192
EP  - 213
PB  - Elsevier
CY  - San Diego
ER  - 
TY  - JOUR
A1  - Schlosser, Rainer
A1  - Chenavaz, Régis Y.
A1  - Dimitrov, Stanko
T1  - Circular economy
BT  - joint dynamic pricing and recycling investments
JF  - International journal of production economics
N2  - In a circular economy, the use of recycled resources in production is a key performance indicator for management. Yet, academic studies are still unable to inform managers on appropriate recycling and pricing policies. We develop an optimal control model integrating a firm's recycling rate, which can use both virgin and recycled resources in the production process. Our model accounts for recycling influence on both the supply and demand sides. The positive effect of a firm's use of recycled resources diminishes over time but may increase through investments. Using general formulations for demand and cost, we analytically examine joint dynamic pricing and recycling investment policies in order to determine their optimal interplay over time. We provide numerical experiments to assess the existence of a steady-state and to calculate sensitivity analyses with respect to various model parameters. The analysis shows how to dynamically adapt jointly optimized controls to reach sustainability in the production process. Our results pave the way to sounder sustainable practices for firms operating within a circular economy.
KW  - Dynamic pricing
KW  - Recycling investments
KW  - Optimal control
KW  - General demand function
KW  - Circular economy
Y1  - 2021
U6  - https://doi.org/10.1016/j.ijpe.2021.108117
SN  - 0925-5273
SN  - 1873-7579
VL  - 236
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Thienen, Julia von
A1  - Weinstein, Theresa Julia
A1  - Meinel, Christoph
T1  - Creative metacognition in design thinking
BT  - exploring theories, educational practices, and their implications for measurement
JF  - Frontiers in psychology
N2  - Design thinking is a well-established practical and educational approach to fostering high-level creativity and innovation, which has been refined since the 1950s with the participation of experts like Joy Paul Guilford and Abraham Maslow. Through real-world projects, trainees learn to optimize their creative outcomes by developing and practicing creative cognition and metacognition. This paper provides a holistic perspective on creativity, enabling the formulation of a comprehensive theoretical framework of creative metacognition. It focuses on the design thinking approach to creativity and explores the role of metacognition in four areas of creativity expertise: Products, Processes, People, and Places. The analysis includes task-outcome relationships (product metacognition), the monitoring of strategy effectiveness (process metacognition), an understanding of individual or group strengths and weaknesses (people metacognition), and an examination of the mutual impact between environments and creativity (place metacognition). It also reviews measures taken in design thinking education, including a distribution of cognition and metacognition, to support students in their development of creative mastery. On these grounds, we propose extended methods for measuring creative metacognition with the goal of enhancing comprehensive assessments of the phenomenon. Proposed methodological advancements include accuracy sub-scales, experimental tasks where examinees explore problem and solution spaces, combinations of naturalistic observations with capability testing, as well as physiological assessments as indirect measures of creative metacognition.
KW  - accuracy
KW  - creativity
KW  - design thinking
KW  - education
KW  - measurement
KW  - metacognition
KW  - innovation
KW  - framework
Y1  - 2023
U6  - https://doi.org/10.3389/fpsyg.2023.1157001
SN  - 1664-1078
VL  - 14
PB  - Frontiers Research Foundation
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Belaid, Mohamed Karim
A1  - Rabus, Maximilian
A1  - Krestel, Ralf
T1  - CrashNet
BT  - an encoder-decoder architecture to predict crash test outcomes
JF  - Data mining and knowledge discovery
N2  - Destructive car crash tests are an elaborate, time-consuming, and expensive necessity of the automotive development process. Today, finite element method (FEM) simulations are used to reduce costs by simulating car crashes computationally. We propose CrashNet, an encoder-decoder deep neural network architecture that reduces costs further and models specific outcomes of car crashes very accurately. We achieve this by formulating car crash events as time series prediction enriched with a set of scalar features. Traditional sequence-to-sequence models are usually composed of convolutional neural network (CNN) and CNN transpose layers. We propose to concatenate those with an MLP capable of learning how to inject the given scalars into the output time series. In addition, we replace the CNN transpose with 2D CNN transpose layers in order to force the model to process the hidden state of the set of scalars as one time series. The proposed CrashNet model can be trained efficiently and is able to process scalars and time series as input in order to infer the results of crash tests. CrashNet produces results faster and at a lower cost compared to destructive tests and FEM simulations. Moreover, it represents a novel approach in the car safety management domain.
KW  - Predictive models
KW  - Time series analysis
KW  - Supervised deep neural networks
KW  - Car safety management
Y1  - 2021
U6  - https://doi.org/10.1007/s10618-021-00761-9
SN  - 1384-5810
SN  - 1573-756X
VL  - 35
IS  - 4
SP  - 1688
EP  - 1709
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - GEN
A1  - Benson, Lawrence
A1  - Makait, Hendrik
A1  - Rabl, Tilmann
T1  - Viper
BT  - An Efficient Hybrid PMem-DRAM Key-Value Store
T2  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät
N2  - Key-value stores (KVSs) have found wide application in modern software systems. For persistence, their data resides in slow secondary storage, which requires KVSs to employ various techniques to increase their read and write performance from and to the underlying medium. Emerging persistent memory (PMem) technologies offer data persistence at close-to-DRAM speed, making them a promising alternative to classical disk-based storage. However, simply drop-in replacing existing storage with PMem does not yield good results, as block-based access behaves differently in PMem than on disk and ignores PMem's byte addressability, layout, and unique performance characteristics. In this paper, we propose three PMem-specific access patterns and implement them in a hybrid PMem-DRAM KVS called Viper. We employ a DRAM-based hash index and a PMem-aware storage layout to utilize the random-write speed of DRAM and the efficient sequential-write performance of PMem. Our evaluation shows that Viper significantly outperforms existing KVSs for core KVS operations while providing full data persistence. Moreover, Viper outperforms existing PMem-only, hybrid, and disk-based KVSs by 4-18x for write workloads, while matching or surpassing their get performance.
T3  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 20 
KW  - memory
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-559664
SN  - 2150-8097
IS  - 9
ER  - 
TY  - GEN
A1  - Kruse, Sebastian
A1  - Kaoudi, Zoi
A1  - Contreras-Rojas, Bertty
A1  - Chawla, Sanjay
A1  - Naumann, Felix
A1  - Quiané-Ruiz, Jorge-Arnulfo
T1  - RHEEMix in the data jungle
BT  - a cost-based optimizer for cross-platform systems
T2  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät
N2  - Data analytics are moving beyond the limits of a single platform. In this paper, we present the cost-based optimizer of Rheem, an open-source cross-platform system that copes with these new requirements. The optimizer allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to determine efficient data movement plans among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. We show that our optimizer can perform tasks more than one order of magnitude faster when using multiple platforms than when using a single platform.
T3  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 22 
KW  - cross-platform
KW  - polystore
KW  - query optimization
KW  - data processing
Y1  - 2020
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-519443
IS  - 6
ER  - 
TY  - THES
A1  - Limberger, Daniel
T1  - Concepts and techniques for 3D-embedded treemaps and their application to software visualization
T1  - Konzepte und Techniken für 3D-eingebettete Treemaps und ihre Anwendung auf Softwarevisualisierung
N2  - This thesis addresses concepts and techniques for interactive visualization of hierarchical data using treemaps. It explores (1) how treemaps can be embedded in 3D space to improve their information content and expressiveness, (2) how the readability of treemaps can be improved using level-of-detail and degree-of-interest techniques, and (3) how to design and implement a software framework for the real-time web-based rendering of treemaps embedded in 3D. With a particular emphasis on their application, use cases from software analytics are taken to test and evaluate the presented concepts and techniques.

Concerning the first challenge, this thesis shows that a 3D attribute space offers enhanced possibilities for the visual mapping of data compared to classical 2D treemaps. In particular, embedding in 3D allows for improved implementation of visual variables (e.g., by sketchiness and color weaving), provision of new visual variables (e.g., by physically based materials and in situ templates), and integration of visual metaphors (e.g., by reference surfaces and renderings of natural phenomena) into the three-dimensional representation of treemaps.

For the second challenge, the readability of an information visualization, the work shows that the generally higher visual clutter and increased cognitive load typically associated with three-dimensional information representations can be kept low in treemap-based representations of both small and large hierarchical datasets. By introducing an adaptive level-of-detail technique, we can not only declutter the visualization results, thereby reducing cognitive load and mitigating occlusion problems, but also summarize and highlight relevant data. Furthermore, this approach facilitates automatic labeling, supports the emphasis on data outliers, and allows visual variables to be adjusted via degree-of-interest measures.

The third challenge is addressed by developing a real-time rendering framework with WebGL and accumulative multi-frame rendering. The framework removes hardware constraints and graphics API requirements, reduces interaction response times, and simplifies high-quality rendering. At the same time, the implementation effort for a web-based deployment of treemaps is kept reasonable.

The presented visualization concepts and techniques are applied and evaluated for use cases in software analysis. In this domain, data about software systems, especially about the state and evolution of the source code, does not have a descriptive appearance or natural geometric mapping, making information visualization a key technology here. In particular, software source code can be visualized with treemap-based approaches because of its inherently hierarchical structure. With treemaps embedded in 3D, we can create interactive software maps that visually map software metrics, software developer activities, or information about the evolution of software systems alongside their hierarchical module structure.

Discussions on remaining challenges and opportunities for future research for 3D-embedded treemaps and their applications conclude the thesis.
N2  - Diese Doktorarbeit behandelt Konzepte und Techniken zur interaktiven Visualisierung hierarchischer Daten mit Hilfe von Treemaps. Sie untersucht (1), wie Treemaps im 3D-Raum eingebettet werden können, um ihre Informationsinhalte und Ausdrucksfähigkeit zu verbessern, (2) wie die Lesbarkeit von Treemaps durch Techniken wie Level-of-Detail und Degree-of-Interest verbessert werden kann, und (3) wie man ein Software-Framework für das Echtzeit-Rendering von Treemaps im 3D-Raum entwirft und implementiert. Dabei werden Anwendungsfälle aus der Software-Analyse besonders betont und zur Verprobung und Bewertung der Konzepte und Techniken verwendet.

Hinsichtlich der ersten Herausforderung zeigt diese Arbeit, dass ein 3D-Attributraum im Vergleich zu klassischen 2D-Treemaps verbesserte Möglichkeiten für die visuelle Kartierung von Daten bietet. Insbesondere ermöglicht die Einbettung in 3D eine verbesserte Umsetzung von visuellen Variablen (z.B. durch Skizzenhaftigkeit und Farbverwebungen), die Bereitstellung neuer visueller Variablen (z.B. durch physikalisch basierte Materialien und In-situ-Vorlagen) und die Integration visueller Metaphern (z.B. durch Referenzflächen und Darstellungen natürlicher Phänomene) in die dreidimensionale Darstellung von Treemaps.

Für die zweite Herausforderung – die Lesbarkeit von Informationsvisualisierungen – zeigt die Arbeit, dass die allgemein höhere visuelle Unübersichtlichkeit und die damit einhergehende, erhöhte kognitive Belastung, die typischerweise mit dreidimensionalen Informationsdarstellungen verbunden sind, in Treemap-basierten Darstellungen sowohl kleiner als auch großer hierarchischer Datensätze niedrig gehalten werden können. Durch die Einführung eines adaptiven Level-of-Detail-Verfahrens lassen sich nicht nur die Visualisierungsergebnisse übersichtlicher gestalten, die kognitive Belastung reduzieren und Verdeckungsprobleme verringern, sondern auch relevante Daten zusammenfassen und hervorheben. Darüber hinaus erleichtert dieser Ansatz eine automatische Beschriftung, unterstützt die Hervorhebung von Daten-Ausreißern und ermöglicht die Anpassung von visuellen Variablen über Degree-of-Interest-Maße.

Die dritte Herausforderung wird durch die Entwicklung eines Echtzeit-Rendering-Frameworks mit WebGL und akkumulativem Multi-Frame-Rendering angegangen. Das Framework hebt mehrere Hardwarebeschränkungen und Anforderungen an die Grafik-API auf, verkürzt die Reaktionszeiten auf Interaktionen und vereinfacht qualitativ hochwertiges Rendering. Gleichzeitig wird der Implementierungsaufwand für einen webbasierten Einsatz von Treemaps geringgehalten.

Die vorgestellten Visualisierungskonzepte und -techniken werden für Anwendungsfälle in der Softwareanalyse eingesetzt und evaluiert. In diesem Bereich haben Daten über Softwaresysteme, insbesondere über den Zustand und die Evolution des Quellcodes, keine anschauliche Erscheinung oder natürliche geometrische Zuordnung, so dass die Informationsvisualisierung hier eine Schlüsseltechnologie darstellt. Insbesondere Softwarequellcode kann aufgrund seiner inhärenten hierarchischen Struktur mit Hilfe von Treemap-basierten Ansätzen visualisiert werden. Mit in 3D-eingebetteten Treemaps können wir interaktive Softwarelagekarten erstellen, die z.B. Softwaremetriken, Aktivitäten von Softwareentwickler*innen und Informationen über die Evolution von Softwaresystemen in ihrer hierarchischen Modulstruktur abbilden und veranschaulichen.

Diskussionen über verbleibende Herausforderungen und Möglichkeiten für zukünftige Forschung zu 3D-eingebetteten Treemaps und deren Anwendungen schließen die Arbeit ab.
KW  - treemaps
KW  - software visualization
KW  - software analytics
KW  - web-based rendering
KW  - degree-of-interest techniques
KW  - labeling
KW  - 3D-embedding
KW  - interactive visualization
KW  - progressive rendering
KW  - hierarchical data
KW  - 3D-Einbettung
KW  - Interessengrad-Techniken
KW  - hierarchische Daten
KW  - interaktive Visualisierung
KW  - Beschriftung
KW  - progressives Rendering
KW  - Softwareanalytik
KW  - Softwarevisualisierung
KW  - Treemaps
KW  - Web-basiertes Rendering
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-632014
ER  - 
TY  - CHAP
A1  - Corazza, Giovanni Emanuele
A1  - Thienen, Julia von
ED  - Glăveanu, Vlad Petre
T1  - Invention
T2  - The Palgrave encyclopedia of the possible
N2  - This entry addresses invention from five different perspectives: (i) definition of the term, (ii) mechanisms underlying invention processes, (iii) (pre-)history of human inventions, (iv) intellectual property protection vs open innovation, and (v) case studies of great inventors. Regarding the definition, an invention is the outcome of a creative process taking place within a technological milieu, which is recognized as successful in terms of its effectiveness as an original technology. In the process of invention, a technological possibility becomes realized. Inventions are distinct from either discovery or innovation. In human creative processes, seven mechanisms of invention can be observed, yielding characteristic outcomes: (1) basic inventions, (2) invention branches, (3) invention combinations, (4) invention toolkits, (5) invention exaptations, (6) invention values, and (7) game-changing inventions. The development of humanity has been strongly shaped by inventions ever since early stone tools and the conception of agriculture. An “explosion of creativity” has been associated with Homo sapiens, and inventions in all fields of human endeavor have followed suit, engendering an exponential growth of cumulative culture. This culture development emerges essentially through a reuse of previous inventions, their revision, amendment and rededication. In sociocultural terms, humans have increasingly regulated processes of invention and invention-reuse through concepts such as intellectual property, patents, open innovation and licensing methods. Finally, three case studies of great inventors are considered: Edison, Marconi, and Montessori, next to a discussion of human invention processes as collaborative endeavors.
KW  - invention
KW  - creativity
KW  - invention mechanism
KW  - cumulative culture
KW  - technology
KW  - innovation
KW  - patent
KW  - open innovation
Y1  - 2023
SN  - 978-3-030-90912-3
SN  - 978-3-030-90913-0
U6  - https://doi.org/10.1007/978-3-030-90913-0_14
SP  - 806
EP  - 814
PB  - Springer International Publishing
CY  - Cham
ER  - 
TY  - JOUR
A1  - Hiort, Pauline
A1  - Schlaffner, Christoph N.
A1  - Steen, Judith A.
A1  - Renard, Bernhard Y.
A1  - Steen, Hanno
T1  - multiFLEX-LF: a computational approach to quantify the modification stoichiometries in label-free proteomics data sets
JF  - Journal of proteome research
N2  - In liquid-chromatography-tandem-mass-spectrometry-based proteomics, information about the presence and stoichiometry of protein modifications is not readily available. To overcome this problem, we developed multiFLEX-LF, a computational tool that builds upon FLEXIQuant, which detects modified peptide precursors and quantifies their modification extent by monitoring the differences between observed and expected intensities of the unmodified precursors. multiFLEX-LF relies on robust linear regression to calculate the modification extent of a given precursor relative to a within-study reference. multiFLEX-LF can analyze entire label-free discovery proteomics data sets in a precursor-centric manner without preselecting a protein of interest. To analyze modification dynamics and coregulated modifications, we hierarchically clustered the precursors of all proteins based on their computed relative modification scores. We applied multiFLEX-LF to a data-independent-acquisition-based data set acquired using the anaphase-promoting complex/cyclosome (APC/C) isolated at various time points during mitosis. The clustering of the precursors allows for identifying varying modification dynamics and ordering the modification events. Overall, multiFLEX-LF enables the fast identification of potentially differentially modified peptide precursors and the quantification of their differential modification extent in large data sets using a personal computer. Additionally, multiFLEX-LF can drive the large-scale investigation of the modification dynamics of peptide precursors in time-series and case-control studies. multiFLEX-LF is available at https://gitlab.com/SteenOmicsLab/multiflex-lf.
KW  - bioinformatics tool
KW  - label-free quantification
KW  - LC-MS
KW  - MS
KW  - post-translational modification
KW  - modification stoichiometry
KW  - PTM
KW  - quantification
Y1  - 2022
U6  - https://doi.org/10.1021/acs.jproteome.1c00669
SN  - 1535-3893
SN  - 1535-3907
VL  - 21
IS  - 4
SP  - 899
EP  - 909
PB  - American Chemical Society
CY  - Washington
ER  - 
TY  - JOUR
A1  - Wittig, Alice
A1  - Miranda, Fabio Malcher
A1  - Hölzer, Martin
A1  - Altenburg, Tom
A1  - Bartoszewicz, Jakub Maciej
A1  - Beyvers, Sebastian
A1  - Dieckmann, Marius Alfred
A1  - Genske, Ulrich
A1  - Giese, Sven Hans-Joachim
A1  - Nowicka, Melania
A1  - Richard, Hugues
A1  - Schiebenhoefer, Henning
A1  - Schmachtenberg, Anna-Juliane
A1  - Sieben, Paul
A1  - Tang, Ming
A1  - Tembrockhaus, Julius
A1  - Renard, Bernhard Y.
A1  - Fuchs, Stephan
T1  - CovRadar
BT  - continuously tracking and filtering SARS-CoV-2 mutations for genomic surveillance
JF  - Bioinformatics
N2  - The ongoing pandemic caused by SARS-CoV-2 emphasizes the importance of genomic surveillance to understand the evolution of the virus, to monitor the viral population, and to plan epidemiological responses. Detailed analysis, easy visualization and intuitive filtering of the latest viral sequences are powerful for this purpose. We present CovRadar, a tool for genomic surveillance of the SARS-CoV-2 Spike protein. CovRadar consists of an analytical pipeline and a web application that enable the analysis and visualization of hundreds of thousands of sequences. First, CovRadar extracts the regions of interest using local alignment, then builds a multiple sequence alignment, infers variants and consensus, and finally presents the results in an interactive app, making accessing and reporting simple, flexible and fast.
Y1  - 2022
U6  - https://doi.org/10.1093/bioinformatics/btac411
SN  - 1367-4803
SN  - 1367-4811
VL  - 38
IS  - 17
SP  - 4223
EP  - 4225
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Omolaoye, Temidayo S.
A1  - Omolaoye, Victor Adelakun
A1  - Kandasamy, Richard K.
A1  - Hachim, Mahmood Yaseen
A1  - Du Plessis, Stefan S.
T1  - Omics and male infertility
BT  - highlighting the application of transcriptomic data
JF  - Life : open access journal
N2  - Male infertility is a multifaceted disorder affecting approximately 50% of male partners in infertile couples. 
Over the years, male infertility has been diagnosed mainly through semen analysis, hormone evaluations, medical records and physical examinations, which are fundamental yet inefficient, because 30% of male infertility cases remain idiopathic. These unexplained cases need to be addressed with more sophisticated and result-driven technologies and/or techniques.
Genetic alterations have been linked with male infertility, thereby unveiling the practicality of investigating this disorder from the "omics" perspective. 
Omics aims at analyzing the structure and functions of a whole constituent of a given biological function at different levels, including the molecular gene level (genomics), transcript level (transcriptomics), protein level (proteomics) and metabolite level (metabolomics). In the current study, the four branches of omics and their roles in male infertility are briefly reviewed; the potential usefulness of assessing transcriptomic data to understand this pathology is also elucidated.
After assessing the publicly obtainable transcriptomic data for datasets on male infertility, a total of 1385 datasets were retrieved, of which 10 datasets met the inclusion criteria and were used for further analysis. 
These datasets were classified into groups according to the disease or cause of male infertility. 
The groups include non-obstructive azoospermia (NOA), obstructive azoospermia (OA), non-obstructive and obstructive azoospermia (NOA and OA), spermatogenic dysfunction, sperm dysfunction, and Y chromosome microdeletion. 
Findings revealed that 8 genes (LDHC, PDHA2, TNP1, TNP2, ODF1, ODF2, SPINK2, PCDHB3) were commonly differentially expressed between all disease groups. 
Likewise, 56 genes were common between NOA versus NOA and OA (ADAD1, BANF2, BCL2L14, C12orf50, C20orf173, C22orf23, C6orf99, C9orf131, C9orf24, CABS1, CAPZA3, CCDC187, CCDC54, CDKN3, CEP170, CFAP206, CRISP2, CT83, CXorf65, FAM209A, FAM71F1, FAM81B, GALNTL5, GTSF1, H1FNT, HEMGN, HMGB4, KIF2B, LDHC, LOC441601, LYZL2, ODF1, ODF2, PCDHB3, PDHA2, PGK2, PIH1D2, PLCZ1, PROCA1, RIMBP3, ROPN1L, SHCBP1L, SMCP, SPATA16, SPATA19, SPINK2, TEX33, TKTL2, TMCO2, TMCO5A, TNP1, TNP2, TSPAN16, TSSK1B, TTLL2, UBQLN3). 
These genes, particularly the above-mentioned 8 genes, are involved in diverse biological processes such as germ cell development, spermatid development, spermatid differentiation, regulation of proteolysis, spermatogenesis and metabolic processes. 
Owing to the stage-specific expression of these genes, any mal-expression can ultimately lead to male infertility. 
Therefore, currently available data on all branches of omics relating to male fertility can be used to identify biomarkers for diagnosing male infertility, which can potentially help in unravelling some idiopathic cases.
KW  - male infertility
KW  - omics
KW  - genomics
KW  - transcriptomics
KW  - proteomics
KW  - metabolomics
Y1  - 2022
U6  - https://doi.org/10.3390/life12020280
SN  - 2075-1729
VL  - 12
IS  - 2
PB  - MDPI
CY  - Basel
ER  -