TY - JOUR A1 - Kühne, Katharina A1 - Herbold, Erika A1 - Bendel, Oliver A1 - Zhou, Yuefang A1 - Fischer, Martin H. T1 - “Ick bin een Berlina” BT - dialect proficiency impacts a robot’s trustworthiness and competence evaluation JF - Frontiers in robotics and AI N2 - Background: Robots are increasingly used as interaction partners with humans. Social robots are designed to follow expected behavioral norms when engaging with humans and are available with different voices and even accents. Some studies suggest that people prefer robots to speak in the user’s dialect, while others indicate a preference for different dialects. Methods: Our study examined the impact of the Berlin dialect on perceived trustworthiness and competence of a robot. One hundred and twenty German native speakers (Mage = 32 years, SD = 12 years) watched an online video featuring a NAO robot speaking either in the Berlin dialect or standard German and assessed its trustworthiness and competence. Results: We found a positive relationship between participants’ self-reported Berlin dialect proficiency and trustworthiness in the dialect-speaking robot. Only when controlled for demographic factors, there was a positive association between participants’ dialect proficiency, dialect performance and their assessment of robot’s competence for the standard German-speaking robot. Participants’ age, gender, length of residency in Berlin, and device used to respond also influenced assessments. Finally, the robot’s competence positively predicted its trustworthiness. Discussion: Our results inform the design of social robots and emphasize the importance of device control in online experiments. KW - competence KW - dialect KW - human-robot interaction KW - robot voice KW - social robot KW - trust Y1 - 2024 U6 - https://doi.org/10.3389/frobt.2023.1241519 SN - 2296-9144 VL - 10 PB - Frontiers Media S.A. 
CY - Lausanne ER - TY - CHAP A1 - Corazza, Giovanni Emanuele A1 - Thienen, Julia von ED - Glăveanu, Vlad Petre T1 - Invention T2 - The Palgrave encyclopedia of the possible N2 - This entry addresses invention from five different perspectives: (i) definition of the term, (ii) mechanisms underlying invention processes, (iii) (pre-)history of human inventions, (iv) intellectual property protection vs open innovation, and (v) case studies of great inventors. Regarding the definition, an invention is the outcome of a creative process taking place within a technological milieu, which is recognized as successful in terms of its effectiveness as an original technology. In the process of invention, a technological possibility becomes realized. Inventions are distinct from either discovery or innovation. In human creative processes, seven mechanisms of invention can be observed, yielding characteristic outcomes: (1) basic inventions, (2) invention branches, (3) invention combinations, (4) invention toolkits, (5) invention exaptations, (6) invention values, and (7) game-changing inventions. The development of humanity has been strongly shaped by inventions ever since early stone tools and the conception of agriculture. An “explosion of creativity” has been associated with Homo sapiens, and inventions in all fields of human endeavor have followed suit, engendering an exponential growth of cumulative culture. This culture development emerges essentially through a reuse of previous inventions, their revision, amendment and rededication. In sociocultural terms, humans have increasingly regulated processes of invention and invention-reuse through concepts such as intellectual property, patents, open innovation and licensing methods. Finally, three case studies of great inventors are considered: Edison, Marconi, and Montessori, next to a discussion of human invention processes as collaborative endeavors. 
KW - invention KW - creativity KW - invention mechanism KW - cumulative culture KW - technology KW - innovation KW - patent KW - open innovation Y1 - 2023 SN - 978-3-030-90912-3 SN - 978-3-030-90913-0 U6 - https://doi.org/10.1007/978-3-030-90913-0_14 SP - 806 EP - 814 PB - Springer International Publishing CY - Cham ER - TY - JOUR A1 - Hagedorn, Christiane A1 - Serth, Sebastian A1 - Meinel, Christoph T1 - The mysterious adventures of Detective Duke BT - how storified programming MOOCs support learners in achieving their learning goals JF - Frontiers in education N2 - About 15 years ago, the first Massive Open Online Courses (MOOCs) appeared and revolutionized online education with more interactive and engaging course designs. Yet, keeping learners motivated and ensuring high satisfaction is one of the challenges today's course designers face. Therefore, many MOOC providers employed gamification elements that only boost extrinsic motivation briefly and are limited to platform support. In this article, we introduce and evaluate a gameful learning design we used in several iterations on computer science education courses. For each of the courses on the fundamentals of the Java programming language, we developed a self-contained, continuous story that accompanies learners through their learning journey and helps visualize key concepts. Furthermore, we share our approach to creating the surrounding story in our MOOCs and provide a guideline for educators to develop their own stories. Our data and the long-term evaluation spanning over four Java courses between 2017 and 2021 indicates the openness of learners toward storified programming courses in general and highlights those elements that had the highest impact. While only a few learners did not like the story at all, most learners consumed the additional story elements we provided. 
However, learners' interest in influencing the story through majority voting was negligible and did not show a considerable positive impact, so we continued with a fixed story instead. We did not find evidence that learners just participated in the narrative because they worked on all materials. Instead, for 10-16% of learners, the story was their main course motivation. We also investigated differences in the presentation format and concluded that several longer audio-book style videos were most preferred by learners in comparison to animated videos or different textual formats. Surprisingly, the availability of a coherent story embedding examples and providing a context for the practical programming exercises also led to a slightly higher ranking in the perceived quality of the learning material (by 4%). With our research in the context of storified MOOCs, we advance gameful learning designs, foster learner engagement and satisfaction in online courses, and help educators ease knowledge transfer for their learners. KW - gameful learning KW - storytelling KW - programming KW - learner engagement KW - course design KW - MOOCs KW - content gamification KW - narrative Y1 - 2023 U6 - https://doi.org/10.3389/feduc.2022.1016401 SN - 2504-284X VL - 7 PB - Frontiers Media CY - Lausanne ER - TY - JOUR A1 - Puri, Manish A1 - Varde, Aparna S. A1 - Melo, Gerard de T1 - Commonsense based text mining on urban policy JF - Language resources and evaluation N2 - Local laws on urban policy, i.e., ordinances directly affect our daily life in various ways (health, business etc.), yet in practice, for many citizens they remain impervious and complex. This article focuses on an approach to make urban policy more accessible and comprehensible to the general public and to government officials, while also addressing pertinent social media postings. 
Due to the intricacies of the natural language, ranging from complex legalese in ordinances to informal lingo in tweets, it is practical to harness human judgment here. To this end, we mine ordinances and tweets via reasoning based on commonsense knowledge so as to better account for pragmatics and semantics in the text. Ours is pioneering work in ordinance mining, and thus there is no prior labeled training data available for learning. This gap is filled by commonsense knowledge, a prudent choice in situations involving a lack of adequate training data. The ordinance mining can be beneficial to the public in fathoming policies and to officials in assessing policy effectiveness based on public reactions. This work contributes to smart governance, leveraging transparency in governing processes via public involvement. We focus significantly on ordinances contributing to smart cities, hence an important goal is to assess how well an urban region heads towards a smart city as per its policies mapping with smart city characteristics, and the corresponding public satisfaction. KW - Commonsense reasoning KW - Opinion mining KW - Ordinances KW - Smart cities KW - Social media KW - Text mining Y1 - 2022 U6 - https://doi.org/10.1007/s10579-022-09584-6 SN - 1574-020X SN - 1574-0218 VL - 57 SP - 733 EP - 763 PB - Springer CY - Dordrecht [u.a.] ER - TY - JOUR A1 - Piro, Vitor C. A1 - Renard, Bernhard Y. T1 - Contamination detection and microbiome exploration with GRIMER JF - GigaScience N2 - Background: Contamination detection is an important step that should be carefully considered in early stages when designing and performing microbiome studies to avoid biased outcomes. Detecting and removing true contaminants is challenging, especially in low-biomass samples or in studies lacking proper controls. Interactive visualizations and analysis platforms are crucial to better guide this step, to help to identify and detect noisy patterns that could potentially be contamination. 
Additionally, external evidence, like aggregation of several contamination detection methods and the use of common contaminants reported in the literature, could help to discover and mitigate contamination. Results: We propose GRIMER, a tool that performs automated analyses and generates a portable and interactive dashboard integrating annotation, taxonomy, and metadata. It unifies several sources of evidence to help detect contamination. GRIMER is independent of quantification methods and directly analyzes contingency tables to create an interactive and offline report. Reports can be created in seconds and are accessible for nonspecialists, providing an intuitive set of charts to explore data distribution among observations and samples and its connections with external sources. Further, we compiled and used an extensive list of possible external contaminant taxa and common contaminants with 210 genera and 627 species reported in 22 published articles. Conclusion: GRIMER enables visual data exploration and analysis, supporting contamination detection in microbiome studies. The tool and data presented are open source and available at https://gitlab.com/dacs-hpi/grimer. KW - Contamination KW - Microbiome KW - Visualization KW - Taxonomy Y1 - 2023 U6 - https://doi.org/10.1093/gigascience/giad017 SN - 2047-217X VL - 12 PB - Oxford Univ. Press CY - Oxford ER - TY - JOUR A1 - Cohen, Sarel A1 - Hershcovitch, Moshik A1 - Taraz, Martin A1 - Kissig, Otto A1 - Issac, Davis A1 - Wood, Andrew A1 - Waddington, Daniel A1 - Chin, Peter A1 - Friedrich, Tobias T1 - Improved and optimized drug repurposing for the SARS-CoV-2 pandemic JF - PLoS one N2 - The active global SARS-CoV-2 pandemic caused more than 426 million cases and 5.8 million deaths worldwide. The development of completely new drugs for such a novel disease is a challenging, time intensive process. Despite researchers around the world working on this task, no effective treatments have been developed yet. 
This emphasizes the importance of drug repurposing, where treatments are found among existing drugs that are meant for different diseases. A common approach to this is based on knowledge graphs that condense relationships between entities like drugs, diseases and genes. Graph neural networks (GNNs) can then be used for the task at hand by predicting links in such knowledge graphs. Expanding on state-of-the-art GNN research, Doshi et al. recently developed the Dr-COVID model. We further extend their work using additional output interpretation strategies. The best aggregation strategy derives a top-100 ranking of 8,070 candidate drugs, 32 of which are currently being tested in COVID-19-related clinical trials. Moreover, we present an alternative application for the model, the generation of additional candidates based on a given pre-selection of drug candidates using collaborative filtering. In addition, we improved the implementation of the Dr-COVID model by significantly shortening the inference and pre-processing time by exploiting data-parallelism. As drug repurposing is a task that requires high computation and memory resources, we further accelerate the post-processing phase using a new emerging hardware: we propose a new approach to leverage the use of high-capacity Non-Volatile Memory for aggregate drug ranking. Y1 - 2023 U6 - https://doi.org/10.1371/journal.pone.0266572 SN - 1932-6203 VL - 18 IS - 3 PB - PLoS CY - San Francisco ER - TY - JOUR A1 - Kappattanavar, Arpita Mallikarjuna A1 - Hecker, Pascal A1 - Moontaha, Sidratul A1 - Steckhan, Nico A1 - Arnrich, Bert T1 - Food choices after cognitive load BT - an affective computing approach JF - Sensors N2 - Psychology and nutritional science research has highlighted the impact of negative emotions and cognitive load on calorie consumption behaviour using subjective questionnaires. Isolated studies in other domains objectively assess cognitive load without considering its effects on eating behaviour. 
This study aims to explore the potential for developing an integrated eating behaviour assistant system that incorporates cognitive load factors. Two experimental sessions were conducted using custom-developed experimentation software to induce different stimuli. During these sessions, we collected 30 h of physiological, food consumption, and affective states questionnaires data to automatically detect cognitive load and analyse its effect on food choice. Utilising grid search optimisation and leave-one-subject-out cross-validation, a support vector machine model achieved a mean classification accuracy of 85.12% for the two cognitive load tasks using eight relevant features. Statistical analysis was performed on calorie consumption and questionnaire data. Furthermore, 75% of the subjects with higher negative affect significantly increased consumption of specific foods after high-cognitive-load tasks. These findings offer insights into the intricate relationship between cognitive load, affective states, and food choice, paving the way for an eating behaviour assistant system to manage food choices during cognitive load. Future research should enhance system capabilities and explore real-world applications. KW - cognitive load KW - eating behaviour KW - machine learning KW - physiological signals KW - photoplethysmography KW - electrodermal activity KW - sensors Y1 - 2023 U6 - https://doi.org/10.3390/s23146597 SN - 1424-8220 VL - 23 IS - 14 PB - MDPI CY - Basel ER - TY - JOUR A1 - Garrels, Tim A1 - Khodabakhsh, Athar A1 - Renard, Bernhard Y. A1 - Baum, Katharina T1 - LazyFox: fast and parallelized overlapping community detection in large graphs JF - PEERJ Computer Science N2 - The detection of communities in graph datasets provides insight about a graph's underlying structure and is an important tool for various domains such as social sciences, marketing, traffic forecast, and drug discovery. 
While most existing algorithms provide fast approaches for community detection, their results usually contain strictly separated communities. However, most datasets would semantically allow for or even require overlapping communities that can only be determined at much higher computational cost. We build on an efficient algorithm, FOX, that detects such overlapping communities. FOX measures the closeness of a node to a community by approximating the count of triangles which that node forms with that community. We propose LAZYFOX, a multi-threaded adaptation of the FOX algorithm, which provides even faster detection without an impact on community quality. This allows for the analyses of significantly larger and more complex datasets. LAZYFOX enables overlapping community detection on complex graph datasets with millions of nodes and billions of edges in days instead of weeks. As part of this work, LAZYFOX's implementation was published and is available as a tool under an MIT licence at https://github.com/TimGarrels/LazyFox. KW - Overlapping community detection KW - Large networks KW - Weighted clustering coefficient KW - Heuristic triangle estimation KW - Parallelized algorithm KW - C++ tool KW - Runtime improvement KW - Open source KW - Graph algorithm KW - Community analysis Y1 - 2023 U6 - https://doi.org/10.7717/peerj-cs.1291 SN - 2376-5992 VL - 9 PB - PeerJ Inc. CY - London ER - TY - JOUR A1 - Gärtner, Thomas A1 - Schneider, Juliana A1 - Arnrich, Bert A1 - Konigorski, Stefan T1 - Comparison of Bayesian Networks, G-estimation and linear models to estimate causal treatment effects in aggregated N-of-1 trials with carry-over effects JF - BMC Medical Research Methodology N2 - Background The aggregation of a series of N-of-1 trials presents an innovative and efficient study design, as an alternative to traditional randomized clinical trials. 
Challenges for the statistical analysis arise when there is carry-over or complex dependencies of the treatment effect of interest. Methods In this study, we evaluate and compare methods for the analysis of aggregated N-of-1 trials in different scenarios with carry-over and complex dependencies of treatment effects on covariates. For this, we simulate data of a series of N-of-1 trials for Chronic Nonspecific Low Back Pain based on assumed causal relationships parameterized by directed acyclic graphs. In addition to existing statistical methods such as regression models, Bayesian Networks, and G-estimation, we introduce a carry-over adjusted parametric model (COAPM). Results The results show that all evaluated existing models perform well when there is no carry-over and no treatment dependence. When there is carry-over, COAPM yields unbiased and more efficient estimates while all other methods show some bias in the estimation. When there is known treatment dependence, all approaches that are capable of modeling it yield unbiased estimates. Finally, the efficiency of all methods decreases slightly when there are missing values, and the bias in the estimates can also increase. Conclusions This study presents a systematic evaluation of existing and novel approaches for the statistical analysis of a series of N-of-1 trials. We derive practical recommendations on which methods may be best in which scenarios. 
KW - N-of-1 trials KW - Randomized clinical trials KW - Bayesian Networks KW - G-estimation KW - Linear model KW - Simulation study KW - Chronic Nonspecific Low Back Pain Y1 - 2023 U6 - https://doi.org/10.1186/s12874-023-02012-5 SN - 1471-2288 VL - 23 IS - 1 PB - BMC CY - London ER - TY - JOUR A1 - Moontaha, Sidratul A1 - Schumann, Franziska Elisabeth Friederike A1 - Arnrich, Bert T1 - Online learning for wearable EEG-Based emotion classification JF - Sensors N2 - Giving emotional intelligence to machines can facilitate the early detection and prediction of mental diseases and symptoms. Electroencephalography (EEG)-based emotion recognition is widely applied because it measures electrical correlates directly from the brain rather than indirect measurement of other physiological responses initiated by the brain. Therefore, we used non-invasive and portable EEG sensors to develop a real-time emotion classification pipeline. The pipeline trains different binary classifiers for Valence and Arousal dimensions from an incoming EEG data stream achieving a 23.9% (Arousal) and 25.8% (Valence) higher F1-Score on the state-of-art AMIGOS dataset than previous work. Afterward, the pipeline was applied to the curated dataset from 15 participants using two consumer-grade EEG devices while watching 16 short emotional videos in a controlled environment. Mean F1-Scores of 87% (Arousal) and 82% (Valence) were achieved for an immediate label setting. Additionally, the pipeline proved to be fast enough to achieve predictions in real-time in a live scenario with delayed labels while continuously being updated. The significant discrepancy from the readily available labels on the classification scores leads to future work to include more data. Thereafter, the pipeline is ready to be used for real-time applications of emotion classification. 
KW - online learning KW - real-time KW - emotion classification KW - AMIGOS dataset KW - wearable EEG (muse and neurosity crown) KW - psychopy experiments Y1 - 2023 U6 - https://doi.org/10.3390/s23052387 SN - 1424-8220 VL - 23 IS - 5 PB - MDPI CY - Basel ER - TY - JOUR A1 - Lewkowicz, Daniel A1 - Böttinger, Erwin A1 - Siegel, Martin T1 - Economic evaluation of digital therapeutic care apps for unsupervised treatment of low back pain BT - Monte Carlo Simulation JF - JMIR mhealth and uhealth N2 - Background: Digital therapeutic care (DTC) programs are unsupervised app-based treatments that provide video exercises and educational material to patients with nonspecific low back pain during episodes of pain and functional disability. German statutory health insurance can reimburse DTC programs since 2019, but evidence on efficacy and reasonable pricing remains scarce. This paper presents a probabilistic sensitivity analysis (PSA) to evaluate the efficacy and cost-utility of a DTC app against treatment as usual (TAU) in Germany. Objective: The aim of this study was to perform a PSA in the form of a Monte Carlo simulation based on the deterministic base case analysis to account for model assumptions and parameter uncertainty. We also intend to explore to what extent the results in this probabilistic analysis differ from the results in the base case analysis and to what extent a shortage of outcome data concerning quality-of-life (QoL) metrics impacts the overall results. Methods: The PSA builds upon a state-transition Markov chain with a 4-week cycle length over a model time horizon of 3 years from a recently published deterministic cost-utility analysis. A Monte Carlo simulation with 10,000 iterations and a cohort size of 10,000 was employed to evaluate the cost-utility from a societal perspective. Quality-adjusted life years (QALYs) were derived from Veterans RAND 6-Dimension (VR-6D) and Short-Form 6-Dimension (SF-6D) single utility scores. 
Finally, we also simulated reducing the price for a 3-month app prescription to analyze at which price threshold DTC would result in being the dominant strategy over TAU in Germany. Results: The Monte Carlo simulation yielded on average a €135.97 (a currency exchange rate of €1 = US $1.069 is applicable) incremental cost and 0.004 incremental QALYs per person and year for the unsupervised DTC app strategy compared to in-person physiotherapy in Germany. The corresponding incremental cost-utility ratio (ICUR) amounts to an additional €34,315.19 per additional QALY. DTC yielded more QALYs in 54.96% of the iterations. DTC dominates TAU in 24.04% of the iterations for QALYs. Reducing the app price in the simulation from currently €239.96 to €164.61 for a 3-month prescription could yield a negative ICUR and thus make DTC the dominant strategy, even though the estimated probability of DTC being more effective than TAU is only 54.96%. Conclusions: Decision-makers should be cautious when considering the reimbursement of DTC apps since no significant treatment effect was found, and the probability of cost-effectiveness remains below 60% even for an infinite willingness-to-pay threshold. More app-based studies involving the utilization of QoL outcome parameters are urgently needed to account for the low and limited precision of the available QoL input parameters, which are crucial to making profound recommendations concerning the cost-utility of novel apps. 
KW - cost-utility analysis KW - cost KW - probabilistic sensitivity analysis KW - Monte Carlo simulation KW - low back pain KW - pain KW - economic KW - cost-effectiveness KW - Markov model KW - digital therapy KW - digital health app KW - mHealth KW - mobile health KW - health app KW - mobile app KW - orthopedic KW - QUALY KW - DALY KW - quality-adjusted life years KW - disability-adjusted life years KW - time horizon KW - veteran KW - statistics Y1 - 2023 U6 - https://doi.org/10.2196/44585 SN - 2291-5222 VL - 11 PB - JMIR Publications CY - Toronto ER - TY - JOUR A1 - Ehrig, Lukas A1 - Wagner, Ann-Christin A1 - Wolter, Heike A1 - Correll, Christoph U. A1 - Geisel, Olga A1 - Konigorski, Stefan T1 - FASDetect as a machine learning-based screening app for FASD in youth with ADHD JF - npj Digital Medicine N2 - Fetal alcohol-spectrum disorder (FASD) is underdiagnosed and often misdiagnosed as attention-deficit/hyperactivity disorder (ADHD). Here, we develop a screening tool for FASD in youth with ADHD symptoms. To develop the prediction model, medical record data from a German University outpatient unit are assessed including 275 patients aged 0-19 years old with FASD with or without ADHD and 170 patients with ADHD without FASD aged 0-19 years old. We train 6 machine learning models based on 13 selected variables and evaluate their performance. Random forest models yield the best prediction models with a cross-validated AUC of 0.92 (95% confidence interval [0.84, 0.99]). Follow-up analyses indicate that a random forest model with 6 variables - body length and head circumference at birth, IQ, socially intrusive behaviour, poor memory and sleep disturbance - yields equivalent predictive accuracy. We implement the prediction model in a web-based app called FASDetect - a user-friendly, clinically scalable FASD risk calculator that is freely available at https://fasdetect.dhc-lab.hpi.de. 
KW - Medical research KW - Psychiatric disorders Y1 - 2023 U6 - https://doi.org/10.1038/s41746-023-00864-1 SN - 2398-6352 VL - 6 IS - 1 PB - Macmillan Publishers Limited CY - Basingstoke ER - TY - JOUR A1 - Slosarek, Tamara A1 - Ibing, Susanne A1 - Schormair, Barbara A1 - Heyne, Henrike A1 - Böttinger, Erwin A1 - Andlauer, Till A1 - Schurmann, Claudia T1 - Implementation and evaluation of personal genetic testing as part of genomics analysis courses in German universities JF - BMC Medical Genomics N2 - Purpose Due to the increasing application of genome analysis and interpretation in medical disciplines, professionals require adequate education. Here, we present the implementation of personal genotyping as an educational tool in two genomics courses targeting Digital Health students at the Hasso Plattner Institute (HPI) and medical students at the Technical University of Munich (TUM). Methods We compared and evaluated the courses and the students' perceptions on the course setup using questionnaires. Results During the course, students changed their attitudes towards genotyping (HPI: 79% [15 of 19], TUM: 47% [25 of 53]). Predominantly, students became more critical of personal genotyping (HPI: 73% [11 of 15], TUM: 72% [18 of 25]) and most students stated that genetic analyses should not be allowed without genetic counseling (HPI: 79% [15 of 19], TUM: 70% [37 of 53]). Students found the personal genotyping component useful (HPI: 89% [17 of 19], TUM: 92% [49 of 53]) and recommended its inclusion in future courses (HPI: 95% [18 of 19], TUM: 98% [52 of 53]). Conclusion Students perceived the personal genotyping component as valuable in the described genomics courses. The implementation described here can serve as an example for future courses in Europe. 
KW - Genomics education KW - Personal genotyping KW - Personalized medicine Y1 - 2023 U6 - https://doi.org/10.1186/s12920-023-01503-0 SN - 1755-8794 VL - 16 IS - 1 PB - BMC CY - London ER - TY - JOUR A1 - Thienen, Julia von A1 - Weinstein, Theresa Julia A1 - Meinel, Christoph T1 - Creative metacognition in design thinking BT - exploring theories, educational practices, and their implications for measurement JF - Frontiers in psychology N2 - Design thinking is a well-established practical and educational approach to fostering high-level creativity and innovation, which has been refined since the 1950s with the participation of experts like Joy Paul Guilford and Abraham Maslow. Through real-world projects, trainees learn to optimize their creative outcomes by developing and practicing creative cognition and metacognition. This paper provides a holistic perspective on creativity, enabling the formulation of a comprehensive theoretical framework of creative metacognition. It focuses on the design thinking approach to creativity and explores the role of metacognition in four areas of creativity expertise: Products, Processes, People, and Places. The analysis includes task-outcome relationships (product metacognition), the monitoring of strategy effectiveness (process metacognition), an understanding of individual or group strengths and weaknesses (people metacognition), and an examination of the mutual impact between environments and creativity (place metacognition). It also reviews measures taken in design thinking education, including a distribution of cognition and metacognition, to support students in their development of creative mastery. On these grounds, we propose extended methods for measuring creative metacognition with the goal of enhancing comprehensive assessments of the phenomenon. 
Proposed methodological advancements include accuracy sub-scales, experimental tasks where examinees explore problem and solution spaces, combinations of naturalistic observations with capability testing, as well as physiological assessments as indirect measures of creative metacognition. KW - accuracy KW - creativity KW - design thinking KW - education KW - measurement KW - metacognition KW - innovation KW - framework Y1 - 2023 U6 - https://doi.org/10.3389/fpsyg.2023.1157001 SN - 1664-1078 VL - 14 PB - Frontiers Research Foundation CY - Lausanne ER - TY - JOUR A1 - Vitagliano, Gerardo A1 - Hameed, Mazhar A1 - Jiang, Lan A1 - Reisener, Lucas A1 - Wu, Eugene A1 - Naumann, Felix T1 - Pollock: a data loading benchmark JF - Proceedings of the VLDB Endowment N2 - Any system at play in a data-driven project has a fundamental requirement: the ability to load data. The de-facto standard format to distribute and consume raw data is CSV. Yet, the plain text and flexible nature of this format make such files often difficult to parse and correctly load their content, requiring cumbersome data preparation steps. We propose a benchmark to assess the robustness of systems in loading data from non-standard CSV formats and with structural inconsistencies. First, we formalize a model to describe the issues that affect real-world files and use it to derive a systematic “pollution” process to generate dialects for any given grammar. Our benchmark leverages the pollution framework for the csv format. To guide pollution, we have surveyed thousands of real-world, publicly available csv files, recording the problems we encountered. We demonstrate the applicability of our benchmark by testing and scoring 16 different systems: popular csv parsing frameworks, relational database tools, spreadsheet systems, and a data visualization tool. 
Y1 - 2023 U6 - https://doi.org/10.14778/3594512.3594518 SN - 2150-8097 VL - 16 IS - 8 SP - 1870 EP - 1882 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Fehr, Jana A1 - Piccininni, Marco A1 - Kurth, Tobias A1 - Konigorski, Stefan T1 - Assessing the transportability of clinical prediction models for cognitive impairment using causal models JF - BMC medical research methodology N2 - Background Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability by calibration and discrimination of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics. Methods We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We measured transportability to external settings under guided interventions on age, APOE ε4, and tau-protein, using performance differences between internal and external settings measured by calibration metrics and area under the receiver operating curve (AUC). Results Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences indicated inconsistent trends of transportability between the different external settings. Models predicting with consequences tended to show higher AUC in the external settings compared to internal settings, while models predicting with parents or all variables showed similar AUC. 
Conclusions We demonstrated with a practical prediction task example that predicting with causes of the outcome results in better transportability compared to anti-causal predictions when considering calibration differences. We conclude that calibration performance is crucial when assessing model transportability to external settings. KW - Alzheimer's Disease KW - Clinical risk prediction KW - DAG KW - Causality KW - Transportability Y1 - 2023 U6 - https://doi.org/10.1186/s12874-023-02003-6 SN - 1471-2288 VL - 23 IS - 1 PB - BMC CY - London ER - TY - JOUR A1 - Konak, Orhan A1 - Döring, Valentin A1 - Fiedler, Tobias A1 - Liebe, Lucas A1 - Masopust, Leander A1 - Postnov, Kirill A1 - Sauerwald, Franz A1 - Treykorn, Felix A1 - Wischmann, Alexander A1 - Kalabakov, Stefan A1 - Gjoreski, Hristijan A1 - Luštrek, Mitja A1 - Arnrich, Bert T1 - SONAR BT - a nursing activity dataset with inertial sensors JF - Scientific data N2 - Accurate and comprehensive nursing documentation is essential to ensure quality patient care. To streamline this process, we present SONAR, a publicly available dataset of nursing activities recorded using inertial sensors in a nursing home. The dataset includes 14 sensor streams, such as acceleration and angular velocity, and 23 activities recorded by 14 caregivers using five sensors for 61.7 hours. The caregivers wore the sensors as they performed their daily tasks, allowing for continuous monitoring of their activities. We additionally provide machine learning models that recognize the nursing activities given the sensor data. In particular, we present benchmarks for three deep learning model architectures and evaluate their performance using different metrics and sensor locations. 
Our dataset, which can be used for research on sensor-based human activity recognition in real-world settings, has the potential to improve nursing care by providing valuable insights that can identify areas for improvement, facilitate accurate documentation, and tailor care to specific patient conditions. Y1 - 2023 U6 - https://doi.org/10.1038/s41597-023-02620-2 SN - 2052-4463 VL - 10 IS - 1 PB - Nature Publ. Group CY - London ER - TY - JOUR A1 - Omolaoye, Temidayo S. A1 - Omolaoye, Victor Adelakun A1 - Kandasamy, Richard K. A1 - Hachim, Mahmood Yaseen A1 - Du Plessis, Stefan S. T1 - Omics and male infertility BT - highlighting the application of transcriptomic data JF - Life : open access journal N2 - Male infertility is a multifaceted disorder affecting approximately 50% of male partners in infertile couples. Over the years, male infertility has been diagnosed mainly through semen analysis, hormone evaluations, medical records and physical examinations, which of course are fundamental, but yet inefficient, because 30% of male infertility cases remain idiopathic. This dilemmatic status of the unknown needs to be addressed with more sophisticated and result-driven technologies and/or techniques. Genetic alterations have been linked with male infertility, thereby unveiling the practicality of investigating this disorder from the "omics" perspective. Omics aims at analyzing the structure and functions of a whole constituent of a given biological function at different levels, including the molecular gene level (genomics), transcript level (transcriptomics), protein level (proteomics) and metabolites level (metabolomics). In the current study, an overview of the four branches of omics and their roles in male infertility are briefly discussed; the potential usefulness of assessing transcriptomic data to understand this pathology is also elucidated. 
After assessing the publicly obtainable transcriptomic data for datasets on male infertility, a total of 1385 datasets were retrieved, of which 10 datasets met the inclusion criteria and were used for further analysis. These datasets were classified into groups according to the disease or cause of male infertility. The groups include non-obstructive azoospermia (NOA), obstructive azoospermia (OA), non-obstructive and obstructive azoospermia (NOA and OA), spermatogenic dysfunction, sperm dysfunction, and Y chromosome microdeletion. Findings revealed that 8 genes (LDHC, PDHA2, TNP1, TNP2, ODF1, ODF2, SPINK2, PCDHB3) were commonly differentially expressed between all disease groups. Likewise, 56 genes were common between NOA versus NOA and OA (ADAD1, BANF2, BCL2L14, C12orf50, C20orf173, C22orf23, C6orf99, C9orf131, C9orf24, CABS1, CAPZA3, CCDC187, CCDC54, CDKN3, CEP170, CFAP206, CRISP2, CT83, CXorf65, FAM209A, FAM71F1, FAM81B, GALNTL5, GTSF1, H1FNT, HEMGN, HMGB4, KIF2B, LDHC, LOC441601, LYZL2, ODF1, ODF2, PCDHB3, PDHA2, PGK2, PIH1D2, PLCZ1, PROCA1, RIMBP3, ROPN1L, SHCBP1L, SMCP, SPATA16, SPATA19, SPINK2, TEX33, TKTL2, TMCO2, TMCO5A, TNP1, TNP2, TSPAN16, TSSK1B, TTLL2, UBQLN3). These genes, particularly the above-mentioned 8 genes, are involved in diverse biological processes such as germ cell development, spermatid development, spermatid differentiation, regulation of proteolysis, spermatogenesis and metabolic processes. Owing to the stage-specific expression of these genes, any mal-expression can ultimately lead to male infertility. Therefore, currently available data on all branches of omics relating to male fertility can be used to identify biomarkers for diagnosing male infertility, which can potentially help in unravelling some idiopathic cases. 
KW - male infertility KW - omics KW - genomics KW - transcriptomics KW - proteomics KW - metabolomics Y1 - 2022 U6 - https://doi.org/10.3390/life12020280 SN - 2075-1729 VL - 12 IS - 2 PB - MDPI CY - Basel ER - TY - JOUR A1 - Bläsius, Thomas A1 - Friedrich, Tobias A1 - Lischeid, Julius A1 - Meeks, Kitty A1 - Schirneck, Friedrich Martin T1 - Efficiently enumerating hitting sets of hypergraphs arising in data profiling JF - Journal of computer and system sciences : JCSS N2 - The transversal hypergraph problem asks to enumerate the minimal hitting sets of a hypergraph. If the solutions have bounded size, Eiter and Gottlob [SICOMP'95] gave an algorithm running in output-polynomial time, but whose space requirement also scales with the output. We improve this to polynomial delay and space. Central to our approach is the extension problem, deciding for a set X of vertices whether it is contained in any minimal hitting set. We show that this is one of the first natural problems to be W[3]-complete. We give an algorithm for the extension problem running in time O(m^{|X|+1} n) and prove a SETH-lower bound showing that this is close to optimal. We apply our enumeration method to the discovery problem of minimal unique column combinations from data profiling. Our empirical evaluation suggests that the algorithm outperforms its worst-case guarantees on hypergraphs stemming from real-world databases. 
KW - Data profiling KW - Enumeration algorithm KW - Minimal hitting set KW - Transversal hypergraph KW - Unique column combination KW - W[3]-Completeness Y1 - 2022 U6 - https://doi.org/10.1016/j.jcss.2021.10.002 SN - 0022-0000 SN - 1090-2724 VL - 124 SP - 192 EP - 213 PB - Elsevier CY - San Diego ER - TY - JOUR A1 - Serth, Sebastian A1 - Staubitz, Thomas A1 - van Elten, Martin A1 - Meinel, Christoph ED - Gamage, Dilrukshi T1 - Measuring the effects of course modularizations in online courses for life-long learners JF - Frontiers in Education N2 - Many participants in Massive Open Online Courses are full-time employees seeking greater flexibility in their time commitment and the available learning paths. We recently addressed these requirements by splitting up our 6-week courses into three 2-week modules followed by a separate exam. Modularizing courses offers many advantages: Shorter modules are more sustainable and can be combined, reused, and incorporated into learning paths more easily. Time flexibility for learners is also improved as exams can now be offered multiple times per year, while the learning content is available independently. In this article, we answer the question of which impact this modularization has on key learning metrics, such as course completion rates, learning success, and no-show rates. Furthermore, we investigate the influence of longer breaks between modules on these metrics. According to our analysis, course modules facilitate more selective learning behaviors that encourage learners to focus on topics they are the most interested in. At the same time, participation in overarching exams across all modules seems to be less appealing compared to an integrated exam of a 6-week course. While breaks between the modules increase the distinctive appearance of individual modules, a break before the final exam further reduces initial interest in the exams. 
We further reveal that participation in self-paced courses as a preparation for the final exam is unlikely to attract new learners to the course offerings, even though learners' performance is comparable to instructor-paced courses. The results of our long-term study on course modularization provide a solid foundation for future research and enable educators to make informed decisions about the design of their courses. KW - Massive Open Online Course (MOOC) KW - course design KW - modularization KW - learning path KW - flexibility KW - e-learning KW - assignments KW - self-paced learning Y1 - 2022 U6 - https://doi.org/10.3389/feduc.2022.1008545 SN - 2504-284X VL - 7 PB - Frontiers CY - Lausanne, Schweiz ER - TY - JOUR A1 - Ihde, Sven A1 - Pufahl, Luise A1 - Völker, Maximilian A1 - Goel, Asvin A1 - Weske, Mathias T1 - A framework for modeling and executing task BT - specific resource allocations in business processes JF - Computing : archives for informatics and numerical computation N2 - As resources are valuable assets, organizations have to decide which resources to allocate to business process tasks in a way that the process is executed not only effectively but also efficiently. Traditional role-based resource allocation leads to effective process executions, since each task is performed by a resource that has the required skills and competencies to do so. However, the resulting allocations are typically not as efficient as they could be, since optimization techniques have yet to find their way in traditional business process management scenarios. On the other hand, operations research provides a rich set of analytical methods for supporting problem-specific decisions on resource allocation. This paper provides a novel framework for creating transparency on existing tasks and resources, supporting individualized allocations for each activity in a process, and the possibility to integrate problem-specific analytical methods of the operations research domain. 
To validate the framework, the paper reports on the design and prototypical implementation of a software architecture, which extends a traditional process engine with a dedicated resource management component. This component allows us to define specific resource allocation problems at design time, and it also facilitates optimized resource allocation at run time. The framework is evaluated using a real-world parcel delivery process. The evaluation shows that the quality of the allocation results increases significantly with a technique from operations research in contrast to the traditionally applied rule-based approach. KW - Process Execution KW - Business Process Management KW - Resource Allocation KW - Resource Management KW - Activity-oriented Optimization Y1 - 2022 U6 - https://doi.org/10.1007/s00607-022-01093-2 SN - 0010-485X SN - 1436-5057 VL - 104 SP - 2405 EP - 2429 PB - Springer CY - Wien ER - TY - JOUR A1 - Koorn, Jelmer Jan A1 - Lu, Xixi A1 - Leopold, Henrik A1 - Reijers, Hajo A. T1 - From action to response to effect BT - mining statistical relations in work processes JF - Information systems : IS ; an international journal ; data bases N2 - Process mining techniques are valuable to gain insights into and help improve (work) processes. Many of these techniques focus on the sequential order in which activities are performed. Few of these techniques consider the statistical relations within processes. In particular, existing techniques do not allow insights into how responses to an event (action) result in desired or undesired outcomes (effects). We propose and formalize the ARE miner, a novel technique that allows us to analyze and understand these action-response-effect patterns. We take a statistical approach to uncover potential dependency relations in these patterns. The goal of this research is to generate processes that are: (1) appropriately represented, and (2) effectively filtered to show meaningful relations. 
We evaluate the ARE miner in two ways. First, we use an artificial data set to demonstrate the effectiveness of the ARE miner compared to two traditional process-oriented approaches. Second, we apply the ARE miner to a real-world data set from a Dutch healthcare institution. We show that the ARE miner generates comprehensible representations that lead to informative insights into statistical relations between actions, responses, and effects. KW - Process discovery KW - Statistical process mining KW - Effect measurement Y1 - 2022 U6 - https://doi.org/10.1016/j.is.2022.102035 SN - 0306-4379 SN - 0094-453X VL - 109 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - de Paula, Danielly A1 - Marx, Carolin A1 - Wolf, Ella A1 - Dremel, Christian A1 - Cormican, Kathryn A1 - Uebernickel, Falk T1 - A managerial mental model to drive innovation in the context of digital transformation JF - Industry and innovation N2 - Industry 4.0 is transforming how businesses innovate and, as a result, companies are spearheading the movement towards 'Digital Transformation'. While some scholars advocate the use of design thinking to identify new innovative behaviours, cognition experts emphasise the importance of top managers in supporting employees to develop these behaviours. However, there is a dearth of research in this domain and companies are struggling to implement the required behaviours. To address this gap, this study aims to identify and prioritise behavioural strategies conducive to design thinking to inform the creation of a managerial mental model. We identify 20 behavioural strategies from 45 interviews with practitioners and educators and combine them with the concepts of 'paradigm-mindset-mental model' from cognition theory. The paper contributes to the body of knowledge by identifying and prioritising specific behavioural strategies to form a novel set of survival conditions aligned to the new industrial paradigm of Industry 4.0. 
KW - Strategic cognition KW - mental models KW - industry 4.0 KW - digital transformation KW - design thinking Y1 - 2022 U6 - https://doi.org/10.1080/13662716.2022.2072711 SN - 1366-2716 SN - 1469-8390 PB - Routledge, Taylor & Francis Group CY - Abingdon ER - TY - JOUR A1 - Cseh, Agnes A1 - Faenza, Yuri A1 - Kavitha, Telikepalli A1 - Powers, Vladlena T1 - Understanding popular matchings via stable matchings JF - SIAM journal on discrete mathematics N2 - An instance of the marriage problem is given by a graph G = (A ∪ B, E), together with, for each vertex of G, a strict preference order over its neighbors. A matching M of G is popular in the marriage instance if M does not lose a head-to-head election against any matching where vertices are voters. Every stable matching is a min-size popular matching; another subclass of popular matchings that always exists and can be easily computed is the set of dominant matchings. A popular matching M is dominant if M wins the head-to-head election against any larger matching. Thus, every dominant matching is a max-size popular matching, and it is known that the set of dominant matchings is the linear image of the set of stable matchings in an auxiliary graph. Results from the literature seem to suggest that stable and dominant matchings behave, from a complexity theory point of view, in a very similar manner within the class of popular matchings. The goal of this paper is to show that there are instead differences in the tractability of stable and dominant matchings and to investigate further their importance for popular matchings. First, we show that it is easy to check if all popular matchings are also stable; however, it is co-NP hard to check if all popular matchings are also dominant. 
Second, we show how some new and recent hardness results on popular matching problems can be deduced from the NP-hardness of certain problems on stable matchings, also studied in this paper, thus showing that stable matchings can be employed to show not only positive results on popular matchings (as is known) but also most negative ones. Problems for which we show new hardness results include finding a min-size (resp., max-size) popular matching that is not stable (resp., dominant). A known result for which we give a new and simple proof is the NP-hardness of finding a popular matching when G is nonbipartite. KW - popular matching KW - stable matching KW - complexity KW - dominant matching Y1 - 2022 U6 - https://doi.org/10.1137/19M124770X SN - 0895-4801 SN - 1095-7146 VL - 36 IS - 1 SP - 188 EP - 213 PB - Society for Industrial and Applied Mathematics CY - Philadelphia ER - TY - JOUR A1 - Coupette, Corinna A1 - Hartung, Dirk A1 - Beckedorf, Janis A1 - Böther, Maximilian A1 - Katz, Daniel Martin T1 - Law smells BT - defining and detecting problematic patterns in legal drafting JF - Artificial intelligence and law N2 - Building on the computer science concept of code smells, we initiate the study of law smells, i.e., patterns in legal texts that pose threats to the comprehensibility and maintainability of the law. With five intuitive law smells as running examples (namely, duplicated phrase, long element, large reference tree, ambiguous syntax, and natural language obsession), we develop a comprehensive law smell taxonomy. This taxonomy classifies law smells by when they can be detected, which aspects of law they relate to, and how they can be discovered. We introduce text-based and graph-based methods to identify instances of law smells, confirming their utility in practice using the United States Code as a test case. 
Our work demonstrates how ideas from software engineering can be leveraged to assess and improve the quality of legal code, thus drawing attention to an understudied area in the intersection of law and computer science and highlighting the potential of computational legal drafting. KW - Refactoring KW - Software engineering KW - Law KW - Natural language processing KW - Network analysis Y1 - 2022 U6 - https://doi.org/10.1007/s10506-022-09315-w SN - 0924-8463 SN - 1572-8382 VL - 31 SP - 335 EP - 368 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Tang, Mitchell A1 - Nakamoto, Carter H. A1 - Stern, Ariel Dora A1 - Mehrotra, Ateev T1 - Trends in remote patient monitoring use in traditional Medicare JF - JAMA Internal Medicine N2 - This cross-sectional study uses traditional Medicare claims data to assess trends in general remote patient monitoring from January 2018 through September 2021. Y1 - 2022 U6 - https://doi.org/10.1001/jamainternmed.2022.3043 SN - 2168-6106 SN - 2168-6114 VL - 182 IS - 9 SP - 1005 EP - 1006 PB - American Veterinary Medical Association CY - Chicago ER - TY - JOUR A1 - Casel, Katrin A1 - Fernau, Henning A1 - Ghadikolaei, Mehdi Khosravian A1 - Monnot, Jerome A1 - Sikora, Florian T1 - On the complexity of solution extension of optimization problems JF - Theoretical computer science : the journal of the EATCS N2 - The question if a given partial solution to a problem can be extended reasonably occurs in many algorithmic approaches for optimization problems. For instance, when enumerating minimal vertex covers of a graph G = (V, E), one usually arrives at the problem to decide for a vertex set U ⊆ V (pre-solution), if there exists a minimal vertex cover S (i.e., a vertex cover S ⊆ V such that no proper subset of S is a vertex cover) with U ⊆ S (minimal extension of U). 
We propose a general, partial-order based formulation of such extension problems which allows modeling parameterization and approximation aspects of extension, and also highlights relationships between extension tasks for different specific problems. As examples, we study a number of specific problems which can be expressed and related in this framework. In particular, we discuss extension variants of the problems dominating set and feedback vertex/edge set. All these problems are shown to be NP-complete even when restricted to bipartite graphs of bounded degree, with the exception of our extension version of feedback edge set on undirected graphs which is shown to be solvable in polynomial time. For the extension variants of dominating and feedback vertex set, we also show NP-completeness for the restriction to planar graphs of bounded degree. As a non-graph problem, we also study an extension version of the bin packing problem. We further consider the parameterized complexity of all these extension variants, where the parameter is a measure of the pre-solution as defined by our framework. KW - extension problems KW - NP-hardness KW - parameterized complexity Y1 - 2022 U6 - https://doi.org/10.1016/j.tcs.2021.10.017 SN - 0304-3975 SN - 1879-2294 VL - 904 SP - 48 EP - 65 PB - Elsevier CY - Amsterdam [u.a.] ER - TY - JOUR A1 - Genske, Ulrich A1 - Jahnke, Paul T1 - Human observer net BT - a platform tool for human observer studies of image data JF - Radiology N2 - Background: Current software applications for human observer studies of images lack flexibility in study design, platform independence, multicenter use, and assessment methods and are not open source, limiting accessibility and expandability. Purpose: To develop a user-friendly software platform that enables efficient human observer studies in medical imaging with flexibility of study design. 
Materials and Methods: Software for human observer imaging studies was designed as an open-source web application to facilitate access, platform-independent usability, and multicenter studies. Different interfaces for study creation, participation, and management of results were implemented. The software was evaluated in human observer experiments between May 2019 and March 2021, in which duration of observer responses was tracked. Fourteen radiologists evaluated and graded software usability using the 100-point system usability scale. The application was tested in Chrome, Firefox, Safari, and Edge browsers. Results: Software function was designed to allow visual grading analysis (VGA), multiple-alternative forced-choice (m-AFC), receiver operating characteristic (ROC), localization ROC, free-response ROC, and customized designs. The mean duration of reader responses per image or per image set was 6.2 seconds ± 4.8 (standard deviation), 5.8 seconds ± 4.7, 8.7 seconds ± 5.7, and 6.0 seconds ± 4.5 in four-AFC with 160 image quartets per reader, four-AFC with 640 image quartets per reader, localization ROC, and experimental studies, respectively. The mean system usability scale score was 83 ± 11 (out of 100). The documented code and a demonstration of the application are available online (https://github.com/genskeu/HON, https://hondemo.pythonanywhere.com/). Conclusion: A user-friendly and efficient open-source application was developed for human reader experiments that enables study design versatility, as well as platform-independent and multicenter usability. Y1 - 2022 U6 - https://doi.org/10.1148/radiol.211832 SN - 0033-8419 SN - 1527-1315 VL - 303 IS - 3 SP - 524 EP - 530 PB - Radiological Society of North America CY - Oak Brook, Ill. ER - TY - JOUR A1 - Bartoszewicz, Jakub M. A1 - Nasri, Ferdous A1 - Nowicka, Melania A1 - Renard, Bernhard Y. 
T1 - Detecting DNA of novel fungal pathogens using ResNets and a curated fungi-hosts data collection JF - Bioinformatics N2 - Background: Emerging pathogens are a growing threat, but large data collections and approaches for predicting the risk associated with novel agents are limited to bacteria and viruses. Pathogenic fungi, which also pose a constant threat to public health, remain understudied. Relevant data remain comparatively scarce and scattered among many different sources, hindering the development of sequencing-based detection workflows for novel fungal pathogens. No prediction method working for agents across all three groups is available, even though the cause of an infection is often difficult to identify from symptoms alone. Results: We present a curated collection of fungal host range data, comprising records on human, animal and plant pathogens, as well as other plant-associated fungi, linked to publicly available genomes. We show that it can be used to predict the pathogenic potential of novel fungal species directly from DNA sequences with either sequence homology or deep learning. We develop learned, numerical representations of the collected genomes and visualize the landscape of fungal pathogenicity. Finally, we train multi-class models predicting if next-generation sequencing reads originate from novel fungal, bacterial or viral threats. Conclusions: The neural networks trained using our data collection enable accurate detection of novel fungal pathogens. A curated set of over 1400 genomes with host and pathogenicity metadata supports training of machine-learning models and sequence comparison, not limited to the pathogen detection task. Y1 - 2022 U6 - https://doi.org/10.1093/bioinformatics/btac495 SN - 1367-4803 SN - 1367-4811 VL - 38 SP - ii168 EP - ii174 PB - Oxford Univ. 
Press CY - Oxford ER - TY - GEN A1 - Rabl, Tilmann T1 - Reminiscences on influential papers T2 - SIGMOD record N2 - When I started my PhD, I wanted to do something related to systems but I wasn't sure exactly what. I didn't consider data management systems initially, because I was unaware of the richness of the systems work that data management systems were built on. I thought the field was mainly about SQL. Luckily, that view changed quickly. Y1 - 2023 U6 - https://doi.org/10.1145/3582302.3582310 SN - 0163-5808 SN - 1943-5835 VL - 51 IS - 4 SP - 42 EP - 44 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Perscheid, Michael A1 - Plattner, Hasso A1 - Ritter, Daniel A1 - Schlosser, Rainer A1 - Teusner, Ralf T1 - Enterprise platform and integration concepts research at HPI JF - SIGMOD record N2 - The Hasso Plattner Institute (HPI), academically structured as the independent Faculty of Digital Engineering at the University of Potsdam, unites computer science research and teaching with the advantages of a privately financed institute and a tuition-free study program. Founder and namesake of the institute is the SAP co-founder Hasso Plattner, who also heads the Enterprise Platform and Integration Concepts (EPIC) research center which focuses on the technical aspects of business software with a vision to provide the fastest way to get insights out of enterprise data. Founded in 2006, the EPIC combines three research groups comprising autonomous data management, enterprise software engineering, and data-driven decision support. 
Y1 - 2023 U6 - https://doi.org/10.1145/3582302.3582322 SN - 0163-5808 SN - 1943-5835 VL - 51 IS - 4 SP - 68 EP - 73 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Casel, Katrin A1 - Fischbeck, Philipp A1 - Friedrich, Tobias A1 - Göbel, Andreas A1 - Lagodzinski, Julius Albert Gregor T1 - Zeros and approximations of Holant polynomials on the complex plane JF - Computational complexity : CC N2 - We present fully polynomial time approximation schemes for a broad class of Holant problems with complex edge weights, which we call Holant polynomials. We transform these problems into partition functions of abstract combinatorial structures known as polymers in statistical physics. Our method involves establishing zero-free regions for the partition functions of polymer models and using the most significant terms of the cluster expansion to approximate them. Results of our technique include new approximation and sampling algorithms for a diverse class of Holant polynomials in the low-temperature regime (i.e. small external field) and approximation algorithms for general Holant problems with small signature weights. Additionally, we give randomised approximation and sampling algorithms with faster running times for more restrictive classes. Finally, we improve the known zero-free regions for a perfect matching polynomial. 
KW - Holant problems KW - approximate counting KW - partition functions KW - graph KW - polynomials Y1 - 2022 U6 - https://doi.org/10.1007/s00037-022-00226-5 SN - 1016-3328 SN - 1420-8954 VL - 31 IS - 2 PB - Springer CY - Basel ER - TY - JOUR A1 - Hecker, Pascal A1 - Steckhan, Nico A1 - Eyben, Florian A1 - Schuller, Björn Wolfgang A1 - Arnrich, Bert T1 - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends JF - Frontiers in Digital Health N2 - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. 
In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance. KW - neurological disorders KW - voice KW - speech KW - everyday life KW - multiple modalities KW - machine learning KW - disorder recognition Y1 - 2022 U6 - https://doi.org/10.3389/fdgth.2022.842301 SN - 2673-253X PB - Frontiers Media SA CY - Lausanne, Schweiz ER - TY - JOUR A1 - Fehr, Jana A1 - Jaramillo-Gutierrez, Giovanna A1 - Oala, Luis A1 - Gröschel, Matthias I. A1 - Bierwirth, Manuel A1 - Balachandran, Pradeep A1 - Werneck-Leite, Alixandro A1 - Lippert, Christoph T1 - Piloting a Survey-Based Assessment of Transparency and Trustworthiness with Three Medical AI Tools JF - Healthcare N2 - Artificial intelligence (AI) offers the potential to support healthcare delivery, but poorly trained or validated algorithms bear risks of harm. Ethical guidelines stated transparency about model development and validation as a requirement for trustworthy AI. Abundant guidance exists to provide transparency through reporting, but poorly reported medical AI tools are common. To close this transparency gap, we developed and piloted a framework to quantify the transparency of medical AI tools with three use cases. Our framework comprises a survey to report on the intended use, training and validation data and processes, ethical considerations, and deployment recommendations. The transparency of each response was scored with either 0, 0.5, or 1 to reflect if the requested information was not, partially, or fully provided. 
Additionally, we assessed on an analogous three-point scale if the provided responses fulfilled the transparency requirement for a set of trustworthiness criteria from ethical guidelines. The degree of transparency and trustworthiness was calculated on a scale from 0% to 100%. Our assessment of three medical AI use cases pin-pointed reporting gaps and resulted in transparency scores of 67% for two use cases and one with 59%. We report anecdotal evidence that business constraints and limited information from external datasets were major obstacles to providing transparency for the three use cases. The observed transparency gaps also lowered the degree of trustworthiness, indicating compliance gaps with ethical guidelines. All three pilot use cases faced challenges to provide transparency about medical AI tools, but more studies are needed to investigate those in the wider medical AI sector. Applying this framework for an external assessment of transparency may be infeasible if business constraints prevent the disclosure of information. New strategies may be necessary to enable audits of medical AI tools while preserving business secrets. KW - artificial intelligence for health KW - quality assessment KW - transparency KW - trustworthiness Y1 - 2022 U6 - https://doi.org/10.3390/healthcare10101923 SN - 2227-9032 VL - 10 IS - 10 PB - MDPI CY - Basel, Schweiz ER - TY - JOUR A1 - Kirchler, Matthias A1 - Konigorski, Stefan A1 - Norden, Matthias A1 - Meltendorf, Christian A1 - Kloft, Marius A1 - Schurmann, Claudia A1 - Lippert, Christoph T1 - transferGWAS BT - GWAS of images using deep transfer learning JF - Bioinformatics N2 - Motivation: Medical images can provide rich information about diseases and their biology. However, investigating their association with genetic variation requires non-standard methods. We propose transferGWAS, a novel approach to perform genome-wide association studies directly on full medical images. 
First, we learn semantically meaningful representations of the images based on a transfer learning task, during which a deep neural network is trained on independent but similar data. Then, we perform genetic association tests with these representations. Results: We validate the type I error rates and power of transferGWAS in simulation studies of synthetic images. Then we apply transferGWAS in a genome-wide association study of retinal fundus images from the UK Biobank. This first-of-a-kind GWAS of full imaging data yielded 60 genomic regions associated with retinal fundus images, of which 7 are novel candidate loci for eye-related traits and diseases. Y1 - 2022 U6 - https://doi.org/10.1093/bioinformatics/btac369 SN - 1367-4803 SN - 1460-2059 VL - 38 IS - 14 SP - 3621 EP - 3628 PB - Oxford Univ. Press CY - Oxford ER - TY - JOUR A1 - Zenner, Alexander M. A1 - Böttinger, Erwin A1 - Konigorski, Stefan T1 - StudyMe BT - a new mobile app for user-centric N-of-1 trials JF - Trials N2 - N-of-1 trials are multi-crossover self-experiments that allow individuals to systematically evaluate the effect of interventions on their personal health goals. Although several tools for N-of-1 trials exist, there is a gap in supporting non-experts in conducting their own user-centric trials. In this study, we present StudyMe, an open-source mobile application that is freely available from https://play.google.com/store/apps/details?id=health.studyu.me and offers users flexibility and guidance in configuring every component of their trials. We also present research that informed the development of StudyMe, focusing on trial creation. Through an initial survey with 272 participants, we learned that individuals are interested in a variety of personal health aspects and have unique ideas on how to improve them. In an iterative, user-centered development process with intermediate user tests, we developed StudyMe, which features an educational part to communicate N-of-1 trial concepts. 
A final empirical evaluation of StudyMe showed that all participants were able to create their own trials successfully using StudyMe and the app achieved a very good usability rating. Our findings suggest that StudyMe provides a significant step towards enabling individuals to apply a systematic science-oriented approach to personalize health-related interventions and behavior modifications in their everyday lives. Y1 - 2022 U6 - https://doi.org/10.1186/s13063-022-06893-7 SN - 1745-6215 VL - 23 PB - BioMed Central CY - London ER - TY - JOUR A1 - Shams, Boshra A1 - Wang, Ziqian A1 - Roine, Timo A1 - Aydogan, Dogu Baran A1 - Vajkoczy, Peter A1 - Lippert, Christoph A1 - Picht, Thomas A1 - Fekonja, Lucius Samo T1 - Machine learning-based prediction of motor status in glioma patients using diffusion MRI metrics along the corticospinal tract JF - Brain communications N2 - Shams et al. report that glioma patients' motor status is predicted accurately by diffusion MRI metrics along the corticospinal tract based on support vector machine method, reaching an overall accuracy of 77%. They show that these metrics are more effective than demographic and clinical variables. Along tract statistics enables white matter characterization using various diffusion MRI metrics. These diffusion models reveal detailed insights into white matter microstructural changes with development, pathology and function. Here, we aim at assessing the clinical utility of diffusion MRI metrics along the corticospinal tract, investigating whether motor glioma patients can be classified with respect to their motor status. We retrospectively included 116 brain tumour patients suffering from either left or right supratentorial, unilateral World Health Organization Grades II, III and IV gliomas with a mean age of 53.51 +/- 16.32 years. Around 37% of patients presented with preoperative motor function deficits according to the Medical Research Council scale. 
At group level comparison, the highest non-overlapping diffusion MRI differences were detected in the superior portion of the tracts' profiles. Fractional anisotropy and fibre density decrease, apparent diffusion coefficient, axial diffusivity and radial diffusivity increase. To predict motor deficits, we developed a method based on a support vector machine using histogram-based features of diffusion MRI tract profiles (e.g. mean, standard deviation, kurtosis and skewness), following a recursive feature elimination method. Our model achieved high performance (74% sensitivity, 75% specificity, 74% overall accuracy and 77% area under the curve). We found that apparent diffusion coefficient, fractional anisotropy and radial diffusivity contributed more than other features to the model. Incorporating the patient demographics and clinical features such as age, tumour World Health Organization grade, tumour location, gender and resting motor threshold did not affect the model's performance, revealing that these features were not as effective as microstructural measures. These results shed light on the potential patterns of tumour-related microstructural white matter changes in the prediction of functional deficits. KW - machine learning KW - support vector machine KW - tractography KW - diffusion MRI KW - corticospinal tract Y1 - 2022 U6 - https://doi.org/10.1093/braincomms/fcac141 SN - 2632-1297 VL - 4 IS - 3 PB - Oxford University Press CY - Oxford ER - TY - JOUR A1 - Bilo, Davide A1 - Bilo, Vittorio A1 - Lenzner, Pascal A1 - Molitor, Louise T1 - Topological influence and locality in swap schelling games JF - Autonomous Agents and Multi-Agent Systems N2 - Residential segregation is a wide-spread phenomenon that can be observed in almost every major city. In these urban areas residents with different racial or socioeconomic background tend to form homogeneous clusters. 
Schelling's famous agent-based model for residential segregation explains how such clusters can form even if all agents are tolerant, i.e., if they agree to live in mixed neighborhoods. For segregation to occur, all it needs is a slight bias towards agents preferring similar neighbors. Very recently, Schelling's model has been investigated from a game-theoretic point of view with selfish agents that strategically select their residential location. In these games, agents can improve on their current location by performing a location swap with another agent who is willing to swap. We significantly deepen these investigations by studying the influence of the underlying topology modeling the residential area on the existence of equilibria, the Price of Anarchy and on the dynamic properties of the resulting strategic multi-agent system. Moreover, as a new conceptual contribution, we also consider the influence of locality, i.e., if the location swaps are restricted to swaps of neighboring agents. We give improved almost tight bounds on the Price of Anarchy for arbitrary underlying graphs and we present (almost) tight bounds for regular graphs, paths and cycles. Moreover, we give almost tight bounds for grids, which are commonly used in empirical studies. For grids we also show that locality has a severe impact on the game dynamics. 
KW - residential segregation KW - Schelling's segregation model KW - non-cooperative games KW - price of anarchy KW - game dynamics Y1 - 2022 U6 - https://doi.org/10.1007/s10458-022-09573-7 SN - 1387-2532 SN - 1573-7454 VL - 36 IS - 2 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Ziegler, Joceline A1 - Pfitzner, Bjarne A1 - Schulz, Heinrich A1 - Saalbach, Axel A1 - Arnrich, Bert T1 - Defending against Reconstruction Attacks through Differentially Private Federated Learning for Classification of Heterogeneous Chest X-ray Data JF - Sensors N2 - Privacy regulations and the physical distribution of heterogeneous data are often primary concerns for the development of deep learning models in a medical context. This paper evaluates the feasibility of differentially private federated learning for chest X-ray classification as a defense against data privacy attacks. To the best of our knowledge, we are the first to directly compare the impact of differentially private training on two different neural network architectures, DenseNet121 and ResNet50. Extending the federated learning environments previously analyzed in terms of privacy, we simulated a heterogeneous and imbalanced federated setting by distributing images from the public CheXpert and Mendeley chest X-ray datasets unevenly among 36 clients. Both non-private baseline models achieved an area under the receiver operating characteristic curve (AUC) of 0.94 on the binary classification task of detecting the presence of a medical finding. We demonstrate that both model architectures are vulnerable to privacy violation by applying image reconstruction attacks to local model updates from individual clients. The attack was particularly successful during later training stages. To mitigate the risk of a privacy breach, we integrated Rényi differential privacy with a Gaussian noise mechanism into local model training. We evaluate model performance and attack vulnerability for privacy budgets ε∈{1,3,6,10}. 
The DenseNet121 achieved the best utility-privacy trade-off with an AUC of 0.94 for ε=6. Model performance deteriorated slightly for individual clients compared to the non-private baseline. The ResNet50 only reached an AUC of 0.76 in the same privacy setting. Its performance was inferior to that of the DenseNet121 for all considered privacy constraints, suggesting that the DenseNet121 architecture is more robust to differentially private training. KW - federated learning KW - privacy and security KW - privacy attack KW - X-ray Y1 - 2022 U6 - https://doi.org/10.3390/s22145195 SN - 1424-8220 VL - 22 PB - MDPI CY - Basel, Schweiz ET - 14 ER - TY - JOUR A1 - Chandran, Sunil L. A1 - Issac, Davis A1 - Lauri, Juho A1 - van Leeuwen, Erik Jan T1 - Upper bounding rainbow connection number by forest number JF - Discrete mathematics N2 - A path in an edge-colored graph is rainbow if no two edges of it are colored the same, and the graph is rainbow-connected if there is a rainbow path between each pair of its vertices. The minimum number of colors needed to rainbow-connect a graph G is the rainbow connection number of G, denoted by rc(G). A simple way to rainbow-connect a graph G is to color the edges of a spanning tree with distinct colors and then re-use any of these colors to color the remaining edges of G. This proves that rc(G) <= |V(G)|-1. We ask whether there is a stronger connection between tree-like structures and rainbow coloring than is implied by the above trivial argument. For instance, is it possible to find an upper bound of t(G)-1 for rc(G), where t(G) is the number of vertices in the largest induced tree of G? 
The answer turns out to be negative, as there are counter-examples that show that even c·t(G) is not an upper bound for rc(G) for any given constant c. In this work we show that if we consider the forest number f(G), the number of vertices in a maximum induced forest of G, instead of t(G), then surprisingly we do get an upper bound. More specifically, we prove that rc(G) <= f(G) + 2. Our result indicates a stronger connection between rainbow connection and tree-like structures than was suggested by the simple spanning tree based upper bound. KW - rainbow connection KW - forest number KW - upper bound Y1 - 2022 U6 - https://doi.org/10.1016/j.disc.2022.112829 SN - 0012-365X SN - 1872-681X VL - 345 IS - 7 PB - Elsevier CY - Amsterdam [u.a.] ER - TY - JOUR A1 - Rosin, Paul L. A1 - Lai, Yu-Kun A1 - Mould, David A1 - Yi, Ran A1 - Berger, Itamar A1 - Doyle, Lars A1 - Lee, Seungyong A1 - Li, Chuan A1 - Liu, Yong-Jin A1 - Semmo, Amir A1 - Shamir, Ariel A1 - Son, Minjung A1 - Winnemöller, Holger T1 - NPRportrait 1.0: A three-level benchmark for non-photorealistic rendering of portraits JF - Computational visual media N2 - Recently, there has been an upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer (NST). However, the state of performance evaluation in this field is poor, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual, and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three-level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. 
Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and NST) using the new benchmark dataset. KW - non-photorealistic rendering (NPR) KW - image stylization KW - style transfer KW - portrait KW - evaluation KW - benchmark Y1 - 2022 U6 - https://doi.org/10.1007/s41095-021-0255-3 SN - 2096-0433 SN - 2096-0662 VL - 8 IS - 3 SP - 445 EP - 465 PB - Springer Nature CY - London ER - TY - JOUR A1 - Ruipérez-Valiente, José A. A1 - Staubitz, Thomas A1 - Jenner, Matt A1 - Halawa, Sherif A1 - Zhang, Jiayin A1 - Despujol, Ignacio A1 - Maldonado-Mahauad, Jorge A1 - Montoro, German A1 - Peffer, Melanie A1 - Rohloff, Tobias A1 - Lane, Jenny A1 - Turro, Carlos A1 - Li, Xitong A1 - Pérez-Sanagustín, Mar A1 - Reich, Justin T1 - Large scale analytics of global and regional MOOC providers: Differences in learners' demographics, preferences, and perceptions JF - Computers & education N2 - Massive Open Online Courses (MOOCs) remarkably attracted global media attention, but the spotlight has been concentrated on a handful of English-language providers. While Coursera, edX, Udacity, and FutureLearn received most of the attention and scrutiny, an entirely new ecosystem of local MOOC providers was growing in parallel. This ecosystem is harder to study than the major players: they are spread around the world, have less staff devoted to maintaining research data, and operate in multiple languages with university and corporate regional partners. To better understand how online learning opportunities are expanding through this regional MOOC ecosystem, we created a research partnership among 15 different MOOC providers from nine countries. 
We gathered data from over eight million learners in six thousand MOOCs, and we conducted a large-scale survey with more than 10 thousand participants. From our analysis, we argue that these regional providers may be better positioned to meet the goals of expanding access to higher education in their regions than the better-known global providers. To make this claim we highlight three trends: first, regional providers attract a larger local population with more inclusive demographic profiles; second, students predominantly choose their courses based on topical interest, and regional providers do a better job at catering to those needs; and third, many students feel more at ease learning from institutions they already know and have references from. Our work raises the importance of local education in the global MOOC ecosystem, while calling for additional research and conversations across the diversity of MOOC providers. KW - Learning analytics KW - Educational data mining KW - Massive open online courses KW - Large scale analytics KW - Cultural factors KW - Equity KW - Distance learning Y1 - 2022 U6 - https://doi.org/10.1016/j.compedu.2021.104426 SN - 0360-1315 SN - 1873-782X VL - 180 PB - Elsevier CY - Oxford ER - TY - JOUR A1 - Bläsius, Thomas A1 - Friedrich, Tobias A1 - Krejca, Martin S. A1 - Molitor, Louise T1 - The impact of geometry on monochrome regions in the flip Schelling process JF - Computational geometry N2 - Schelling's classical segregation model gives a coherent explanation for the wide-spread phenomenon of residential segregation. We introduce an agent-based saturated open-city variant, the Flip Schelling Process (FSP), in which agents, placed on a graph, have one out of two types and, based on the predominant type in their neighborhood, decide whether to change their types; similar to a new agent arriving as soon as another agent leaves the vertex. 
We investigate the probability that an edge {u,v} is monochrome, i.e., that both vertices u and v have the same type in the FSP, and we provide a general framework for analyzing the influence of the underlying graph topology on residential segregation. In particular, for two adjacent vertices, we show that a highly decisive common neighborhood, i.e., a common neighborhood where the absolute value of the difference between the number of vertices with different types is high, supports segregation and, moreover, that large common neighborhoods are more decisive. As an application, we study the expected behavior of the FSP on two common random graph models with and without geometry: (1) For random geometric graphs, we show that the existence of an edge {u,v} makes a highly decisive common neighborhood for u and v more likely. Based on this, we prove the existence of a constant c>0 such that the expected fraction of monochrome edges after the FSP is at least 1/2+c. (2) For Erdős–Rényi graphs we show that large common neighborhoods are unlikely and that the expected fraction of monochrome edges after the FSP is at most 1/2+o(1). Our results indicate that the cluster structure of the underlying graph has a significant impact on the obtained segregation strength. KW - Agent-based model KW - Schelling segregation KW - Spin system Y1 - 2022 U6 - https://doi.org/10.1016/j.comgeo.2022.101902 SN - 0925-7721 SN - 1879-081X VL - 108 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Dressler, Falko A1 - Chiasserini, Carla Fabiana A1 - Fitzek, Frank H. P. A1 - Karl, Holger A1 - Cigno, Renato Lo A1 - Capone, Antonio A1 - Casetti, Claudio A1 - Malandrino, Francesco A1 - Mancuso, Vincenzo A1 - Klingler, Florian A1 - Rizzo, Gianluca T1 - V-Edge BT - virtual edge computing as an enabler for novel microservices and cooperative computing JF - IEEE network N2 - As we move from 5G to 6G, edge computing is one of the concepts that needs revisiting. 
Its core idea is still intriguing: Instead of sending all data and tasks from an end user's device to the cloud, possibly covering thousands of kilometers and introducing delays lower-bounded by propagation speed, edge servers deployed in close proximity to the user (e.g., at some base station) serve as proxy for the cloud. This is particularly interesting for upcoming machine-learning-based intelligent services, which require substantial computational and networking performance for continuous model training. However, this promising idea is hampered by the limited number of such edge servers. In this article, we discuss a way forward, namely the V-Edge concept. V-Edge helps bridge the gap between cloud, edge, and fog by virtualizing all available resources including the end users' devices and making these resources widely available. Thus, V-Edge acts as an enabler for novel microservices as well as cooperative computing solutions in next-generation networks. We introduce the general V-Edge architecture, and we characterize some of the key research challenges to overcome in order to enable wide-spread and intelligent edge services. KW - Training KW - Performance evaluation KW - Cloud computing KW - Microservice KW - architectures KW - Computer architecture KW - Delays KW - Servers Y1 - 2022 U6 - https://doi.org/10.1109/MNET.001.2100491 SN - 0890-8044 SN - 1558-156X VL - 36 IS - 3 SP - 24 EP - 31 PB - Inst. of Electr. and Electronics Engineers CY - Piscataway ER - TY - JOUR A1 - Ring, Raphaela M. A1 - Eisenmann, Clemens A1 - Kandil, Farid A1 - Steckhan, Nico A1 - Demmrich, Sarah A1 - Klatte, Caroline A1 - Kessler, Christian S. A1 - Jeitler, Michael A1 - Boschmann, Michael A1 - Michalsen, Andreas A1 - Blakeslee, Sarah B. A1 - Stöckigt, Barbara A1 - Stritter, Wiebke A1 - Koppold-Liebscher, Daniela A. 
T1 - Mental and behavioural responses to Bahá’í fasting: Looking behind the scenes of a religiously motivated intermittent fast using a mixed methods approach JF - Nutrients N2 - Background/Objective: Historically, fasting has been practiced not only for medical but also for religious reasons. Baha'is follow an annual religious intermittent dry fast of 19 days. We inquired into motivation behind and subjective health impacts of Baha'i fasting. Methods: A convergent parallel mixed methods design was embedded in a clinical single arm observational study. Semi-structured individual interviews were conducted before (n = 7), during (n = 8), and after fasting (n = 8). Three months after the fasting period, two focus group interviews were conducted (n = 5/n = 3). A total of 146 Baha'i volunteers answered an online survey at five time points before, during, and after fasting. Results: Fasting was found to play a central role for the religiosity of interviewees, implying changes in daily structures, spending time alone, engaging in religious practices, and experiencing social belonging. Results show an increase in mindfulness and well-being, which were accompanied by behavioural changes and experiences of self-efficacy and inner freedom. Survey scores point to an increase in mindfulness and well-being during fasting, while stress, anxiety, and fatigue decreased. Mindfulness remained elevated even three months after the fast. Conclusion: Baha'i fasting seems to enhance participants' mindfulness and well-being, lowering stress levels and reducing fatigue. Some of these effects lasted more than three months after fasting. 
KW - intermittent food restriction KW - mindfulness KW - self-efficacy KW - well-being KW - mixed methods KW - health behaviour KW - coping ability KW - religiously motivated KW - dry fasting Y1 - 2022 U6 - https://doi.org/10.3390/nu14051038 SN - 2072-6643 VL - 14 IS - 5 PB - MDPI CY - Basel ER - TY - JOUR A1 - Wiemker, Veronika A1 - Bunova, Anna A1 - Neufeld, Maria A1 - Gornyi, Boris A1 - Yurasova, Elena A1 - Konigorski, Stefan A1 - Kalinina, Anna A1 - Kontsevaya, Anna A1 - Ferreira-Borges, Carina A1 - Probst, Charlotte T1 - Pilot study to evaluate usability and acceptability of the 'Animated Alcohol Assessment Tool' in Russian primary healthcare JF - Digital health N2 - Background and aims: Accurate and user-friendly assessment tools quantifying alcohol consumption are a prerequisite to effective prevention and treatment programmes, including Screening and Brief Intervention. Digital tools offer new potential in this field. We developed the ‘Animated Alcohol Assessment Tool’ (AAA-Tool), a mobile app providing an interactive version of the World Health Organization's Alcohol Use Disorders Identification Test (AUDIT) that facilitates the description of individual alcohol consumption via culturally informed animation features. This pilot study evaluated the Russia-specific version of the Animated Alcohol Assessment Tool with regard to (1) its usability and acceptability in a primary healthcare setting, (2) the plausibility of its alcohol consumption assessment results and (3) the adequacy of its Russia-specific vessel and beverage selection. Methods: Convenience samples of 55 patients (47% female) and 15 healthcare practitioners (80% female) in 2 Russian primary healthcare facilities self-administered the Animated Alcohol Assessment Tool and rated their experience on the Mobile Application Rating Scale – User Version. Usage data was automatically collected during app usage, and additional feedback on regional content was elicited in semi-structured interviews. 
Results: On average, patients completed the Animated Alcohol Assessment Tool in 6:38 min (SD = 2.49, range = 3.00–17.16). User satisfaction was good, with all subscale Mobile Application Rating Scale – User Version scores averaging >3 out of 5 points. A majority of patients (53%) and practitioners (93%) would recommend the tool to ‘many people’ or ‘everyone’. Assessed alcohol consumption was plausible, with a low number (14%) of logically impossible entries. Most patients reported the Animated Alcohol Assessment Tool to reflect all vessels (78%) and all beverages (71%) they typically used. Conclusion: High acceptability ratings by patients and healthcare practitioners, acceptable completion time, plausible alcohol usage assessment results and perceived adequacy of region-specific content underline the Animated Alcohol Assessment Tool's potential to provide a novel approach to alcohol assessment in primary healthcare. After its validation, the Animated Alcohol Assessment Tool might contribute to reducing alcohol-related harm by facilitating Screening and Brief Intervention implementation in Russia and beyond. KW - Alcohol use assessment KW - Alcohol Use Disorders Identification Test KW - screening tools KW - digital health KW - mobile applications KW - Russia KW - primary healthcare KW - usability KW - acceptability Y1 - 2022 U6 - https://doi.org/10.1177/20552076211074491 SN - 2055-2076 VL - 8 PB - Sage Publications CY - London ER - TY - JOUR A1 - Essen, Anna A1 - Stern, Ariel Dora A1 - Haase, Christoffer Bjerre A1 - Car, Josip A1 - Greaves, Felix A1 - Paparova, Dragana A1 - Vandeput, Steven A1 - Wehrens, Rik A1 - Bates, David W. T1 - Health app policy BT - international comparison of nine countries' approaches JF - npj digital medicine N2 - An abundant and growing supply of digital health applications (apps) exists in the commercial tech-sector, which can be bewildering for clinicians, patients, and payers. 
A growing challenge for the health care system is therefore to facilitate the identification of safe and effective apps for health care practitioners and patients to generate the most health benefit as well as guide payer coverage decisions. Nearly all developed countries are attempting to define policy frameworks to improve decision-making, patient care, and health outcomes in this context. This study compares the national policy approaches currently in development/use for health apps in nine countries. We used secondary data, combined with a detailed review of policy and regulatory documents, and interviews with key individuals and experts in the field of digital health policy to collect data about implemented and planned policies and initiatives. We found that most approaches aim for centralized pipelines for health app approvals, although some countries are adding decentralized elements. While the countries studied are taking diverse paths, there is nevertheless broad, international convergence in terms of requirements in the areas of transparency, health content, interoperability, and privacy and security. The sheer number of apps on the market in most countries represents a challenge for clinicians and patients. Our analyses of the relevant policies identified challenges in areas such as reimbursement, safety, and privacy and suggest that more regulatory work is needed in the areas of operationalization, implementation and international transferability of approvals. Cross-national efforts are needed around regulation and for countries to realize the benefits of these technologies. 
Y1 - 2022 U6 - https://doi.org/10.1038/s41746-022-00573-1 SN - 2398-6352 VL - 5 IS - 1 PB - Macmillan Publishers Limited CY - Basingstoke ER - TY - JOUR A1 - Hagedorn, Christopher A1 - Huegle, Johannes A1 - Schlosser, Rainer T1 - Understanding unforeseen production downtimes in manufacturing processes using log data-driven causal reasoning JF - Journal of intelligent manufacturing N2 - In discrete manufacturing, the knowledge about causal relationships makes it possible to avoid unforeseen production downtimes by identifying their root causes. Learning causal structures from real-world settings remains challenging due to high-dimensional data, a mix of discrete and continuous variables, and requirements for preprocessing log data under the causal perspective. In our work, we address these challenges by proposing a process for causal reasoning based on raw machine log data from production monitoring. Within this process, we define a set of transformation rules to extract independent and identically distributed observations. Further, we incorporate a variable selection step to handle high-dimensionality and a discretization step to include continuous variables. We enrich a commonly used causal structure learning algorithm with domain-related orientation rules, which provides a basis for causal reasoning. We demonstrate the process on a real-world dataset from a globally operating precision mechanical engineering company. The dataset contains over 40 million log data entries from production monitoring of a single machine. In this context, we determine the causal structures embedded in operational processes. Further, we examine causal effects to support machine operators in avoiding unforeseen production stops, i.e., by deterring machine operators from drawing false conclusions on impacting factors of unforeseen production stops based on experience. 
KW - Causal structure learning KW - Log data KW - Causal inference KW - Manufacturing KW - industry Y1 - 2022 U6 - https://doi.org/10.1007/s10845-022-01952-x SN - 0956-5515 SN - 1572-8145 VL - 33 IS - 7 SP - 2027 EP - 2043 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Ulrich, Jens-Uwe A1 - Lutfi, Ahmad A1 - Rutzen, Kilian A1 - Renard, Bernhard Y. T1 - ReadBouncer BT - precise and scalable adaptive sampling for nanopore sequencing JF - Bioinformatics N2 - Motivation: Nanopore sequencers allow targeted sequencing of interesting nucleotide sequences by rejecting other sequences from individual pores. This feature facilitates the enrichment of low-abundant sequences by depleting overrepresented ones in-silico. Existing tools for adaptive sampling either apply signal alignment, which cannot handle human-sized reference sequences, or apply read mapping in sequence space relying on fast graphical processing units (GPU) base callers for real-time read rejection. Using nanopore long-read mapping tools is also not optimal when mapping shorter reads as usually analyzed in adaptive sampling applications. Results: Here, we present a new approach for nanopore adaptive sampling that combines fast CPU and GPU base calling with read classification based on Interleaved Bloom Filters. ReadBouncer improves the potential enrichment of low abundance sequences by its high read classification sensitivity and specificity, outperforming existing tools in the field. It robustly removes even reads belonging to large reference sequences while running on commodity hardware without GPUs, making adaptive sampling accessible for in-field researchers. ReadBouncer also provides a user-friendly interface and installer files for end-users without a bioinformatics background. Y1 - 2022 U6 - https://doi.org/10.1093/bioinformatics/btac223 SN - 1367-4803 SN - 1367-4811 VL - 38 IS - SUPPL 1 SP - 153 EP - 160 PB - Oxford Univ. 
Press CY - Oxford ER - TY - JOUR A1 - Hiort, Pauline A1 - Schlaffner, Christoph N. A1 - Steen, Judith A. A1 - Renard, Bernhard Y. A1 - Steen, Hanno T1 - multiFLEX-LF: a computational approach to quantify the modification stoichiometries in label-free proteomics data sets JF - Journal of proteome research N2 - In liquid-chromatography-tandem-mass-spectrometry-based proteomics, information about the presence and stoichiometry of protein modifications is not readily available. To overcome this problem, we developed multiFLEX-LF, a computational tool that builds upon FLEXIQuant, which detects modified peptide precursors and quantifies their modification extent by monitoring the differences between observed and expected intensities of the unmodified precursors. multiFLEX-LF relies on robust linear regression to calculate the modification extent of a given precursor relative to a within-study reference. multiFLEX-LF can analyze entire label-free discovery proteomics data sets in a precursor-centric manner without preselecting a protein of interest. To analyze modification dynamics and coregulated modifications, we hierarchically clustered the precursors of all proteins based on their computed relative modification scores. We applied multiFLEX-LF to a data-independent-acquisition-based data set acquired using the anaphase-promoting complex/cyclosome (APC/C) isolated at various time points during mitosis. The clustering of the precursors allows for identifying varying modification dynamics and ordering the modification events. Overall, multiFLEX-LF enables the fast identification of potentially differentially modified peptide precursors and the quantification of their differential modification extent in large data sets using a personal computer. Additionally, multiFLEX-LF can drive the large-scale investigation of the modification dynamics of peptide precursors in time-series and case-control studies. multiFLEX-LF is available at https://gitlab.com/SteenOmicsLab/multiflex-lf. 
KW - bioinformatics tool KW - label-free quantification KW - LC-MS KW - MS KW - post-translational modification KW - modification stoichiometry KW - PTM KW - quantification Y1 - 2022 U6 - https://doi.org/10.1021/acs.jproteome.1c00669 SN - 1535-3893 SN - 1535-3907 VL - 21 IS - 4 SP - 899 EP - 909 PB - American Chemical Society CY - Washington ER - TY - JOUR A1 - Wittig, Alice A1 - Miranda, Fabio Malcher A1 - Hölzer, Martin A1 - Altenburg, Tom A1 - Bartoszewicz, Jakub Maciej A1 - Beyvers, Sebastian A1 - Dieckmann, Marius Alfred A1 - Genske, Ulrich A1 - Giese, Sven Hans-Joachim A1 - Nowicka, Melania A1 - Richard, Hugues A1 - Schiebenhoefer, Henning A1 - Schmachtenberg, Anna-Juliane A1 - Sieben, Paul A1 - Tang, Ming A1 - Tembrockhaus, Julius A1 - Renard, Bernhard Y. A1 - Fuchs, Stephan T1 - CovRadar BT - continuously tracking and filtering SARS-CoV-2 mutations for genomic surveillance JF - Bioinformatics N2 - The ongoing pandemic caused by SARS-CoV-2 emphasizes the importance of genomic surveillance to understand the evolution of the virus, to monitor the viral population, and to plan epidemiological responses. Detailed analysis, easy visualization and intuitive filtering of the latest viral sequences are powerful for this purpose. We present CovRadar, a tool for genomic surveillance of the SARS-CoV-2 Spike protein. CovRadar consists of an analytical pipeline and a web application that enable the analysis and visualization of hundreds of thousands of sequences. First, CovRadar extracts the regions of interest using local alignment, then builds a multiple sequence alignment, infers variants and consensus and finally presents the results in an interactive app, making accessing and reporting simple, flexible and fast. Y1 - 2022 U6 - https://doi.org/10.1093/bioinformatics/btac411 SN - 1367-4803 SN - 1367-4811 VL - 38 IS - 17 SP - 4223 EP - 4225 PB - Oxford Univ. 
Press CY - Oxford ER - TY - JOUR A1 - Pawlitzki, Marc A1 - Acar, Laura A1 - Masanneck, Lars A1 - Willison, Alice A1 - Regner-Nelke, Liesa A1 - Nelke, Christopher A1 - L'hoest, Helmut A1 - Marschall, Ursula A1 - Schmidt, Jens A1 - Meuth, Sven G. A1 - Ruck, Tobias T1 - Myositis in Germany: epidemiological insights over 15 years from 2005 to 2019 JF - Neurological research and practice : official journal of the German Neurological Society N2 - Background: The medical care of patients with myositis is a great challenge in clinical practice. This is due to the rarity of these diseases, the complexity of diagnosis and management as well as the lack of systematic analyses. Objectives: Therefore, the aim of this project was to obtain an overview of the current care of myositis patients in Germany and to evaluate epidemiological trends in recent years. Methods: In collaboration with BARMER Insurance, retrospective analysis of outpatient and inpatient data from an average of approximately 8.7 million insured patients between January 2005 and December 2019 was performed using ICD-10 codes for myositis for identification of relevant data. In addition, a comparative analysis was performed between myositis patients and an age-matched comparison group from other populations insured by BARMER. Results: 45,800 BARMER-insured individuals received a diagnosis of myositis during the observation period, with a relatively stable prevalence throughout. With regard to comorbidities, a significantly higher rate of cardiovascular disease as well as neoplasm was observed compared to the control group within the BARMER-insured population. In addition, myositis patients suffer more frequently from psychiatric disorders, such as depression and somatoform disorders. However, the ICD-10 catalogue only includes the specific coding of "dermatomyositis" and "polymyositis" and thus does not allow for a sufficient analysis of all idiopathic inflammatory myopathies subtypes. 
Conclusion: The current data provide a comprehensive epidemiological analysis of myositis in Germany, highlighting the multimorbidity of myositis patients. This underlines the need for multidisciplinary management. However, the ICD-10 codes currently still in use do not allow for specific analysis of the subtypes of myositis. The upcoming ICD-11 coding may improve future analyses in this regard. Y1 - 2022 U6 - https://doi.org/10.1186/s42466-022-00226-4 SN - 2524-3489 VL - 4 IS - 1 PB - BioMed Central CY - London ER - TY - JOUR A1 - Graf, Martin A1 - Laskowski, Lukas A1 - Papsdorf, Florian A1 - Sold, Florian A1 - Gremmelspacher, Roland A1 - Naumann, Felix A1 - Panse, Fabian T1 - Frost: a platform for benchmarking and exploring data matching results JF - Proceedings of the VLDB Endowment N2 - "Bad" data has a direct impact on 88% of companies, with the average company losing 12% of its revenue due to it. Duplicates - multiple but different representations of the same real-world entities - are among the main reasons for poor data quality, so finding and configuring the right deduplication solution is essential. Existing data matching benchmarks focus on the quality of matching results and neglect other important factors, such as business requirements. Additionally, they often do not support the exploration of data matching results. To address this gap between the mere counting of record pairs vs. a comprehensive means to evaluate data matching solutions, we present the Frost platform. It combines existing benchmarks, established quality metrics, cost and effort metrics, and exploration techniques, making it the first platform to allow systematic exploration to understand matching results. Frost is implemented and published in the open-source application Snowman, which includes the visual exploration of matching results, as shown in Figure 1. 
Y1 - 2022 U6 - https://doi.org/10.14778/3554821.3554823 SN - 2150-8097 VL - 15 IS - 12 SP - 3292 EP - 3305 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Monti, Remo A1 - Rautenstrauch, Pia A1 - Ghanbari, Mahsa A1 - James, Alva Rani A1 - Kirchler, Matthias A1 - Ohler, Uwe A1 - Konigorski, Stefan A1 - Lippert, Christoph T1 - Identifying interpretable gene-biomarker associations with functionally informed kernel-based tests in 190,000 exomes JF - Nature Communications N2 - Here we present an exome-wide rare genetic variant association study for 30 blood biomarkers in 191,971 individuals in the UK Biobank. We compare gene-based association tests for separate functional variant categories to increase interpretability and identify 193 significant gene-biomarker associations. Genes associated with biomarkers were ~4.5-fold enriched for conferring Mendelian disorders. In addition to performing weighted gene-based variant collapsing tests, we design and apply variant-category-specific kernel-based tests that integrate quantitative functional variant effect predictions for missense variants, splicing and the binding of RNA-binding proteins. For these tests, we present a computationally efficient combination of the likelihood-ratio and score tests that found 36% more associations than the score test alone while also controlling the type-1 error. Kernel-based tests identified 13% more associations than their gene-based collapsing counterparts and had advantages in the presence of gain of function missense variants. We introduce local collapsing by amino acid position for missense variants and use it to interpret associations and identify potential novel gain of function variants in PIEZO1. Our results show the benefits of investigating different functional mechanisms when performing rare-variant association tests, and demonstrate pervasive rare-variant contribution to biomarker variability. 
Y1 - 2022 U6 - https://doi.org/10.1038/s41467-022-32864-2 SN - 2041-1723 VL - 13 PB - Nature Publishing Group UK CY - London ER - TY - CHAP A1 - Hiort, Pauline A1 - Hugo, Julian A1 - Zeinert, Justus A1 - Müller, Nataniel A1 - Kashyap, Spoorthi A1 - Rajapakse, Jagath C. A1 - Azuaje, Francisco A1 - Renard, Bernhard Y. A1 - Baum, Katharina T1 - DrDimont: explainable drug response prediction from differential analysis of multi-omics networks T2 - Bioinformatics N2 - Motivation: While it has been well established that drugs affect and help patients differently, personalized drug response predictions remain challenging. Solutions based on single omics measurements have been proposed, and networks provide means to incorporate molecular interactions into reasoning. However, how to integrate the wealth of information contained in multiple omics layers still poses a complex problem. Results: We present DrDimont, Drug response prediction from Differential analysis of multi-omics networks. It allows for comparative conclusions between two conditions and translates them into differential drug response predictions. DrDimont focuses on molecular interactions. It establishes condition-specific networks from correlation within an omics layer that are then reduced and combined into heterogeneous, multi-omics molecular networks. A novel semi-local, path-based integration step ensures integrative conclusions. Differential predictions are derived from comparing the condition-specific integrated networks. DrDimont's predictions are explainable, i.e. molecular differences that are the source of high differential drug scores can be retrieved. We predict differential drug response in breast cancer using transcriptomics, proteomics, phosphosite and metabolomics measurements and contrast estrogen receptor positive and receptor negative patients. 
DrDimont performs better than drug prediction based on differential protein expression or PageRank when evaluating it on ground truth data from cancer cell lines. We find proteomic and phosphosite layers to carry most information for distinguishing drug response. Y1 - 2022 U6 - https://doi.org/10.1093/bioinformatics/btac477 SN - 1367-4803 SN - 1367-4811 VL - 38 SP - ii113 EP - ii119 PB - Oxford Univ. Press CY - Oxford ER - TY - JOUR A1 - Hacker, Philipp A1 - Naumann, Felix A1 - Friedrich, Tobias A1 - Grundmann, Stefan A1 - Lehmann, Anja A1 - Zech, Herbert T1 - AI compliance - challenges of bridging data science and law JF - Journal of Data and Information Quality (JDIQ) N2 - This vision article outlines the main building blocks of what we term AI Compliance, an effort to bridge two complementary research areas: computer science and the law. Such research has the goal to model, measure, and affect the quality of AI artifacts, such as data, models, and applications, to then facilitate adherence to legal standards. KW - AI Act KW - compliance KW - liability KW - privacy KW - transparency KW - information quality Y1 - 2022 U6 - https://doi.org/10.1145/3531532 SN - 1936-1955 SN - 1936-1963 VL - 14 IS - 3 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Richly, Keven A1 - Schlosser, Rainer A1 - Boissier, Martin T1 - Budget-conscious fine-grained configuration optimization for spatio-temporal applications JF - Proceedings of the VLDB Endowment N2 - Based on the performance requirements of modern spatio-temporal data mining applications, in-memory database systems are often used to store and process the data. To efficiently utilize the scarce DRAM capacities, modern database systems support various tuning possibilities to reduce the memory footprint (e.g., data compression) or increase performance (e.g., additional indexes). 
However, the selection of cost and performance balancing configurations is challenging due to the vast number of possible setups consisting of mutually dependent individual decisions. In this paper, we introduce a novel approach to jointly optimize the compression, sorting, indexing, and tiering configuration for spatio-temporal workloads. Further, we consider horizontal data partitioning, which enables the independent application of different tuning options on a fine-grained level. We propose different linear programming (LP) models addressing cost dependencies at different levels of accuracy to compute optimized tuning configurations for a given workload and memory budgets. To yield maintainable and robust configurations, we extend our LP-based approach to incorporate reconfiguration costs as well as a worst-case optimization for potential workload scenarios. Further, we demonstrate on a real-world dataset that our models allow to significantly reduce the memory footprint with equal performance or increase the performance with equal memory size compared to existing tuning heuristics. KW - General Earth and Planetary Sciences KW - Water Science and Technology KW - Geography, Planning and Development Y1 - 2022 U6 - https://doi.org/10.14778/3565838.3565858 SN - 2150-8097 VL - 15 IS - 13 SP - 4079 EP - 4092 PB - Association for Computing Machinery (ACM) CY - [New York] ER - TY - JOUR A1 - Tausch, Simon H. A1 - Loka, Tobias P. A1 - Schulze, Jakob M. A1 - Andrusch, Andreas A1 - Klenner, Jeanette A1 - Dabrowski, Piotr Wojciech A1 - Lindner, Martin S. A1 - Nitsche, Andreas A1 - Renard, Bernhard Y. T1 - PathoLive-real-time pathogen identification from metagenomic illumina datasets JF - Life N2 - Over the past years, NGS has become a crucial workhorse for open-view pathogen diagnostics. Yet, long turnaround times result from using massively parallel high-throughput technologies as the analysis can only be performed after sequencing has finished. 
The interpretation of results can further be challenged by contaminations, clinically irrelevant sequences, and the sheer amount and complexity of the data. We implemented PathoLive, a real-time diagnostics pipeline for the detection of pathogens from clinical samples hours before sequencing has finished. Based on real-time alignment with HiLive2, mappings are scored with respect to common contaminations, low-entropy areas, and sequences of widespread, non-pathogenic organisms. The results are visualized using an interactive taxonomic tree that provides an easily interpretable overview of the relevance of hits. For a human plasma sample that was spiked in vitro with six pathogenic viruses, all agents were clearly detected after only 40 of 200 sequencing cycles. For a real-world sample from Sudan, the results correctly indicated the presence of Crimean-Congo hemorrhagic fever virus. In a second real-world dataset from the 2019 SARS-CoV-2 outbreak in Wuhan, we found the presence of a SARS coronavirus as the most relevant hit without the novel virus reference genome being included in the database. For all samples, clinically irrelevant hits were correctly de-emphasized. Our approach is valuable to obtain fast and accurate NGS-based pathogen identifications and correctly prioritize and visualize them based on their clinical significance: PathoLive is open source and available on GitLab and BioConda. KW - NGS KW - metagenomics KW - viruses KW - infectious diseases KW - diagnostics KW - live sequencing Y1 - 2022 U6 - https://doi.org/10.3390/life12091345 SN - 2075-1729 VL - 12 IS - 9 PB - MDPI CY - Basel ER - TY - JOUR A1 - Masanneck, Lars A1 - Rolfes, Leoni A1 - Regner-Nelke, Liesa A1 - Willison, Alice A1 - Räuber, Saskia A1 - Steffen, Falk A1 - Bittner, Stefan A1 - Zipp, Frauke A1 - Albrecht, Philipp A1 - Ruck, Tobias A1 - Hartung, Hans-Peter A1 - Meuth, Sven G. 
A1 - Pawlitzki, Marc T1 - Detecting ongoing disease activity in mildly affected multiple sclerosis patients under first-line therapies JF - Multiple Sclerosis and Related Disorders N2 - Background: The current range of disease-modifying treatments (DMTs) for relapsing-remitting multiple sclerosis (RRMS) has placed more importance on the accurate monitoring of disease progression for timely and appropriate treatment decisions. With a rising number of measurements for disease progression, it is currently unclear how well these measurements or combinations of them can monitor more mildly affected RRMS patients. Objectives: To investigate several composite measures for monitoring disease activity and their potential relation to the biomarker neurofilament light chain (NfL) in a clearly defined early RRMS patient cohort with a milder disease course. Methods: From a total of 301 RRMS patients, a subset of 46 patients being treated with a continuous first-line therapy was analyzed for loss of no evidence of disease activity (lo-NEDA-3) status, relapse-associated worsening (RAW) and progression independent of relapse activity (PIRA), up to seven years after treatment initialization. Kaplan-Meier estimates were used for time-to-event analysis. Additionally, a Cox regression model was used to analyze the effect of NfL levels on outcome measures in this cohort. Results: In this mildly affected cohort, both lo-NEDA-3 and PIRA frequently occurred over a median observational period of 67.2 months and were observed in 39 (84.8%) and 23 (50.0%) patients, respectively. Additionally, 12 out of 26 PIRA manifestations (46.2%) were observed without a corresponding lo-NEDA-3 status. Jointly, either PIRA or lo-NEDA-3 showed disease activity in all patients followed up for at least the median duration (67.2 months). NfL values demonstrated an association with the occurrence of relapses and RAW. 
Conclusion: The complementary use of different disease progression measures helps mirror ongoing disease activity in mildly affected early RRMS patients being treated with continuous first-line therapy. KW - relapsing-remitting multiple sclerosis KW - neurofilament light chain KW - PIRA KW - NEDA KW - RAW KW - early MS KW - disease activity measurements KW - biomarker Y1 - 2022 U6 - https://doi.org/10.1016/j.msard.2022.103927 SN - 2211-0348 SN - 2211-0356 VL - 63 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Heyne, Henrike O. T1 - Polygenic risk scores in epilepsy JF - Medizinische Genetik N2 - An epilepsy diagnosis has large consequences for an individual but is often difficult to make in clinical practice. Novel biomarkers are thus greatly needed. Here, we give an overview of how thousands of common genetic factors that increase the risk for epilepsy can be summarized as epilepsy polygenic risk scores (PRS). We discuss the current state of research on how epilepsy PRS can serve as a biomarker for the risk for epilepsy. The high heritability of common forms of epilepsy, particularly genetic generalized epilepsy, indicates a promising potential for epilepsy PRS in diagnosis and risk prediction. Small sample sizes and low ancestral diversity of current epilepsy genome-wide association studies show, however, a need for larger and more diverse studies before epilepsy PRS could be properly implemented in the clinic. KW - epilepsy KW - genome-wide association study KW - complex disease KW - polygenic score KW - risk prediction Y1 - 2022 U6 - https://doi.org/10.1515/medgen-2022-2146 SN - 0936-5931 SN - 1863-5490 VL - 34 IS - 3 SP - 225 EP - 230 PB - De Gruyter CY - Berlin ER - TY - JOUR A1 - Torous, John A1 - Stern, Ariel D. A1 - Bourgeois, Florence T. 
T1 - Regulatory considerations to keep pace with innovation in digital health products JF - npj digital medicine N2 - Rapid innovation and proliferation of software as a medical device have accelerated the clinical use of digital technologies across a wide array of medical conditions. Current regulatory pathways were developed for traditional (hardware) medical devices and offer a useful structure, but the evolution of digital devices requires concomitant innovation in regulatory approaches to maximize the potential benefits of these emerging technologies. A number of specific adaptations could strengthen current regulatory oversight while promoting ongoing innovation. Y1 - 2022 U6 - https://doi.org/10.1038/s41746-022-00668-9 SN - 2398-6352 VL - 5 IS - 1 PB - Macmillan Publishers Limited CY - Basingstoke ER - TY - JOUR A1 - Figge, Frank A1 - Dimitrov, Stanko A1 - Schlosser, Rainer A1 - Chenavaz, Regis T1 - Does the circular economy fuel the throwaway society? The role of opportunity costs for products that lose value over time JF - Journal of cleaner production N2 - The efficient use of natural resources is considered a necessary condition for their sustainable use. Extending the lifetime of products and using resources circularly are two popular strategies to increase the efficiency of resource use. Both strategies are usually assumed to contribute to the eco-efficiency of resource use independently. We argue that a move to a circular economy creates opportunity costs for consumers holding on to their products, due to the resource embedded in the product. Assuming rational consumers, we develop a model that determines optimal replacement times for products subject to minimizing average costs over time. We find that in a perfectly circular economy, consumers are incentivized to discard their products more quickly than in a perfectly linear economy. 
A direct consequence of our finding is that extending product use is in direct conflict with closing resource loops in the circular economy. We identify the salvage value of discarded products and technical progress as two factors that determine the impact that closing resource loops has on the duration of product use. The article highlights the risk that closing resource loops and moving to a more circular economy incentivizes more unsustainable behavior. KW - circular economy KW - opportunity cost KW - eco-efficiency KW - obsolescence KW - economic obsolescence Y1 - 2022 U6 - https://doi.org/10.1016/j.jclepro.2022.133207 SN - 0959-6526 SN - 1879-1786 VL - 368 PB - Elsevier CY - Oxford ER - TY - JOUR A1 - Hartmann, Anika M. A1 - Dell'Oro, Melanie A1 - Spoo, Michaela A1 - Fischer, Jan Moritz A1 - Steckhan, Nico A1 - Jeitler, Michael A1 - Häupl, Thomas A1 - Kandil, Farid A1 - Michalsen, Andreas A1 - Koppold-Liebscher, Daniela A. A1 - Kessler, Christian S. T1 - To eat or not to eat-an exploratory randomized controlled trial on fasting and plant-based diet in rheumatoid arthritis (NutriFast-Study) JF - Frontiers in nutrition N2 - Background: Fasting is beneficial in many diseases, including rheumatoid arthritis (RA), with lasting effects for up to 1 year. However, existing data dates back several decades before the introduction of modern therapeutic modalities. Objective: This exploratory RCT compares the effects of a 7-day fast followed by a plant-based diet (PBD) to the effects of the dietary recommendations of the German society for nutrition (Deutsche Gesellschaft für Ernährung, DGE) on RA disease activity, cardiovascular (CV) risk factors, and well-being. Methods: In this RCT we randomly assigned 53 RA patients to either a 7-day fast followed by an 11-week PBD or a 12-week standard DGE diet. The primary endpoint was the group change from baseline to 12 weeks on the Health Assessment Questionnaire Disability Index (HAQ-DI). 
Further outcomes included other disease activity scores, body composition, and quality of life. Results: Of 53 RA patients enrolled, 50 participants (25 per group) completed the trial and were included in the per-protocol analysis. The primary endpoint was not statistically significant. However, HAQ-DI improved rapidly in the fasting group by day 7 and remained stable over 12 weeks (Δ-0.29, p = 0.001), while the DGE group improved later at 6 and 12 weeks (Δ-0.23, p = 0.032). DAS28 ameliorated in both groups by week 12 (Δ-0.97, p < 0.001 and Δ-1.14, p < 0.001; respectively), with 9 patients in the fasting but only 3 in the DGE group achieving ACR50 or higher. CV risk factors including weight improved more strongly in the fasting group than in the DGE group (Δ-3.9 kg, p < 0.001 and Δ-0.7 kg, p = 0.146). Conclusions: Compared with a guideline-based anti-inflammatory diet, fasting followed by a plant-based diet showed no benefit in terms of function and disability after 12 weeks. Both dietary approaches had a positive effect on RA disease activity and cardiovascular risk factors in patients with RA. Clinical trial registration: https://clinicaltrials.gov/ct2/show/NCT03856190, identifier: NCT03856190. KW - rheumatoid arthritis KW - fasting KW - caloric restriction KW - plant-based diet KW - inflammation Y1 - 2022 U6 - https://doi.org/10.3389/fnut.2022.1030380 SN - 2296-861X VL - 9 PB - Frontiers Media CY - Lausanne ER - TY - JOUR A1 - Schneider, Sven A1 - Maximova, Maria A1 - Sakizloglou, Lucas A1 - Giese, Holger T1 - Formal testing of timed graph transformation systems using metric temporal graph logic JF - International journal on software tools for technology transfer N2 - Embedded real-time systems generate state sequences where time elapses between state changes. Ensuring that such systems adhere to a provided specification of admissible or desired behavior is essential. Formal model-based testing is often a suitable cost-effective approach. 
We introduce an extended version of the formalism of symbolic graphs, which encompasses types as well as attributes, for representing states of dynamic systems. Relying on this extension of symbolic graphs, we present a novel formalism of timed graph transformation systems (TGTSs) that supports the model-based development of dynamic real-time systems at an abstract level where possible state changes and delays are specified by graph transformation rules. We then introduce an extended form of the metric temporal graph logic (MTGL) with increased expressiveness to improve the applicability of MTGL for the specification of timed graph sequences generated by a TGTS. Based on the metric temporal operators of MTGL and its built-in graph binding mechanics, we express properties on the structure and attributes of graphs as well as on the occurrence of graphs over time that are related by their inner structure. We provide formal support for checking whether a single generated timed graph sequence adheres to a provided MTGL specification. Relying on this logical foundation, we develop a testing framework for TGTSs that are specified using MTGL. Lastly, we apply this testing framework to a running example by using our prototypical implementation in the tool AutoGraph. KW - formal testing KW - typed attributed symbolic graphs KW - timed graph KW - transformation KW - graph conditions KW - metric temporal graph logic Y1 - 2021 U6 - https://doi.org/10.1007/s10009-020-00585-w SN - 1433-2779 SN - 1433-2787 VL - 23 IS - 3 SP - 411 EP - 488 PB - Springer CY - Heidelberg ER - TY - JOUR A1 - Ladleif, Jan A1 - Weske, Mathias T1 - Which event happened first? BT - Deferred choice on blockchain using oracles JF - Frontiers in blockchain N2 - First come, first served: Critical choices between alternative actions are often made based on events external to an organization, and reacting promptly to their occurrence can be a major advantage over the competition. 
In Business Process Management (BPM), such deferred choices can be expressed in process models, and they are an important aspect of process engines. Blockchain-based process execution approaches are no exception to this, but are severely limited by the inherent properties of the platform: The isolated environment prevents direct access to external entities and data, and the non-continual runtime based entirely on atomic transactions impedes the monitoring and detection of events. In this paper we provide an in-depth examination of the semantics of deferred choice, and transfer them to environments such as the blockchain. We introduce and compare several oracle architectures able to satisfy certain requirements, and show that they can be implemented using state-of-the-art blockchain technology. KW - business processes KW - business process management KW - deferred choice KW - workflow patterns KW - blockchain KW - smart contracts KW - oracles KW - formal semantics Y1 - 2021 U6 - https://doi.org/10.3389/fbloc.2021.758169 SN - 2624-7852 VL - 4 SP - 1 EP - 16 PB - Frontiers in Blockchain CY - Lausanne, Schweiz ER - TY - JOUR A1 - Aa, Han van der A1 - Rebmann, Adrian A1 - Leopold, Henrik T1 - Natural language-based detection of semantic execution anomalies in event logs JF - Information systems : IS ; an international journal ; data bases N2 - Anomaly detection in process mining aims to recognize outlying or unexpected behavior in event logs for purposes such as the removal of noise and identification of conformance violations. Existing techniques for this task are primarily frequency-based, arguing that behavior is anomalous because it is uncommon. However, such techniques ignore the semantics of recorded events and, therefore, do not take the meaning of potential anomalies into consideration. 
In this work, we overcome this caveat and focus on the detection of anomalies from a semantic perspective, arguing that anomalies can be recognized when process behavior does not make sense. To achieve this, we propose an approach that exploits the natural language associated with events. Our key idea is to detect anomalous process behavior by identifying semantically inconsistent execution patterns. To detect such patterns, we first automatically extract business objects and actions from the textual labels of events. We then compare these against a process-independent knowledge base. By populating this knowledge base with patterns from various kinds of resources, our approach can be used in a range of contexts and domains. We demonstrate the capability of our approach to successfully detect semantic execution anomalies through an evaluation based on a set of real-world and synthetic event logs and show the complementary nature of semantics-based anomaly detection to existing frequency-based techniques. KW - Process mining KW - Natural language processing KW - Anomaly detection Y1 - 2021 U6 - https://doi.org/10.1016/j.is.2021.101824 SN - 0306-4379 SN - 1873-6076 VL - 102 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Belaid, Mohamed Karim A1 - Rabus, Maximilian A1 - Krestel, Ralf T1 - CrashNet BT - an encoder-decoder architecture to predict crash test outcomes JF - Data mining and knowledge discovery N2 - Destructive car crash tests are an elaborate, time-consuming, and expensive necessity of the automotive development process. Today, finite element method (FEM) simulations are used to reduce costs by simulating car crashes computationally. We propose CrashNet, an encoder-decoder deep neural network architecture that reduces costs further and models specific outcomes of car crashes very accurately. We achieve this by formulating car crash events as time series prediction enriched with a set of scalar features. 
Traditional sequence-to-sequence models are usually composed of convolutional neural network (CNN) and CNN transpose layers. We propose to concatenate those with an MLP capable of learning how to inject the given scalars into the output time series. In addition, we replace the CNN transpose with 2D CNN transpose layers in order to force the model to process the hidden state of the set of scalars as one time series. The proposed CrashNet model can be trained efficiently and is able to process scalars and time series as input in order to infer the results of crash tests. CrashNet produces results faster and at a lower cost compared to destructive tests and FEM simulations. Moreover, it represents a novel approach in the car safety management domain. KW - Predictive models KW - Time series analysis KW - Supervised deep neural networks KW - Car safety management Y1 - 2021 U6 - https://doi.org/10.1007/s10618-021-00761-9 SN - 1384-5810 SN - 1573-756X VL - 35 IS - 4 SP - 1688 EP - 1709 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Krestel, Ralf A1 - Chikkamath, Renukswamy A1 - Hewel, Christoph A1 - Risch, Julian T1 - A survey on deep learning for patent analysis JF - World patent information N2 - Patent document collections are an immense source of knowledge for research and innovation communities worldwide. The rapid growth of the number of patent documents poses an enormous challenge for retrieving and analyzing information from this source in an effective manner. Based on deep learning methods for natural language processing, novel approaches have been developed in the field of patent analysis. The goal of these approaches is to reduce costs by automating tasks that previously only domain experts could solve. In this article, we provide a comprehensive survey of the application of deep learning for patent analysis. We summarize the state-of-the-art techniques and describe how they are applied to various tasks in the patent domain. 
In a detailed discussion, we categorize 40 papers based on the dataset, the representation, and the deep learning architecture that were used, as well as the patent analysis task that was targeted. With our survey, we aim to foster future research at the intersection of patent analysis and deep learning and we conclude by listing promising paths for future work. KW - deep learning KW - patent analysis KW - text mining KW - natural language processing Y1 - 2021 U6 - https://doi.org/10.1016/j.wpi.2021.102035 SN - 0172-2190 SN - 1874-690X VL - 65 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Schlosser, Rainer A1 - Chenavaz, Régis Y. A1 - Dimitrov, Stanko T1 - Circular economy BT - joint dynamic pricing and recycling investments JF - International journal of production economics N2 - In a circular economy, the use of recycled resources in production is a key performance indicator for management. Yet, academic studies are still unable to inform managers on appropriate recycling and pricing policies. We develop an optimal control model integrating a firm's recycling rate, which can use both virgin and recycled resources in the production process. Our model accounts for recycling influence on both the supply and demand sides. The positive effect of a firm's use of recycled resources diminishes over time but may increase through investments. Using general formulations for demand and cost, we analytically examine joint dynamic pricing and recycling investment policies in order to determine their optimal interplay over time. We provide numerical experiments to assess the existence of a steady state and to calculate sensitivity analyses with respect to various model parameters. The analysis shows how to dynamically adapt jointly optimized controls to reach sustainability in the production process. Our results pave the way to sounder sustainable practices for firms operating within a circular economy. 
KW - Dynamic pricing KW - Recycling investments KW - Optimal control KW - General demand function KW - Circular economy Y1 - 2021 U6 - https://doi.org/10.1016/j.ijpe.2021.108117 SN - 0925-5273 SN - 1873-7579 VL - 236 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Perscheid, Cindy T1 - Comprior BT - Facilitating the implementation and automated benchmarking of prior knowledge-based feature selection approaches on gene expression data sets JF - BMC Bioinformatics N2 - Background Reproducible benchmarking is important for assessing the effectiveness of novel feature selection approaches applied on gene expression data, especially for prior knowledge approaches that incorporate biological information from online knowledge bases. However, no full-fledged benchmarking system exists that is extensible, provides built-in feature selection approaches, and a comprehensive result assessment encompassing classification performance, robustness, and biological relevance. Moreover, the particular needs of prior knowledge feature selection approaches, i.e. uniform access to knowledge bases, are not addressed. As a consequence, prior knowledge approaches are not evaluated amongst each other, leaving open questions regarding their effectiveness. Results We present the Comprior benchmark tool, which facilitates the rapid development and effortless benchmarking of feature selection approaches, with a special focus on prior knowledge approaches. Comprior is extensible by custom approaches, offers built-in standard feature selection approaches, enables uniform access to multiple knowledge bases, and provides a customizable evaluation infrastructure to compare multiple feature selection approaches regarding their classification performance, robustness, runtime, and biological relevance. 
Conclusion Comprior allows reproducible benchmarking especially of prior knowledge approaches, which facilitates their applicability and for the first time enables a comprehensive assessment of their effectiveness. KW - Feature selection KW - Prior knowledge KW - Gene expression KW - Reproducible benchmarking Y1 - 2021 U6 - https://doi.org/10.1186/s12859-021-04308-z SN - 1471-2105 VL - 22 SP - 1 EP - 15 PB - Springer Nature CY - London ER - TY - JOUR A1 - Navarro, Marisa A1 - Orejas, Fernando A1 - Pino, Elvira A1 - Lambers, Leen T1 - A navigational logic for reasoning about graph properties JF - Journal of logical and algebraic methods in programming N2 - Graphs play an important role in many areas of Computer Science. In particular, our work is motivated by model-driven software development and by graph databases. For this reason, it is very important to have the means to express and to reason about the properties that a given graph may satisfy. With this aim, in this paper we present a visual logic that allows us to describe graph properties, including navigational properties, i.e., properties about the paths in a graph. The logic is equipped with a deductive tableau method that we have proved to be sound and complete. KW - Graph logic KW - Algebraic methods KW - Formal modelling KW - Specification Y1 - 2021 U6 - https://doi.org/10.1016/j.jlamp.2020.100616 SN - 2352-2208 SN - 2352-2216 VL - 118 PB - Elsevier Science CY - Amsterdam [u.a.] ER - TY - JOUR A1 - Hölzle, Katharina A1 - Björk, Jennie A1 - Boer, Harry T1 - Light at the end of the tunnel JF - Creativity and innovation management Y1 - 2021 U6 - https://doi.org/10.1111/caim.12427 SN - 0963-1690 SN - 1467-8691 VL - 30 IS - 1 SP - 3 EP - 5 PB - Wiley-Blackwell CY - Oxford [u.a.] 
ER - TY - JOUR A1 - Gamage, Dilrukshi A1 - Staubitz, Thomas A1 - Whiting, Mark T1 - Peer assessment in MOOCs BT - Systematic literature review JF - Distance education N2 - We report on a systematic review of the landscape of peer assessment in massive open online courses (MOOCs) with papers from 2014 to 2020 in 20 leading education technology publication venues across four databases containing education technology-related papers, addressing three research issues: the evolution of peer assessment in MOOCs during the period 2014 to 2020, the methods used in MOOCs to assess peers, and the challenges of and future directions in MOOC peer assessment. We provide summary statistics and a review of methods across the corpus and highlight three directions for improving the use of peer assessment in MOOCs: the need for focusing on scaling learning through peer evaluations, the need for scaling and optimizing team submissions in team peer assessments, and the need for embedding a social process for peer assessment. KW - MOOC KW - peer assessment KW - peer evaluation KW - peer review KW - literature review KW - social interaction Y1 - 2021 U6 - https://doi.org/10.1080/01587919.2021.1911626 SN - 0158-7919 SN - 1475-0198 VL - 42 IS - 2 SP - 268 EP - 289 PB - Routledge, Taylor & Francis Group CY - Abingdon ER - TY - JOUR A1 - Combi, Carlo A1 - Oliboni, Barbara A1 - Weske, Mathias A1 - Zerbato, Francesca T1 - Seamless conceptual modeling of processes with transactional and analytical data JF - Data & knowledge engineering N2 - In the field of Business Process Management (BPM), modeling business processes and related data is a critical issue since process activities need to manage data stored in databases. The connection between processes and data is usually handled at the implementation level, even if modeling both processes and data at the conceptual level should help designers in improving business process models and identifying requirements for implementation. 
Especially in data- and decision-intensive contexts, business process activities need to access data stored both in databases and data warehouses. In this paper, we complete our approach for defining a novel conceptual view that bridges process activities and data. The proposed approach allows the designer to model the connection between business processes and database models and define the operations to perform, providing interesting insights on the overall connected perspective and hints for identifying activities that are crucial for decision support. KW - Conceptual modeling KW - Business process modeling KW - BPMN KW - Data modeling KW - Data warehouse KW - Decision support Y1 - 2021 U6 - https://doi.org/10.1016/j.datak.2021.101895 SN - 0169-023X SN - 1872-6933 VL - 134 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Cseh, Ágnes A1 - Juhos, Attila T1 - Pairwise preferences in the stable marriage problem JF - ACM Transactions on Economics and Computation / Association for Computing Machinery N2 - We study the classical, two-sided stable marriage problem under pairwise preferences. In the most general setting, agents are allowed to express their preferences as comparisons of any two of their edges, and they also have the right to declare a draw or even withdraw from such a comparison. This freedom is then gradually restricted as we specify six stages of orderedness in the preferences, ending with the classical case of strictly ordered lists. We study all cases occurring when combining the three known notions of stability (weak, strong, and super-stability) under the assumption that each side of the bipartite market obtains one of the six degrees of orderedness. By designing three polynomial algorithms and two NP-completeness proofs, we determine the complexity of all cases not yet known and thus give an exact boundary in terms of preference structure between tractable and intractable cases. 
KW - Stable marriage KW - intransitivity KW - acyclic preferences KW - poset KW - weakly stable matching KW - strongly stable matching KW - super stable matching Y1 - 2021 U6 - https://doi.org/10.1145/3434427 SN - 2167-8375 SN - 2167-8383 VL - 9 IS - 1 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Cseh, Ágnes A1 - Kavitha, Telikepalli T1 - Popular matchings in complete graphs JF - Algorithmica : an international journal in computer science N2 - Our input is a complete graph G on n vertices where each vertex has a strict ranking of all other vertices in G. The goal is to construct a matching in G that is popular. A matching M is popular if M does not lose a head-to-head election against any matching M': here each vertex casts a vote for the matching in {M, M'} in which it gets a better assignment. Popular matchings need not exist in the given instance G and the popular matching problem is to decide whether one exists or not. The popular matching problem in G is easy to solve for odd n. Surprisingly, the problem becomes NP-complete for even n, as we show here. This is one of the few graph theoretic problems efficiently solvable when n has one parity and NP-complete when n has the other parity. KW - Popular matching KW - Complexity KW - Stable matching Y1 - 2021 U6 - https://doi.org/10.1007/s00453-020-00791-7 SN - 0178-4617 SN - 1432-0541 VL - 83 IS - 5 SP - 1493 EP - 1523 PB - Springer CY - New York ER - TY - JOUR A1 - Benson, Lawrence A1 - Makait, Hendrik A1 - Rabl, Tilmann T1 - Viper BT - An Efficient Hybrid PMem-DRAM Key-Value Store JF - Proceedings of the VLDB Endowment N2 - Key-value stores (KVSs) have found wide application in modern software systems. For persistence, their data resides in slow secondary storage, which requires KVSs to employ various techniques to increase their read and write performance from and to the underlying medium. 
Emerging persistent memory (PMem) technologies offer data persistence at close-to-DRAM speed, making them a promising alternative to classical disk-based storage. However, simply drop-in replacing existing storage with PMem does not yield good results, as block-based access behaves differently in PMem than on disk and ignores PMem's byte addressability, layout, and unique performance characteristics. In this paper, we propose three PMem-specific access patterns and implement them in a hybrid PMem-DRAM KVS called Viper. We employ a DRAM-based hash index and a PMem-aware storage layout to utilize the random-write speed of DRAM and the efficient sequential-write performance of PMem. Our evaluation shows that Viper significantly outperforms existing KVSs for core KVS operations while providing full data persistence. Moreover, Viper outperforms existing PMem-only, hybrid, and disk-based KVSs by 4-18x for write workloads, while matching or surpassing their get performance. KW - memory Y1 - 2021 U6 - https://doi.org/10.14778/3461535.3461543 SN - 2150-8097 VL - 14 IS - 9 SP - 1544 EP - 1556 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Bonnet, Philippe A1 - Dong, Xin Luna A1 - Naumann, Felix A1 - Tözün, Pınar T1 - VLDB 2021 BT - Designing a hybrid conference JF - SIGMOD record N2 - The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned. Y1 - 2021 U6 - https://doi.org/10.1145/3516431.3516447 SN - 0163-5808 SN - 1943-5835 VL - 50 IS - 4 SP - 50 EP - 53 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Rüther, Ferenc Darius A1 - Sebode, Marcial A1 - Lohse, Ansgar W. 
A1 - Wernicke, Sarah A1 - Böttinger, Erwin A1 - Casar, Christian A1 - Braun, Felix A1 - Schramm, Christoph T1 - Mobile app requirements for patients with rare liver diseases BT - a single center survey for the ERN RARE-LIVER JF - Clinics and research in hepatology and gastroenterology N2 - Background: More patient data are needed to improve research on rare liver diseases. Mobile health apps enable an exhaustive data collection. Therefore, the European Reference Network on Hepatological diseases (ERN RARE-LIVER) intends to implement an app for patients with rare liver diseases communicating with a patient registry, but little is known about which features patients and their healthcare providers regard as being useful. Aims: This study aimed to investigate how an app for rare liver diseases would be accepted, and to find out which features are considered useful. Methods: An anonymous survey was conducted on adult patients with rare liver diseases at a single academic, tertiary care outpatient-service. Additionally, medical experts of the ERN working group on autoimmune hepatitis were invited to participate in an online survey. Results: In total, the responses from 100 patients with autoimmune (n = 90) or other rare (n = 10) liver diseases and 32 experts were analyzed. Patients were convinced to use a disease specific app (80%) and expected some benefit to their health (78%) but responses differed significantly between younger and older patients (93% vs. 62%, p < 0.001; 88% vs. 64%, p < 0.01). Comparing patients' and experts' feedback, patients more often expected a simplified healthcare pathway (e.g. 89% vs. 59% (p < 0.001) wanted access to one's own medical records), while healthcare providers saw the benefit mainly in improving compliance and treatment outcome (e.g. 93% vs. 31% (p < 0.001) and 70% vs. 21% (p < 0.001) expected the app to reduce mistakes in taking medication and improve quality of life, respectively). 
KW - Primary sclerosing cholangitis KW - Primary biliary cholangitis KW - Autoimmune hepatitis KW - European reference networks KW - Mobile applications KW - Patient reported outcome measures Y1 - 2021 U6 - https://doi.org/10.1016/j.clinre.2021.101760 SN - 2210-7401 SN - 2210-741X VL - 45 IS - 6 PB - Elsevier Masson CY - Amsterdam ER - TY - JOUR A1 - Freitas da Cruz, Harry A1 - Pfahringer, Boris A1 - Martensen, Tom A1 - Schneider, Frederic A1 - Meyer, Alexander A1 - Böttinger, Erwin A1 - Schapranow, Matthieu-Patrick T1 - Using interpretability approaches to update "black-box" clinical prediction models BT - an external validation study in nephrology JF - Artificial intelligence in medicine : AIM N2 - Despite advances in machine learning-based clinical prediction models, only a few such models are actually deployed in clinical contexts. Among other reasons, this is due to a lack of validation studies. In this paper, we present and discuss the validation results of a machine learning model for the prediction of acute kidney injury in cardiac surgery patients initially developed on the MIMIC-III dataset when applied to an external cohort of an American research hospital. To help account for the performance differences observed, we utilized interpretability methods based on feature importance, which allowed experts to scrutinize model behavior both at the global and local level, making it possible to gain further insights into why it did not behave as expected on the validation cohort. The knowledge gleaned upon derivation can be potentially useful to assist model update during validation for more generalizable and simpler models. We argue that interpretability methods should be considered by practitioners as a further tool to help explain performance differences and inform model update in validation studies. 
KW - Clinical predictive modeling KW - Nephrology KW - Validation KW - Interpretability methods Y1 - 2021 U6 - https://doi.org/10.1016/j.artmed.2020.101982 SN - 0933-3657 SN - 1873-2860 VL - 111 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Rose, Robert A1 - Groeger, Lars A1 - Hölzle, Katharina T1 - The emergence of shared leadership in innovation labs JF - Frontiers in Psychology N2 - Implementing innovation laboratories to leverage intrapreneurship is an increasingly popular organizational practice. A typical feature of these creative environments is semi-autonomous teams in which multiple members collectively exert leadership influence, thereby challenging traditional command-and-control conceptions of leadership. An extensive body of research on the team-centric concept of shared leadership has recognized the potential for pluralized leadership structures in enhancing team effectiveness; however, little empirical work has been conducted in organizational contexts in which creativity is key. This study set out to explore antecedents of shared leadership and its influence on team creativity in an innovation lab. Building on extant shared leadership and innovation research, we propose antecedents customary to creative teamwork, that is, experimental culture, task reflexivity, and voice. Multisource data were collected from 104 team members and 49 evaluations of 29 coaches nested in 21 teams working in a prototypical innovation lab. We identify factors specific to creative teamwork that facilitate the emergence of shared leadership by providing room for experimentation, encouraging team members to speak up in the creative process, and cultivating a reflective application of entrepreneurial thinking. We provide specific exemplary activities for innovation lab teams to increase levels of shared leadership. 
KW - innovation laboratories KW - intrapreneurship KW - team creativity KW - shared leadership KW - social network analysis Y1 - 2021 U6 - https://doi.org/10.3389/fpsyg.2021.685167 SN - 1664-1078 VL - 12 SP - 1 EP - 13 PB - Frontiers Research Foundation CY - Lausanne ER - TY - JOUR A1 - Trautmann, Justin A1 - Zhou, Lin A1 - Brahms, Clemens Markus A1 - Tunca, Can A1 - Ersoy, Cem A1 - Granacher, Urs A1 - Arnrich, Bert T1 - TRIPOD BT - A treadmill walking dataset with IMU, pressure-distribution and photoelectric data for gait analysis JF - Data : open access ʻData in scienceʼ journal N2 - Inertial measurement units (IMUs) enable easy to operate and low-cost data recording for gait analysis. When combined with treadmill walking, a large number of steps can be collected in a controlled environment without the need of a dedicated gait analysis laboratory. In order to evaluate existing and novel IMU-based gait analysis algorithms for treadmill walking, a reference dataset that includes IMU data as well as reliable ground truth measurements for multiple participants and walking speeds is needed. This article provides a reference dataset consisting of 15 healthy young adults who walked on a treadmill at three different speeds. Data were acquired using seven IMUs placed on the lower body, two different reference systems (Zebris FDMT-HQ and OptoGait), and two RGB cameras. Additionally, in order to validate an existing IMU-based gait analysis algorithm using the dataset, an adaptable modular data analysis pipeline was built. Our results show agreement between the pressure-sensitive Zebris and the photoelectric OptoGait system (r = 0.99), demonstrating the quality of our reference data. As a use case, the performance of an algorithm originally designed for overground walking was tested on treadmill data using the data pipeline. 
The accuracy of stride length and stride time estimations was comparable to that reported in other studies with overground data, indicating that the algorithm is equally applicable to treadmill data. The Python source code of the data pipeline is publicly available, and the dataset will be provided by the authors upon request, enabling future evaluations of IMU gait analysis algorithms without the need of recording new data. KW - inertial measurement unit KW - gait analysis algorithm KW - OptoGait KW - Zebris KW - data pipeline KW - public dataset Y1 - 2021 U6 - https://doi.org/10.3390/data6090095 SN - 2306-5729 VL - 6 IS - 9 PB - MDPI CY - Basel ER - TY - JOUR A1 - Söchting, Maximilian A1 - Trapp, Matthias T1 - Controlling image-stylization techniques using eye tracking JF - Science and Technology Publications N2 - With the spread of smart phones capable of taking high-resolution photos and the development of high-speed mobile data infrastructure, digital visual media is becoming one of the most important forms of modern communication. With this development, however, also comes a devaluation of images as a media form with the focus becoming the frequency at which visual content is generated instead of the quality of the content. In this work, an interactive system using image-abstraction techniques and an eye tracking sensor is presented, which allows users to experience diverting and dynamic artworks that react to their eye movement. The underlying modular architecture enables a variety of different interaction techniques that share common design principles, making the interface as intuitive as possible. The resulting experience allows users to experience a game-like interaction in which they aim for a reward, the artwork, while being held under constraints, e.g., not blinking. 
The conscious eye movements that are required by some interaction techniques hint at an interesting possible future extension of this work into the field of relaxation exercises and concentration training. KW - Eye-tracking KW - Image Abstraction KW - Image Processing KW - Artistic Image Stylization KW - Interactive Media Y1 - 2020 SN - 2184-4321 PB - Springer CY - Berlin ER - TY - JOUR A1 - Van Hout, Cristopher V. A1 - Tachmazidou, Ioanna A1 - Backman, Joshua D. A1 - Hoffman, Joshua D. A1 - Liu, Daren A1 - Pandey, Ashutosh K. A1 - Gonzaga-Jauregui, Claudia A1 - Khalid, Shareef A1 - Ye, Bin A1 - Banerjee, Nilanjana A1 - Li, Alexander H. A1 - O'Dushlaine, Colm A1 - Marcketta, Anthony A1 - Staples, Jeffrey A1 - Schurmann, Claudia A1 - Hawes, Alicia A1 - Maxwell, Evan A1 - Barnard, Leland A1 - Lopez, Alexander A1 - Penn, John A1 - Habegger, Lukas A1 - Blumenfeld, Andrew L. A1 - Bai, Xiaodong A1 - O'Keeffe, Sean A1 - Yadav, Ashish A1 - Praveen, Kavita A1 - Jones, Marcus A1 - Salerno, William J. A1 - Chung, Wendy K. A1 - Surakka, Ida A1 - Willer, Cristen J. A1 - Hveem, Kristian A1 - Leader, Joseph B. A1 - Carey, David J. A1 - Ledbetter, David H. A1 - Cardon, Lon A1 - Yancopoulos, George D. A1 - Economides, Aris A1 - Coppola, Giovanni A1 - Shuldiner, Alan R. A1 - Balasubramanian, Suganthi A1 - Cantor, Michael A1 - Nelson, Matthew R. A1 - Whittaker, John A1 - Reid, Jeffrey G. A1 - Marchini, Jonathan A1 - Overton, John D. A1 - Scott, Robert A. A1 - Abecasis, Goncalo R. A1 - Yerges-Armstrong, Laura M. A1 - Baras, Aris T1 - Exome sequencing and characterization of 49,960 individuals in the UK Biobank JF - Nature : the international weekly journal of science N2 - The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. 
Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.
KW - clinical exome KW - breast-cancer KW - mutations KW - recommendations KW - gene KW - metaanalysis KW - variants KW - BRCA1 KW - risk KW - susceptibility Y1 - 2020 U6 - https://doi.org/10.1038/s41586-020-2853-0 SN - 0028-0836 SN - 1476-4687 VL - 586 IS - 7831 SP - 749 EP - 756 PB - Macmillan Publishers Limited CY - London ER - TY - JOUR A1 - Ashwood, Christopher A1 - Bittremieux, Wout A1 - Deutsch, Eric W. A1 - Doncheva, Nadezhda T. A1 - Dorfer, Viktoria A1 - Gabriels, Ralf A1 - Gorshkov, Vladimir A1 - Gupta, Surya A1 - Jones, Andrew R. A1 - Käll, Lukas A1 - Kopczynski, Dominik A1 - Lane, Lydie A1 - Lautenbacher, Ludwig A1 - Legeay, Marc A1 - Locard-Paulet, Marie A1 - Mesuere, Bart A1 - Sachsenberg, Timo A1 - Salz, Renee A1 - Samaras, Patroklos A1 - Schiebenhoefer, Henning A1 - Schmidt, Tobias A1 - Schwämmle, Veit A1 - Soggiu, Alessio A1 - Uszkoreit, Julian A1 - Van Den Bossche, Tim A1 - Van Puyvelde, Bart A1 - Van Strien, Joeri A1 - Verschaffelt, Pieter A1 - Webel, Henry A1 - Willems, Sander A1 - Perez-Riverol, Yasset A1 - Netz, Eugen A1 - Pfeuffer, Julianus T1 - Proceedings of the EuBIC-MS 2020 Developers’ Meeting JF - EuPA Open Proteomics N2 - The 2020 European Bioinformatics Community for Mass Spectrometry (EuBIC-MS) Developers’ meeting was held from January 13th to January 17th 2020 in Nyborg, Denmark. Among the participants were scientists as well as developers working in the field of computational mass spectrometry (MS) and proteomics. The 4-day program was split between introductory keynote lectures and parallel hackathon sessions. During the latter, the participants developed bioinformatics tools and resources addressing outstanding needs in the community. 
The hackathons allowed less experienced participants to learn from more advanced computational MS experts, and to actively contribute to highly relevant research projects. We successfully produced several new tools that will be useful to the proteomics community by improving data analysis as well as facilitating future research. All keynote recordings are available on https://doi.org/10.5281/zenodo.3890181. KW - computational mass spectrometry KW - proteomics KW - bioinformatics KW - spectrum clustering KW - phosphoproteomics KW - XIC extraction KW - proteomics graph networks KW - predicted spectra Y1 - 2020 U6 - https://doi.org/10.1016/j.euprot.2020.11.001 SN - 2212-9685 VL - 24 SP - 1 EP - 6 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Scheibel, Willy A1 - Trapp, Matthias A1 - Limberger, Daniel A1 - Döllner, Jürgen Roland Friedrich T1 - A taxonomy of treemap visualization techniques JF - Science and Technology Publications N2 - A treemap is a visualization that has been specifically designed to facilitate the exploration of tree-structured data and, more generally, hierarchically structured data. The family of visualization techniques that use a visual metaphor for parent-child relationships based “on the property of containment” (Johnson, 1993) is commonly referred to as treemaps. However, as the number of variations of treemaps grows, it becomes increasingly important to distinguish clearly between techniques and their specific characteristics. This paper proposes to discern between Space-filling Treemap TS, Containment Treemap TC, Implicit Edge Representation Tree TIE, and Mapped Tree TMT for classification of hierarchy visualization techniques and highlights their respective properties. This taxonomy is created as a hyponymy, i.e., its classes have an is-a relationship to one another: TS ⊂ TC ⊂ TIE ⊂ TMT. 
With this proposal, we intend to stimulate a discussion on a more unambiguous classification of treemaps and, furthermore, broaden what is understood by the concept of treemap itself. KW - Treemaps KW - Taxonomy Y1 - 2020 PB - Springer CY - Berlin ER - TY - JOUR A1 - Risch, Julian A1 - Krestel, Ralf ED - Agarwal, Basant ED - Nayak, Richi ED - Mittal, Namita ED - Patnaik, Srikanta T1 - Toxic comment detection in online discussions JF - Deep learning-based approaches for sentiment analysis N2 - Comment sections of online news platforms are an essential space to express opinions and discuss political topics. In contrast to other online posts, news discussions are related to particular news articles, comments refer to each other, and individual conversations emerge. However, the misuse by spammers, haters, and trolls makes costly content moderation necessary. Sentiment analysis can not only support moderation but also help to understand the dynamics of online discussions. A subtask of content moderation is the identification of toxic comments. To this end, we describe the concept of toxicity and characterize its subclasses. Further, we present various deep learning approaches, including datasets and architectures, tailored to sentiment analysis in online discussions. One way to make these approaches more comprehensible and trustworthy is fine-grained instead of binary comment classification. On the downside, more classes require more training data. Therefore, we propose to augment training data by using transfer learning. We discuss real-world applications, such as semi-automated comment moderation and troll detection. Finally, we outline future challenges and current limitations in light of most recent research publications. 
KW - deep learning KW - natural language processing KW - user-generated content KW - toxic comment classification KW - hate speech detection Y1 - 2020 SN - 978-981-15-1216-2 SN - 978-981-15-1215-5 U6 - https://doi.org/10.1007/978-981-15-1216-2_4 SN - 2524-7565 SN - 2524-7573 SP - 85 EP - 109 PB - Springer CY - Singapore ER - TY - JOUR A1 - Jiang, Lan A1 - Naumann, Felix T1 - Holistic primary key and foreign key detection JF - Journal of intelligent information systems : JIIS N2 - Primary keys (PKs) and foreign keys (FKs) are important elements of relational schemata in various applications, such as query optimization and data integration. However, in many cases, these constraints are unknown or not documented. Detecting them manually is time-consuming and even infeasible in large-scale datasets. We study the problem of discovering primary keys and foreign keys automatically and propose an algorithm to detect both, namely Holistic Primary Key and Foreign Key Detection (HoPF). PKs and FKs are subsets of the sets of unique column combinations (UCCs) and inclusion dependencies (INDs), respectively, for which efficient discovery algorithms are known. Using score functions, our approach is able to effectively extract the true PKs and FKs from the vast sets of valid UCCs and INDs. Several pruning rules are employed to speed up the procedure. We evaluate precision and recall on three benchmarks and two real-world datasets. The results show that our method is able to retrieve on average 88% of all primary keys, and 91% of all foreign keys. We compare the performance of HoPF with two baseline approaches that both assume the existence of primary keys. 
KW - Data profiling application KW - Primary key KW - Foreign key KW - Database management Y1 - 2019 U6 - https://doi.org/10.1007/s10844-019-00562-z SN - 0925-9902 SN - 1573-7675 VL - 54 IS - 3 SP - 439 EP - 461 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Galka, Andreas A1 - Moontaha, Sidratul A1 - Siniatchkin, Michael T1 - Constrained expectation maximisation algorithm for estimating ARMA models in state space representation JF - EURASIP journal on advances in signal processing N2 - This paper discusses the fitting of linear state space models to given multivariate time series in the presence of constraints imposed on the four main parameter matrices of these models. Constraints arise partly from the assumption that the models have a block-diagonal structure, with each block corresponding to an ARMA process, that allows the reconstruction of independent source components from linear mixtures, and partly from the need to keep models identifiable. The first stage of parameter fitting is performed by the expectation maximisation (EM) algorithm. Due to the identifiability constraint, a subset of the diagonal elements of the dynamical noise covariance matrix needs to be constrained to fixed values (usually unity). For this kind of constraint, no closed-form update rules have been available so far. We present new update rules for this situation, both for updating the dynamical noise covariance matrix directly and for updating a matrix square-root of this matrix. The practical applicability of the proposed algorithm is demonstrated by a low-dimensional simulation example. The behaviour of the EM algorithm, as observed in this example, illustrates the well-known fact that in practical applications, the EM algorithm should be combined with a different algorithm for numerical optimisation, such as a quasi-Newton algorithm.
KW - Kalman filtering KW - state space modelling KW - expectation maximisation algorithm Y1 - 2020 U6 - https://doi.org/10.1186/s13634-020-00678-3 SN - 1687-6180 VL - 2020 IS - 1 PB - Springer CY - Heidelberg ER - TY - JOUR A1 - van der Aa, Han A1 - Leopold, Henrik A1 - Weidlich, Matthias T1 - Partial order resolution of event logs for process conformance checking JF - Decision support systems : DSS N2 - While supporting the execution of business processes, information systems record event logs. Conformance checking relies on these logs to analyze whether the recorded behavior of a process conforms to the behavior of a normative specification. A key assumption of existing conformance checking techniques, however, is that all events are associated with timestamps that allow a total order of events to be inferred per process instance. Unfortunately, this assumption is often violated in practice. Due to synchronization issues, manual event recordings, or data corruption, events are only partially ordered. In this paper, we put forward the problem of partial order resolution of event logs to close this gap. It refers to the construction of a probability distribution over all possible total orders of events of an instance. To cope with the order uncertainty in real-world data, we present several estimators for this task, incorporating different notions of behavioral abstraction. Moreover, to reduce the runtime of conformance checking based on partial order resolution, we introduce an approximation method that comes with a bounded error in terms of accuracy. Our experiments with real-world and synthetic data reveal that our approach improves accuracy considerably over the state of the art. KW - process mining KW - conformance checking KW - partial order resolution KW - data uncertainty Y1 - 2020 U6 - https://doi.org/10.1016/j.dss.2020.113347 SN - 0167-9236 SN - 1873-5797 VL - 136 PB - Elsevier CY - Amsterdam [u.a.]
ER - TY - JOUR A1 - Casel, Katrin A1 - Dreier, Jan A1 - Fernau, Henning A1 - Gobbert, Moritz A1 - Kuinke, Philipp A1 - Villaamil, Fernando Sánchez A1 - Schmid, Markus L. A1 - van Leeuwen, Erik Jan T1 - Complexity of independency and cliquy trees JF - Discrete applied mathematics N2 - An independency (cliquy) tree of an n-vertex graph G is a spanning tree of G in which the set of leaves induces an independent set (clique). We study the problems of minimizing or maximizing the number of leaves of such trees, and fully characterize their parameterized complexity. We show that all four variants of deciding if an independency/cliquy tree with at least/most l leaves exists parameterized by l are either Para-NP- or W[1]-hard. We prove that minimizing the number of leaves of a cliquy tree parameterized by the number of internal vertices is Para-NP-hard too. However, we show that minimizing the number of leaves of an independency tree parameterized by the number k of internal vertices has an O*(4^k)-time algorithm and a 2k vertex kernel. Moreover, we prove that maximizing the number of leaves of an independency/cliquy tree parameterized by the number k of internal vertices both have an O*(18^k)-time algorithm and an O(k 2^k) vertex kernel, but no polynomial kernel unless the polynomial hierarchy collapses to the third level. Finally, we present an O(3^n · f(n))-time algorithm to find a spanning tree where the leaf set has a property that can be decided in f(n) time and has minimum or maximum size. KW - independency tree KW - cliquy tree KW - parameterized complexity KW - Kernelization KW - algorithms KW - exact algorithms Y1 - 2018 U6 - https://doi.org/10.1016/j.dam.2018.08.011 SN - 0166-218X SN - 1872-6771 VL - 272 SP - 2 EP - 15 PB - Elsevier CY - Amsterdam [u.a.]
ER - TY - JOUR A1 - Limanowski, Jakub A1 - Lopes, Pedro A1 - Keck, Janis A1 - Baudisch, Patrick A1 - Friston, Karl A1 - Blankenburg, Felix T1 - Action-dependent processing of touch in the human parietal operculum and posterior insula JF - Cerebral Cortex N2 - Somatosensory input generated by one's actions (i.e., self-initiated body movements) is generally attenuated. Conversely, externally caused somatosensory input is enhanced, for example, during active touch and the haptic exploration of objects. Here, we used functional magnetic resonance imaging (fMRI) to ask how the brain accomplishes this delicate weighting of self-generated versus externally caused somatosensory components. Finger movements were either self-generated by our participants or induced by functional electrical stimulation (FES) of the same muscles. During half of the trials, electrotactile impulses were administered when the (actively or passively) moving finger reached a predefined flexion threshold. fMRI revealed an interaction effect in the contralateral posterior insular cortex (pIC), which responded more strongly to touch during self-generated than during FES-induced movements. A network analysis via dynamic causal modeling revealed that connectivity from the secondary somatosensory cortex via the pIC to the supplementary motor area was generally attenuated during self-generated relative to FES-induced movements, yet specifically enhanced by touch received during self-generated, but not FES-induced movements. Together, these results suggest a crucial role of the parietal operculum and the posterior insula in differentiating self-generated from externally caused somatosensory information received from one's moving limb.
KW - active touch KW - dynamic causal modeling KW - insula KW - parietal operculum KW - somatosensation Y1 - 2019 U6 - https://doi.org/10.1093/cercor/bhz111 SN - 1047-3211 SN - 1460-2199 VL - 30 IS - 2 SP - 607 EP - 617 PB - Oxford University Press CY - Oxford ER - TY - JOUR A1 - Trilla, Irene A1 - Drimalla, Hanna A1 - Bajbouj, Malek A1 - Dziobek, Isabel T1 - The influence of reward on facial mimicry BT - no evidence for a significant effect of oxytocin JF - Frontiers in behavioral neuroscience N2 - Recent findings suggest a role of oxytocin on the tendency to spontaneously mimic the emotional facial expressions of others. Oxytocin-related increases of facial mimicry, however, seem to be dependent on contextual factors. Given previous literature showing that people preferentially mimic emotional expressions of individuals associated with high (vs. low) rewards, we examined whether the reward value of the mimicked agent is one factor influencing the oxytocin effects on facial mimicry. To test this hypothesis, 60 male adults received 24 IU of either intranasal oxytocin or placebo in a double-blind, between-subject experiment. Next, the value of male neutral faces was manipulated using an associative learning task with monetary rewards. After the reward associations were learned, participants watched videos of the same faces displaying happy and angry expressions. Facial reactions to the emotional expressions were measured with electromyography. We found that participants judged as more pleasant the face identities associated with high reward values than with low reward values. However, happy expressions by low rewarding faces were more spontaneously mimicked than high rewarding faces. Contrary to our expectations, we did not find a significant direct effect of intranasal oxytocin on facial mimicry, nor on the reward-driven modulation of mimicry. 
Our results support the notion that mimicry is a complex process that depends on contextual factors, but failed to provide conclusive evidence of a role of oxytocin on the modulation of facial mimicry. KW - oxytocin KW - facial mimicry KW - reward KW - EMG KW - social modulation KW - null results Y1 - 2020 U6 - https://doi.org/10.3389/fnbeh.2020.00088 SN - 1662-5153 VL - 14 PB - Frontiers Media CY - Lausanne ER - TY - JOUR A1 - Doerr, Benjamin A1 - Kötzing, Timo T1 - Multiplicative Up-Drift JF - Algorithmica N2 - Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, however, only for the situation that the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a (1+delta)-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on 1/delta (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in delta (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large delta and show stronger results for this setting.
KW - drift theory KW - evolutionary computation KW - stochastic process Y1 - 2020 U6 - https://doi.org/10.1007/s00453-020-00775-7 SN - 0178-4617 SN - 1432-0541 VL - 83 IS - 10 SP - 3017 EP - 3058 PB - Springer CY - New York ER - TY - JOUR A1 - Kruse, Sebastian A1 - Kaoudi, Zoi A1 - Contreras-Rojas, Bertty A1 - Chawla, Sanjay A1 - Naumann, Felix A1 - Quiane-Ruiz, Jorge-Arnulfo T1 - RHEEMix in the data jungle BT - a cost-based optimizer for cross-platform systems JF - The VLDB Journal N2 - Data analytics are moving beyond the limits of a single platform. In this paper, we present the cost-based optimizer of Rheem, an open-source cross-platform system that copes with these new requirements. The optimizer allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to determine efficient data movement plans among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. We show that our optimizer can perform tasks more than one order of magnitude faster when using multiple platforms than when using a single platform. KW - Cross-platform KW - Polystore KW - Query optimization KW - Data processing Y1 - 2020 U6 - https://doi.org/10.1007/s00778-020-00612-x SN - 1066-8888 SN - 0949-877X VL - 29 IS - 6 SP - 1287 EP - 1310 PB - Springer CY - Berlin ER - TY - THES A1 - Chujfi-La-Roche, Salim T1 - Human Cognition and natural Language Processing in the Digitally Mediated Environment N2 - Organizations continue to assemble and rely upon teams of remote workers as an essential element of their business strategy; however, knowledge processing is particularly difficult in such isolated, largely digitally mediated settings.
The great challenge for a knowledge-based organization lies not in how individuals should interact using technology but in how to achieve effective cooperation and knowledge exchange. Currently more attention has been paid to technology and the difficulties machines have processing natural language and less to studies of the human aspect—the influence of our own individual cognitive abilities and preferences on the processing of information when interacting online. This thesis draws on four scientific domains involved in the process of interpreting and processing massive, unstructured data—knowledge management, linguistics, cognitive science, and artificial intelligence—to build a model that offers a reliable way to address the ambiguous nature of language and improve workers’ digitally mediated interactions. Human communication can be discouragingly imprecise and is characterized by a strong linguistic ambiguity; this represents an enormous challenge for the computer analysis of natural language. In this thesis, I propose and develop a new data interpretation layer for the processing of natural language based on the human cognitive preferences of the conversants themselves. Such a semantic analysis merges information derived both from the content and from the associated social and individual contexts, as well as the social dynamics that emerge online. At the same time, assessment taxonomies are used to analyze online comportment at the individual and community level in order to successfully identify characteristics leading to greater effectiveness of communication. Measurement patterns for identifying effective methods of individual interaction with regard to individual cognitive and learning preferences are also evaluated; a novel Cyber-Cognitive Identity (CCI)—a perceptual profile of an individual’s cognitive and learning styles—is proposed. 
Accommodation of such cognitive preferences can greatly facilitate knowledge management in the geographically dispersed and collaborative digital environment. Use of the CCI is proposed for cognitively labeled Latent Dirichlet Allocation (CLLDA), a novel method for automatically labeling and clustering knowledge that does not rely solely on probabilistic methods, but rather on a fusion of machine learning algorithms and the cognitive identities of the associated individuals interacting in a digitally mediated environment. Advantages include: a greater perspicuity of dynamic and meaningful cognitive rules leading to greater tagging accuracy and a higher content portability at the sentence, document, and corpus level with respect to digital communication. N2 - Organizations are increasingly establishing telework as a central element of their business strategy. However, knowledge processing leads to difficulties in such digitally mediated contexts, which are largely not structured interactively. The main challenge for knowledge-based organizations lies not in the question of how individuals should collaborate using technology, but in how effective collaboration and effective knowledge exchange can be achieved. Current studies focus far more on the technologies themselves than on the human prerequisites of cognitive abilities and preferences in online collaboration. Likewise, the fact that natural language processing (NLP) is subject to the general side effects of language, such as misunderstandings and ambiguities, has not yet been sufficiently taken into account. This thesis builds on four scientific fields that are essential to the processing and interpretation of large, partly unstructured volumes of data: knowledge management, cognitive science, linguistics, and artificial intelligence.
On this broad foundation, a model is offered that reliably emphasizes the non-deterministic character of language and, against this background, identifies potential for improving digitally supported collaboration. Human communication can be discouragingly imprecise and is marked by linguistic ambiguity. This poses a major challenge for the computational analysis of natural language. In this thesis, I develop a proposal for a new approach to data interpretation that takes the cognitive preferences of the conversation partners into account. This semantic analysis brings together information covering the content to be conveyed, the associated social and individual contexts, and the group dynamics of the online environment. At the same time, assessment taxonomies are used to analyze online behavior at both the individual and the group level in order to identify characteristics that make communication more effective. Patterns for identifying and measuring effective methods of interaction are evaluated with regard to individual cognitive and learning preferences. To this end, the concept of a Cyber-Cognitive Identity (CCI) is proposed, which describes the differing perceptual profiles of the cognitive and learning styles of different individuals. Taking such cognitive preferences into account can considerably facilitate knowledge management in geographically distributed, collaborative digital environments and thereby improve knowledge exchange. Cognitive Labeled Latent Dirichlet Allocation (CLLDA) is used as a generative probabilistic model for the automatic labeling and clustering of CCI-derived profiles. Methodologically, the cognitive types take precedence over the probabilistic aspects.
By introducing and further developing the CCI concept, the thesis extends the current state of research with a well-founded procedural model that provides a basis for potential follow-up research and practical applications. KW - cognitive science KW - natural language processing KW - knowledge management KW - thinking styles KW - artificial intelligence KW - Kognitionswissenschaft KW - Verarbeitung natürlicher Sprache KW - Wissensmanagement KW - Denkstile KW - künstliche Intelligenz Y1 - 2020 ER - TY - JOUR A1 - Rezaei, Mina A1 - Yang, Haojin A1 - Meinel, Christoph T1 - Recurrent generative adversarial network for learning imbalanced medical image semantic segmentation JF - Multimedia tools and applications : an international journal N2 - We propose a new recurrent generative adversarial architecture named RNN-GAN to mitigate the imbalanced data problem in medical image semantic segmentation, where the number of pixels belonging to the desired object is significantly lower than the number belonging to the background. A model trained with imbalanced data tends to be biased towards healthy data, which is not desired in clinical applications, and the outputs predicted by these networks have high precision and low recall. To mitigate the impact of imbalanced training data, we train RNN-GAN with the proposed complementary segmentation mask in addition to ordinary segmentation masks. The RNN-GAN consists of two components: a generator and a discriminator. The generator is trained on the sequence of medical images to learn the corresponding segmentation label map plus the proposed complementary label, both at the pixel level, while the discriminator is trained to distinguish whether a segmentation image comes from the ground truth or from the generator network. Both the generator and the discriminator use bidirectional LSTM units to enhance temporal consistency and obtain inter- and intra-slice representations of the features.
We show evidence that the proposed framework is applicable to different types of medical images of varied sizes. In our experiments on ACDC-2017, HVSMR-2016, and LiTS-2017 benchmarks we find consistently improved results, demonstrating the efficacy of our approach. KW - Imbalanced medical image semantic segmentation KW - Recurrent generative adversarial network Y1 - 2019 U6 - https://doi.org/10.1007/s11042-019-7305-1 SN - 1380-7501 SN - 1573-7721 VL - 79 IS - 21-22 SP - 15329 EP - 15348 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Bin Tareaf, Raad A1 - Berger, Philipp A1 - Hennig, Patrick A1 - Meinel, Christoph T1 - Cross-platform personality exploration system for online social networks BT - Facebook vs. Twitter JF - Web intelligence N2 - Social networking sites (SNS) are a rich source of latent information about individual characteristics. Crawling and analyzing this content provides a new approach for enterprises to personalize services and put forward product recommendations. In the past few years, commercial brands have gradually appeared on social media platforms for advertisement, customer support and public relations purposes, and by now this presence has become a necessity across all sectors. This online identity can be represented as a brand personality that reflects how a brand is perceived by its customers. We exploited recent research in text analysis and personality detection to build an automatic brand personality prediction model on top of Five-Factor Model and Linguistic Inquiry and Word Count features extracted from publicly available benchmarks. Predictive evaluation on brands' accounts reveals that the Facebook platform provides a slight advantage over the Twitter platform in offering more self-disclosure for users to express their emotions, especially their demographic and psychological traits.
Results also confirm the wider perspective that the same social media account carries quite similar and comparable personality scores across different social media platforms. For evaluating our prediction results on actual brands' accounts, we crawled the Facebook API and Twitter API respectively for 100k posts from the most valuable brands' pages in the USA and we visualize exemplars of comparison results and present suggestions for future directions. KW - Big Five model KW - personality prediction KW - brand personality KW - machine learning KW - social media analysis Y1 - 2020 U6 - https://doi.org/10.3233/WEB-200427 SN - 2405-6456 SN - 2405-6464 VL - 18 IS - 1 SP - 35 EP - 51 PB - IOS Press CY - Amsterdam ER - TY - JOUR A1 - Richly, Keven A1 - Brauer, Janos A1 - Schlosser, Rainer T1 - Predicting location probabilities of drivers to improved dispatch decisions of transportation network companies based on trajectory data JF - Proceedings of the 9th International Conference on Operations Research and Enterprise Systems - ICORES N2 - The demand for peer-to-peer ridesharing services has increased rapidly over the last years. Dispatching orders cost-efficiently and communicating accurate pick-up times is challenging because the current location of each available driver is not exactly known: observed locations can be outdated by several seconds. The developed trajectory visualization tool enables transportation network companies to analyze dispatch processes and determine the causes of unexpected delays. As dispatching algorithms are based on the accuracy of arrival time predictions, we account for factors like noise, sample rate, technical and economic limitations as well as the duration of the entire process as they have an impact on the accuracy of spatio-temporal data. To improve dispatching strategies, we propose a prediction approach that provides a probability distribution for a driver’s future locations based on patterns observed in past trajectories.
We demonstrate the capabilities of our prediction results to (i) avoid critical delays, (ii) estimate waiting times with higher confidence, and (iii) enable risk considerations in dispatching strategies. KW - trajectory data KW - location prediction algorithm KW - Peer-to-Peer ridesharing KW - transport network companies KW - risk-aware dispatching Y1 - 2020 PB - Springer CY - Berlin ER - TY - JOUR A1 - Lewkowicz, Daniel A1 - Wohlbrandt, Attila A1 - Böttinger, Erwin T1 - Economic impact of clinical decision support interventions based on electronic health records JF - BMC Health Services Research N2 - Background Unnecessary healthcare utilization, non-adherence to current clinical guidelines, or insufficient personalized care are perpetual challenges and remain potential major cost-drivers for healthcare systems around the world. Implementing decision support systems in clinical care promises to improve quality of care and thereby yield substantial effects on reducing healthcare expenditure. In this article, we evaluate the economic impact of clinical decision support (CDS) interventions based on electronic health records (EHR). Methods We searched for studies published after 2014 using MEDLINE, CENTRAL, WEB OF SCIENCE, EBSCO, and TUFTS CEA registry databases that encompass an economic evaluation or consider cost outcome measures of EHR based CDS interventions. Thereupon, we identified best practice application areas and categorized the investigated interventions according to an existing taxonomy of front-end CDS tools. Results and discussion Twenty-seven studies are investigated in this review. Of those, twenty-two studies indicate a reduction of healthcare expenditure after implementing an EHR based CDS system, especially towards prevalent application areas, such as unnecessary laboratory testing, duplicate order entry, efficient transfusion practice, or reduction of antibiotic prescriptions.
Conversely, order facilitators and undiscovered malfunctions proved to be threats and could lead to new cost drivers in healthcare. While high upfront and maintenance costs of CDS systems are a worldwide implementation barrier, most studies do not consider implementation cost. Finally, four included economic evaluation studies report mixed monetary outcome results and thus highlight the importance of further high-quality economic evaluations for these CDS systems. Conclusion Current research studies lack consideration of comparative cost-outcome metrics as well as detailed cost components in their analyses. Nonetheless, the positive economic impact of EHR based CDS interventions is highly promising, especially with regard to reducing waste in healthcare. KW - Economic evaluation KW - Electronic health record KW - Clinical decision support KW - Behavioral economics Y1 - 2020 U6 - https://doi.org/10.1186/s12913-020-05688-3 SN - 1472-6963 VL - 20 PB - BioMed Central CY - London ER -