TY  - JOUR
A1  - Coupette, Corinna
A1  - Hartung, Dirk
A1  - Beckedorf, Janis
A1  - Böther, Maximilian
A1  - Katz, Daniel Martin
T1  - Law smells
BT  - defining and detecting problematic patterns in legal drafting
JF  - Artificial intelligence and law
N2  - Building on the computer science concept of code smells, we initiate the study of law smells, i.e., patterns in legal texts that pose threats to the comprehensibility and maintainability of the law. With five intuitive law smells as running examples (namely, duplicated phrase, long element, large reference tree, ambiguous syntax, and natural language obsession), we develop a comprehensive law smell taxonomy. This taxonomy classifies law smells by when they can be detected, which aspects of law they relate to, and how they can be discovered. We introduce text-based and graph-based methods to identify instances of law smells, confirming their utility in practice using the United States Code as a test case. Our work demonstrates how ideas from software engineering can be leveraged to assess and improve the quality of legal code, thus drawing attention to an understudied area in the intersection of law and computer science and highlighting the potential of computational legal drafting.
KW  - Refactoring
KW  - Software engineering
KW  - Law
KW  - Natural language processing
KW  - Network analysis
Y1  - 2022
U6  - https://doi.org/10.1007/s10506-022-09315-w
SN  - 0924-8463
SN  - 1572-8382
VL  - 31
SP  - 335
EP  - 368
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - JOUR
A1  - Tang, Mitchell
A1  - Nakamoto, Carter H.
A1  - Stern, Ariel Dora
A1  - Mehrotra, Ateev
T1  - Trends in remote patient monitoring use in traditional Medicare
JF  - JAMA Internal Medicine
N2  - This cross-sectional study uses traditional Medicare claims data to assess trends in general remote patient monitoring from January 2018 through September 2021.
Y1  - 2022
U6  - https://doi.org/10.1001/jamainternmed.2022.3043
SN  - 2168-6106
SN  - 2168-6114
VL  - 182
IS  - 9
SP  - 1005
EP  - 1006
PB  - American Medical Association
CY  - Chicago
ER  - 
TY  - JOUR
A1  - Cseh, Ágnes
A1  - Juhos, Attila
T1  - Pairwise preferences in the stable marriage problem
JF  - ACM Transactions on Economics and Computation / Association for Computing Machinery
N2  - We study the classical, two-sided stable marriage problem under pairwise preferences. In the most general setting, agents are allowed to express their preferences as comparisons of any two of their edges, and they also have the right to declare a draw or even withdraw from such a comparison. This freedom is then gradually restricted as we specify six stages of orderedness in the preferences, ending with the classical case of strictly ordered lists. We study all cases occurring when combining the three known notions of stability (weak, strong, and super-stability) under the assumption that each side of the bipartite market obtains one of the six degrees of orderedness. By designing three polynomial algorithms and two NP-completeness proofs, we determine the complexity of all cases not yet known and thus give an exact boundary in terms of preference structure between tractable and intractable cases.
KW  - Stable marriage
KW  - intransitivity
KW  - acyclic preferences
KW  - poset
KW  - weakly stable matching
KW  - strongly stable matching
KW  - super stable matching
Y1  - 2021
U6  - https://doi.org/10.1145/3434427
SN  - 2167-8375
SN  - 2167-8383
VL  - 9
IS  - 1
PB  - Association for Computing Machinery
CY  - New York
ER  - 
TY  - JOUR
A1  - Cseh, Ágnes
A1  - Kavitha, Telikepalli
T1  - Popular matchings in complete graphs
JF  - Algorithmica : an international journal in computer science
N2  - Our input is a complete graph G on n vertices where each vertex has a strict ranking of all other vertices in G. The goal is to construct a matching in G that is popular. A matching M is popular if M does not lose a head-to-head election against any matching M': here each vertex casts a vote for the matching in {M, M'} in which it gets a better assignment. Popular matchings need not exist in the given instance G and the popular matching problem is to decide whether one exists or not. The popular matching problem in G is easy to solve for odd n. Surprisingly, the problem becomes NP-complete for even n, as we show here. This is one of the few graph theoretic problems efficiently solvable when n has one parity and NP-complete when n has the other parity.
KW  - Popular matching
KW  - Complexity
KW  - Stable matching
Y1  - 2021
U6  - https://doi.org/10.1007/s00453-020-00791-7
SN  - 0178-4617
SN  - 1432-0541
VL  - 83
IS  - 5
SP  - 1493
EP  - 1523
PB  - Springer
CY  - New York
ER  - 
TY  - JOUR
A1  - Genske, Ulrich
A1  - Jahnke, Paul
T1  - Human observer net
BT  - a platform tool for human observer studies of image data
JF  - Radiology
N2  - Background: 
Current software applications for human observer studies of images lack flexibility in study design, platform independence, multicenter use, and assessment methods and are not open source, limiting accessibility and expandability.

Purpose: 
To develop a user-friendly software platform that enables efficient human observer studies in medical imaging with flexibility of study design. 

Materials and Methods: 
Software for human observer imaging studies was designed as an open-source web application to facilitate access, platform-independent usability, and multicenter studies. Different interfaces for study creation, participation, and management of results were implemented. The software was evaluated in human observer experiments between May 2019 and March 2021, in which duration of observer responses was tracked. Fourteen radiologists evaluated and graded software usability using the 100-point system usability scale. The application was tested in Chrome, Firefox, Safari, and Edge browsers. 

Results: 
Software function was designed to allow visual grading analysis (VGA), multiple-alternative forced-choice (m-AFC), receiver operating characteristic (ROC), localization ROC, free-response ROC, and customized designs. The mean duration of reader responses per image or per image set was 6.2 seconds ± 4.8 (standard deviation), 5.8 seconds ± 4.7, 8.7 seconds ± 5.7, and 6.0 seconds ± 4.5 in four-AFC with 160 image quartets per reader, four-AFC with 640 image quartets per reader, localization ROC, and experimental studies, respectively. The mean system usability scale score was 83 ± 11 (out of 100). The documented code and a demonstration of the application are available online (https://github.com/genskeu/HON, https://hondemo.pythonanywhere.com/). 

Conclusion: 
A user-friendly and efficient open-source application was developed for human reader experiments that enables study design versatility, as well as platform-independent and multicenter usability.
Y1  - 2022
U6  - https://doi.org/10.1148/radiol.211832
SN  - 0033-8419
VL  - 303
IS  - 3
SP  - 524
EP  - 530
PB  - Radiological Society of North America (RSNA)
CY  - Oak Brook
ER  - 
TY  - JOUR
A1  - Puri, Manish
A1  - Varde, Aparna S.
A1  - Melo, Gerard de
T1  - Commonsense based text mining on urban policy
JF  - Language resources and evaluation
N2  - Local laws on urban policy, i.e., ordinances, directly affect our daily life in various ways (health, business, etc.), yet in practice, for many citizens they remain impervious and complex. This article focuses on an approach to make urban policy more accessible and comprehensible to the general public and to government officials, while also addressing pertinent social media postings. Due to the intricacies of the natural language, ranging from complex legalese in ordinances to informal lingo in tweets, it is practical to harness human judgment here. To this end, we mine ordinances and tweets via reasoning based on commonsense knowledge so as to better account for pragmatics and semantics in the text. Ours is pioneering work in ordinance mining, and thus there is no prior labeled training data available for learning. This gap is filled by commonsense knowledge, a prudent choice in situations involving a lack of adequate training data. The ordinance mining can be beneficial to the public in fathoming policies and to officials in assessing policy effectiveness based on public reactions. This work contributes to smart governance, leveraging transparency in governing processes via public involvement. We focus significantly on ordinances contributing to smart cities, hence an important goal is to assess how well an urban region heads towards a smart city as per its policies mapping with smart city characteristics, and the corresponding public satisfaction.
KW  - Commonsense reasoning
KW  - Opinion mining
KW  - Ordinances
KW  - Smart cities
KW  - Social media
KW  - Text mining
Y1  - 2022
U6  - https://doi.org/10.1007/s10579-022-09584-6
SN  - 1574-020X
SN  - 1574-0218
VL  - 57
SP  - 733
EP  - 763
PB  - Springer
CY  - Dordrecht [u.a.]
ER  - 
TY  - JOUR
A1  - Bonnet, Philippe
A1  - Dong, Xin Luna
A1  - Naumann, Felix
A1  - Tözün, Pınar
T1  - VLDB 2021
BT  - Designing a hybrid conference
JF  - SIGMOD record
N2  - The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned.
Y1  - 2021
SN  - 0163-5808
SN  - 1943-5835
VL  - 50
IS  - 4
SP  - 50
EP  - 53
PB  - Association for Computing Machinery
CY  - New York
ER  - 
TY  - JOUR
A1  - Hagedorn, Christiane
A1  - Serth, Sebastian
A1  - Meinel, Christoph
T1  - The mysterious adventures of Detective Duke
BT  - how storified programming MOOCs support learners in achieving their learning goals
JF  - Frontiers in education
N2  - About 15 years ago, the first Massive Open Online Courses (MOOCs) appeared and revolutionized online education with more interactive and engaging course designs. Yet, keeping learners motivated and ensuring high satisfaction is one of the challenges today's course designers face. Therefore, many MOOC providers employed gamification elements that only boost extrinsic motivation briefly and are limited to platform support. In this article, we introduce and evaluate a gameful learning design we used in several iterations on computer science education courses. For each of the courses on the fundamentals of the Java programming language, we developed a self-contained, continuous story that accompanies learners through their learning journey and helps visualize key concepts. Furthermore, we share our approach to creating the surrounding story in our MOOCs and provide a guideline for educators to develop their own stories. Our data and the long-term evaluation spanning four Java courses between 2017 and 2021 indicate the openness of learners toward storified programming courses in general and highlight those elements that had the highest impact. While only a few learners did not like the story at all, most learners consumed the additional story elements we provided. However, learners' interest in influencing the story through majority voting was negligible and did not show a considerable positive impact, so we continued with a fixed story instead. We did not find evidence that learners just participated in the narrative because they worked on all materials. Instead, for 10-16% of learners, the story was their main course motivation. We also investigated differences in the presentation format and concluded that several longer audio-book style videos were most preferred by learners in comparison to animated videos or different textual formats. 
Surprisingly, the availability of a coherent story embedding examples and providing a context for the practical programming exercises also led to a slightly higher ranking in the perceived quality of the learning material (by 4%). With our research in the context of storified MOOCs, we advance gameful learning designs, foster learner engagement and satisfaction in online courses, and help educators ease knowledge transfer for their learners.
KW  - gameful learning
KW  - storytelling
KW  - programming
KW  - learner engagement
KW  - course design
KW  - MOOCs
KW  - content gamification
KW  - narrative
Y1  - 2023
U6  - https://doi.org/10.3389/feduc.2022.1016401
SN  - 2504-284X
VL  - 7
PB  - Frontiers Media
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Bläsius, Thomas
A1  - Friedrich, Tobias
A1  - Lischeid, Julius
A1  - Meeks, Kitty
A1  - Schirneck, Friedrich Martin
T1  - Efficiently enumerating hitting sets of hypergraphs arising in data profiling
JF  - Journal of computer and system sciences : JCSS
N2  - The transversal hypergraph problem asks to enumerate the minimal hitting sets of a hypergraph. If the solutions have bounded size, Eiter and Gottlob [SICOMP'95] gave an algorithm running in output-polynomial time, but whose space requirement also scales with the output. We improve this to polynomial delay and space. Central to our approach is the extension problem, deciding for a set X of vertices whether it is contained in any minimal hitting set. We show that this is one of the first natural problems to be W[3]-complete. We give an algorithm for the extension problem running in time O(m^{|X|+1} n) and prove a SETH-lower bound showing that this is close to optimal. We apply our enumeration method to the discovery problem of minimal unique column combinations from data profiling. Our empirical evaluation suggests that the algorithm outperforms its worst-case guarantees on hypergraphs stemming from real-world databases.
KW  - Data profiling
KW  - Enumeration algorithm
KW  - Minimal hitting set
KW  - Transversal hypergraph
KW  - Unique column combination
KW  - W[3]-Completeness
Y1  - 2022
U6  - https://doi.org/10.1016/j.jcss.2021.10.002
SN  - 0022-0000
SN  - 1090-2724
VL  - 124
SP  - 192
EP  - 213
PB  - Elsevier
CY  - San Diego
ER  - 
TY  - JOUR
A1  - Schlosser, Rainer
A1  - Chenavaz, Régis Y.
A1  - Dimitrov, Stanko
T1  - Circular economy
BT  - joint dynamic pricing and recycling investments
JF  - International journal of production economics
N2  - In a circular economy, the use of recycled resources in production is a key performance indicator for management. Yet, academic studies are still unable to inform managers on appropriate recycling and pricing policies. We develop an optimal control model integrating a firm's recycling rate, which can use both virgin and recycled resources in the production process. Our model accounts for recycling influence on both the supply and demand sides. The positive effect of a firm's use of recycled resources diminishes over time but may increase through investments. Using general formulations for demand and cost, we analytically examine joint dynamic pricing and recycling investment policies in order to determine their optimal interplay over time. We provide numerical experiments to assess the existence of a steady-state and to calculate sensitivity analyses with respect to various model parameters. The analysis shows how to dynamically adapt jointly optimized controls to reach sustainability in the production process. Our results pave the way to sounder sustainable practices for firms operating within a circular economy.
KW  - Dynamic pricing
KW  - Recycling investments
KW  - Optimal control
KW  - General demand function
KW  - Circular economy
Y1  - 2021
U6  - https://doi.org/10.1016/j.ijpe.2021.108117
SN  - 0925-5273
SN  - 1873-7579
VL  - 236
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Thienen, Julia von
A1  - Weinstein, Theresa Julia
A1  - Meinel, Christoph
T1  - Creative metacognition in design thinking
BT  - exploring theories, educational practices, and their implications for measurement
JF  - Frontiers in psychology
N2  - Design thinking is a well-established practical and educational approach to fostering high-level creativity and innovation, which has been refined since the 1950s with the participation of experts like Joy Paul Guilford and Abraham Maslow. Through real-world projects, trainees learn to optimize their creative outcomes by developing and practicing creative cognition and metacognition. This paper provides a holistic perspective on creativity, enabling the formulation of a comprehensive theoretical framework of creative metacognition. It focuses on the design thinking approach to creativity and explores the role of metacognition in four areas of creativity expertise: Products, Processes, People, and Places. The analysis includes task-outcome relationships (product metacognition), the monitoring of strategy effectiveness (process metacognition), an understanding of individual or group strengths and weaknesses (people metacognition), and an examination of the mutual impact between environments and creativity (place metacognition). It also reviews measures taken in design thinking education, including a distribution of cognition and metacognition, to support students in their development of creative mastery. On these grounds, we propose extended methods for measuring creative metacognition with the goal of enhancing comprehensive assessments of the phenomenon. Proposed methodological advancements include accuracy sub-scales, experimental tasks where examinees explore problem and solution spaces, combinations of naturalistic observations with capability testing, as well as physiological assessments as indirect measures of creative metacognition.
KW  - accuracy
KW  - creativity
KW  - design thinking
KW  - education
KW  - measurement
KW  - metacognition
KW  - innovation
KW  - framework
Y1  - 2023
U6  - https://doi.org/10.3389/fpsyg.2023.1157001
SN  - 1664-1078
VL  - 14
PB  - Frontiers Research Foundation
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Belaid, Mohamed Karim
A1  - Rabus, Maximilian
A1  - Krestel, Ralf
T1  - CrashNet
BT  - an encoder-decoder architecture to predict crash test outcomes
JF  - Data mining and knowledge discovery
N2  - Destructive car crash tests are an elaborate, time-consuming, and expensive necessity of the automotive development process. Today, finite element method (FEM) simulations are used to reduce costs by simulating car crashes computationally. We propose CrashNet, an encoder-decoder deep neural network architecture that reduces costs further and models specific outcomes of car crashes very accurately. We achieve this by formulating car crash events as time series prediction enriched with a set of scalar features. Traditional sequence-to-sequence models are usually composed of convolutional neural network (CNN) and CNN transpose layers. We propose to concatenate those with an MLP capable of learning how to inject the given scalars into the output time series. In addition, we replace the CNN transpose with 2D CNN transpose layers in order to force the model to process the hidden state of the set of scalars as one time series. The proposed CrashNet model can be trained efficiently and is able to process scalars and time series as input in order to infer the results of crash tests. CrashNet produces results faster and at a lower cost compared to destructive tests and FEM simulations. Moreover, it represents a novel approach in the car safety management domain.
KW  - Predictive models
KW  - Time series analysis
KW  - Supervised deep neural networks
KW  - Car safety management
Y1  - 2021
U6  - https://doi.org/10.1007/s10618-021-00761-9
SN  - 1384-5810
SN  - 1573-756X
VL  - 35
IS  - 4
SP  - 1688
EP  - 1709
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - CHAP
A1  - Corazza, Giovanni Emanuele
A1  - Thienen, Julia von
ED  - Glăveanu, Vlad Petre
T1  - Invention
T2  - The Palgrave encyclopedia of the possible
N2  - This entry addresses invention from five different perspectives: (i) definition of the term, (ii) mechanisms underlying invention processes, (iii) (pre-)history of human inventions, (iv) intellectual property protection vs open innovation, and (v) case studies of great inventors. Regarding the definition, an invention is the outcome of a creative process taking place within a technological milieu, which is recognized as successful in terms of its effectiveness as an original technology. In the process of invention, a technological possibility becomes realized. Inventions are distinct from either discovery or innovation. In human creative processes, seven mechanisms of invention can be observed, yielding characteristic outcomes: (1) basic inventions, (2) invention branches, (3) invention combinations, (4) invention toolkits, (5) invention exaptations, (6) invention values, and (7) game-changing inventions. The development of humanity has been strongly shaped by inventions ever since early stone tools and the conception of agriculture. An “explosion of creativity” has been associated with Homo sapiens, and inventions in all fields of human endeavor have followed suit, engendering an exponential growth of cumulative culture. This culture development emerges essentially through a reuse of previous inventions, their revision, amendment and rededication. In sociocultural terms, humans have increasingly regulated processes of invention and invention-reuse through concepts such as intellectual property, patents, open innovation and licensing methods. Finally, three case studies of great inventors are considered: Edison, Marconi, and Montessori, next to a discussion of human invention processes as collaborative endeavors.
KW  - invention
KW  - creativity
KW  - invention mechanism
KW  - cumulative culture
KW  - technology
KW  - innovation
KW  - patent
KW  - open innovation
Y1  - 2023
SN  - 978-3-030-90912-3
SN  - 978-3-030-90913-0
U6  - https://doi.org/10.1007/978-3-030-90913-0_14
SP  - 806
EP  - 814
PB  - Springer International Publishing
CY  - Cham
ER  - 
TY  - JOUR
A1  - Hiort, Pauline
A1  - Schlaffner, Christoph N.
A1  - Steen, Judith A.
A1  - Renard, Bernhard Y.
A1  - Steen, Hanno
T1  - multiFLEX-LF: a computational approach to quantify the modification stoichiometries in label-free proteomics data sets
JF  - Journal of proteome research
N2  - In liquid-chromatography-tandem-mass-spectrometry-based proteomics, information about the presence and stoichiometry of protein modifications is not readily available. To overcome this problem, we developed multiFLEX-LF, a computational tool that builds upon FLEXIQuant, which detects modified peptide precursors and quantifies their modification extent by monitoring the differences between observed and expected intensities of the unmodified precursors. multiFLEX-LF relies on robust linear regression to calculate the modification extent of a given precursor relative to a within-study reference. multiFLEX-LF can analyze entire label-free discovery proteomics data sets in a precursor-centric manner without preselecting a protein of interest. To analyze modification dynamics and coregulated modifications, we hierarchically clustered the precursors of all proteins based on their computed relative modification scores. We applied multiFLEX-LF to a data-independent-acquisition-based data set acquired using the anaphase-promoting complex/cyclosome (APC/C) isolated at various time points during mitosis. The clustering of the precursors allows for identifying varying modification dynamics and ordering the modification events. Overall, multiFLEX-LF enables the fast identification of potentially differentially modified peptide precursors and the quantification of their differential modification extent in large data sets using a personal computer. Additionally, multiFLEX-LF can drive the large-scale investigation of the modification dynamics of peptide precursors in time-series and case-control studies. multiFLEX-LF is available at https://gitlab.com/SteenOmicsLab/multiflex-lf.
KW  - bioinformatics tool
KW  - label-free quantification
KW  - LC-MS
KW  - MS
KW  - post-translational modification
KW  - modification stoichiometry
KW  - PTM
KW  - quantification
Y1  - 2022
U6  - https://doi.org/10.1021/acs.jproteome.1c00669
SN  - 1535-3893
SN  - 1535-3907
VL  - 21
IS  - 4
SP  - 899
EP  - 909
PB  - American Chemical Society
CY  - Washington
ER  - 
TY  - JOUR
A1  - Wittig, Alice
A1  - Miranda, Fabio Malcher
A1  - Hölzer, Martin
A1  - Altenburg, Tom
A1  - Bartoszewicz, Jakub Maciej
A1  - Beyvers, Sebastian
A1  - Dieckmann, Marius Alfred
A1  - Genske, Ulrich
A1  - Giese, Sven Hans-Joachim
A1  - Nowicka, Melania
A1  - Richard, Hugues
A1  - Schiebenhoefer, Henning
A1  - Schmachtenberg, Anna-Juliane
A1  - Sieben, Paul
A1  - Tang, Ming
A1  - Tembrockhaus, Julius
A1  - Renard, Bernhard Y.
A1  - Fuchs, Stephan
T1  - CovRadar
BT  - continuously tracking and filtering SARS-CoV-2 mutations for genomic surveillance
JF  - Bioinformatics
N2  - The ongoing pandemic caused by SARS-CoV-2 emphasizes the importance of genomic surveillance to understand the evolution of the virus, to monitor the viral population, and to plan epidemiological responses. Detailed analysis, easy visualization and intuitive filtering of the latest viral sequences are powerful for this purpose. We present CovRadar, a tool for genomic surveillance of the SARS-CoV-2 Spike protein. CovRadar consists of an analytical pipeline and a web application that enable the analysis and visualization of hundreds of thousands of sequences. First, CovRadar extracts the regions of interest using local alignment, then builds a multiple sequence alignment, infers variants and consensus and finally presents the results in an interactive app, making accessing and reporting simple, flexible and fast.
Y1  - 2022
U6  - https://doi.org/10.1093/bioinformatics/btac411
SN  - 1367-4803
SN  - 1367-4811
VL  - 38
IS  - 17
SP  - 4223
EP  - 4225
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Omolaoye, Temidayo S.
A1  - Omolaoye, Victor Adelakun
A1  - Kandasamy, Richard K.
A1  - Hachim, Mahmood Yaseen
A1  - Du Plessis, Stefan S.
T1  - Omics and male infertility
BT  - highlighting the application of transcriptomic data
JF  - Life : open access journal
N2  - Male infertility is a multifaceted disorder affecting approximately 50% of male partners in infertile couples. 
Over the years, male infertility has been diagnosed mainly through semen analysis, hormone evaluations, medical records and physical examinations, which of course are fundamental, but yet inefficient, because 30% of male infertility cases remain idiopathic. This dilemmatic status of the unknown needs to be addressed with more sophisticated and result-driven technologies and/or techniques. 
Genetic alterations have been linked with male infertility, thereby unveiling the practicality of investigating this disorder from the "omics" perspective. 
Omics aims at analyzing the structure and functions of a whole constituent of a given biological function at different levels, including the molecular gene level (genomics), transcript level (transcriptomics), protein level (proteomics) and metabolites level (metabolomics). In the current study, an overview of the four branches of omics and their roles in male infertility are briefly discussed; the potential usefulness of assessing transcriptomic data to understand this pathology is also elucidated. 
After assessing the publicly obtainable transcriptomic data for datasets on male infertility, a total of 1385 datasets were retrieved, of which 10 datasets met the inclusion criteria and were used for further analysis. 
These datasets were classified into groups according to the disease or cause of male infertility. 
The groups include non-obstructive azoospermia (NOA), obstructive azoospermia (OA), non-obstructive and obstructive azoospermia (NOA and OA), spermatogenic dysfunction, sperm dysfunction, and Y chromosome microdeletion. 
Findings revealed that 8 genes (LDHC, PDHA2, TNP1, TNP2, ODF1, ODF2, SPINK2, PCDHB3) were commonly differentially expressed between all disease groups. 
Likewise, 56 genes were common between NOA versus NOA and OA (ADAD1, BANF2, BCL2L14, C12orf50, C20orf173, C22orf23, C6orf99, C9orf131, C9orf24, CABS1, CAPZA3, CCDC187, CCDC54, CDKN3, CEP170, CFAP206, CRISP2, CT83, CXorf65, FAM209A, FAM71F1, FAM81B, GALNTL5, GTSF1, H1FNT, HEMGN, HMGB4, KIF2B, LDHC, LOC441601, LYZL2, ODF1, ODF2, PCDHB3, PDHA2, PGK2, PIH1D2, PLCZ1, PROCA1, RIMBP3, ROPN1L, SHCBP1L, SMCP, SPATA16, SPATA19, SPINK2, TEX33, TKTL2, TMCO2, TMCO5A, TNP1, TNP2, TSPAN16, TSSK1B, TTLL2, UBQLN3). 
These genes, particularly the above-mentioned 8 genes, are involved in diverse biological processes such as germ cell development, spermatid development, spermatid differentiation, regulation of proteolysis, spermatogenesis and metabolic processes. 
Owing to the stage-specific expression of these genes, any mal-expression can ultimately lead to male infertility. 
Therefore, currently available data on all branches of omics relating to male fertility can be used to identify biomarkers for diagnosing male infertility, which can potentially help in unravelling some idiopathic cases.
KW  - male infertility
KW  - omics
KW  - genomics
KW  - transcriptomics
KW  - proteomics
KW  - metabolomics
Y1  - 2022
U6  - https://doi.org/10.3390/life12020280
SN  - 2075-1729
VL  - 12
IS  - 2
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Gamage, Dilrukshi
A1  - Staubitz, Thomas
A1  - Whiting, Mark
T1  - Peer assessment in MOOCs
BT  - Systematic literature review
JF  - Distance education
N2  - We report on a systematic review of the landscape of peer assessment in massive open online courses (MOOCs) with papers from 2014 to 2020 in 20 leading education technology publication venues across four databases containing education technology-related papers, addressing three research issues: the evolution of peer assessment in MOOCs during the period 2014 to 2020, the methods used in MOOCs to assess peers, and the challenges of and future directions in MOOC peer assessment. We provide summary statistics and a review of methods across the corpus and highlight three directions for improving the use of peer assessment in MOOCs: the need for focusing on scaling learning through peer evaluations, the need for scaling and optimizing team submissions in team peer assessments, and the need for embedding a social process for peer assessment.
KW  - MOOC
KW  - peer assessment
KW  - peer evaluation
KW  - peer review
KW  - literature review
KW  - social interaction
Y1  - 2021
U6  - https://doi.org/10.1080/01587919.2021.1911626
SN  - 0158-7919
SN  - 1475-0198
VL  - 42
IS  - 2
SP  - 268
EP  - 289
PB  - Routledge, Taylor & Francis Group
CY  - Abingdon
ER  - 
TY  - JOUR
A1  - Chandran, Sunil L.
A1  - Issac, Davis
A1  - Lauri, Juho
A1  - van Leeuwen, Erik Jan
T1  - Upper bounding rainbow connection number by forest number
JF  - Discrete mathematics
N2  - A path in an edge-colored graph is rainbow if no two edges of it are colored the same, and the graph is rainbow-connected if there is a rainbow path between each pair of its vertices. The minimum number of colors needed to rainbow-connect a graph G is the rainbow connection number of G, denoted by rc(G). A simple way to rainbow-connect a graph G is to color the edges of a spanning tree with distinct colors and then re-use any of these colors to color the remaining edges of G. This proves that rc(G) <= |V(G)| - 1. We ask whether there is a stronger connection between tree-like structures and rainbow coloring than is implied by the above trivial argument. For instance, is it possible to find an upper bound of t(G) - 1 for rc(G), where t(G) is the number of vertices in the largest induced tree of G? The answer turns out to be negative, as there are counter-examples that show that even c · t(G) is not an upper bound for rc(G) for any given constant c. In this work we show that if we consider the forest number f(G), the number of vertices in a maximum induced forest of G, instead of t(G), then surprisingly we do get an upper bound. More specifically, we prove that rc(G) <= f(G) + 2. Our result indicates a stronger connection between rainbow connection and tree-like structures than was suggested by the simple spanning tree based upper bound.
KW  - rainbow connection
KW  - forest number
KW  - upper bound
Y1  - 2022
U6  - https://doi.org/10.1016/j.disc.2022.112829
SN  - 0012-365X
SN  - 1872-681X
VL  - 345
IS  - 7
PB  - Elsevier
CY  - Amsterdam [u.a.]
ER  - 
TY  - JOUR
A1  - Hölzle, Katharina
A1  - Björk, Jennie
A1  - Boer, Harry
T1  - Light at the end of the tunnel
JF  - Creativity and innovation management
Y1  - 2021
U6  - https://doi.org/10.1111/caim.12427
SN  - 0963-1690
SN  - 1467-8691
VL  - 30
IS  - 1
SP  - 3
EP  - 5
PB  - Wiley-Blackwell
CY  - Oxford [u.a.]
ER  - 
TY  - JOUR
A1  - Navarro, Marisa
A1  - Orejas, Fernando
A1  - Pino, Elvira
A1  - Lambers, Leen
T1  - A navigational logic for reasoning about graph properties
JF  - Journal of logical and algebraic methods in programming
N2  - Graphs play an important role in many areas of Computer Science. In particular, our work is motivated by model-driven software development and by graph databases. For this reason, it is very important to have the means to express and to reason about the properties that a given graph may satisfy. With this aim, in this paper we present a visual logic that allows us to describe graph properties, including navigational properties, i.e., properties about the paths in a graph. The logic is equipped with a deductive tableau method that we have proved to be sound and complete.
KW  - Graph logic
KW  - Algebraic methods
KW  - Formal modelling
KW  - Specification
Y1  - 2021
U6  - https://doi.org/10.1016/j.jlamp.2020.100616
SN  - 2352-2208
SN  - 2352-2216
VL  - 118
PB  - Elsevier Science
CY  - Amsterdam [u.a.]
ER  - 
TY  - JOUR
A1  - de Paula, Danielly
A1  - Marx, Carolin
A1  - Wolf, Ella
A1  - Dremel, Christian
A1  - Cormican, Kathryn
A1  - Uebernickel, Falk
T1  - A managerial mental model to drive innovation in the context of digital transformation
JF  - Industry and innovation
N2  - Industry 4.0 is transforming how businesses innovate and, as a result, companies are spearheading the movement towards 'Digital Transformation'. While some scholars advocate the use of design thinking to identify new innovative behaviours, cognition experts emphasise the importance of top managers in supporting employees to develop these behaviours. However, there is a dearth of research in this domain and companies are struggling to implement the required behaviours. To address this gap, this study aims to identify and prioritise behavioural strategies conducive to design thinking to inform the creation of a managerial mental model. We identify 20 behavioural strategies from 45 interviews with practitioners and educators and combine them with the concepts of 'paradigm-mindset-mental model' from cognition theory. The paper contributes to the body of knowledge by identifying and prioritising specific behavioural strategies to form a novel set of survival conditions aligned to the new industrial paradigm of Industry 4.0.
KW  - Strategic cognition
KW  - mental models
KW  - industry 4.0
KW  - digital transformation
KW  - design thinking
Y1  - 2022
U6  - https://doi.org/10.1080/13662716.2022.2072711
SN  - 1366-2716
SN  - 1469-8390
PB  - Routledge, Taylor & Francis Group
CY  - Abingdon
ER  - 
TY  - JOUR
A1  - Ihde, Sven
A1  - Pufahl, Luise
A1  - Völker, Maximilian
A1  - Goel, Asvin
A1  - Weske, Mathias
T1  - A framework for modeling and executing task-specific resource allocations in business processes
JF  - Computing : archives for informatics and numerical computation
N2  - As resources are valuable assets, organizations have to decide which resources to allocate to business process tasks in a way that the process is executed not only effectively but also efficiently. Traditional role-based resource allocation leads to effective process executions, since each task is performed by a resource that has the required skills and competencies to do so. However, the resulting allocations are typically not as efficient as they could be, since optimization techniques have yet to find their way into traditional business process management scenarios. On the other hand, operations research provides a rich set of analytical methods for supporting problem-specific decisions on resource allocation. This paper provides a novel framework for creating transparency on existing tasks and resources, supporting individualized allocations for each activity in a process, and the possibility to integrate problem-specific analytical methods of the operations research domain. To validate the framework, the paper reports on the design and prototypical implementation of a software architecture, which extends a traditional process engine with a dedicated resource management component. This component allows us to define specific resource allocation problems at design time, and it also facilitates optimized resource allocation at run time. The framework is evaluated using a real-world parcel delivery process. The evaluation shows that the quality of the allocation results increases significantly with a technique from operations research in contrast to the traditionally applied rule-based approach.
KW  - Process Execution
KW  - Business Process Management
KW  - Resource Allocation
KW  - Resource Management
KW  - Activity-oriented Optimization
Y1  - 2022
U6  - https://doi.org/10.1007/s00607-022-01093-2
SN  - 0010-485X
SN  - 1436-5057
VL  - 104
SP  - 2405
EP  - 2429
PB  - Springer
CY  - Wien
ER  - 
TY  - JOUR
A1  - Combi, Carlo
A1  - Oliboni, Barbara
A1  - Weske, Mathias
A1  - Zerbato, Francesca
T1  - Seamless conceptual modeling of processes with transactional and analytical data
JF  - Data & knowledge engineering
N2  - In the field of Business Process Management (BPM), modeling business processes and related data is a critical issue since process activities need to manage data stored in databases. The connection between processes and data is usually handled at the implementation level, even if modeling both processes and data at the conceptual level should help designers in improving business process models and identifying requirements for implementation. Especially in data- and decision-intensive contexts, business process activities need to access data stored both in databases and data warehouses. In this paper, we complete our approach for defining a novel conceptual view that bridges process activities and data. The proposed approach allows the designer to model the connection between business processes and database models and define the operations to perform, providing interesting insights on the overall connected perspective and hints for identifying activities that are crucial for decision support.
KW  - Conceptual modeling
KW  - Business process modeling
KW  - BPMN
KW  - Data modeling
KW  - Data warehouse
KW  - Decision support
Y1  - 2021
U6  - https://doi.org/10.1016/j.datak.2021.101895
SN  - 0169-023X
SN  - 1872-6933
VL  - 134
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Rüther, Ferenc Darius
A1  - Sebode, Marcial
A1  - Lohse, Ansgar W.
A1  - Wernicke, Sarah
A1  - Böttinger, Erwin
A1  - Casar, Christian
A1  - Braun, Felix
A1  - Schramm, Christoph
T1  - Mobile app requirements for patients with rare liver diseases
BT  - a single center survey for the ERN RARE-LIVER
JF  - Clinics and research in hepatology and gastroenterology
N2  - Background: 
More patient data are needed to improve research on rare liver diseases. Mobile health apps enable an exhaustive data collection. Therefore, the European Reference Network on Hepatological diseases (ERN RARE-LIVER) intends to implement an app for patients with rare liver diseases communicating with a patient registry, but little is known about which features patients and their healthcare providers regard as being useful. 

Aims: 
This study aimed to investigate how an app for rare liver diseases would be accepted, and to find out which features are considered useful. 

Methods: 
An anonymous survey was conducted on adult patients with rare liver diseases at a single academic, tertiary care outpatient-service. Additionally, medical experts of the ERN working group on autoimmune hepatitis were invited to participate in an online survey. 

Results: 
In total, the responses from 100 patients with autoimmune (n = 90) or other rare (n = 10) liver diseases and 32 experts were analyzed. Patients were convinced to use a disease-specific app (80%) and expected some benefit to their health (78%), but responses differed significantly between younger and older patients (93% vs. 62%, p < 0.001; 88% vs. 64%, p < 0.01). Comparing patients' and experts' feedback, patients more often expected a simplified healthcare pathway (e.g. 89% vs. 59% (p < 0.001) wanted access to one's own medical records), while healthcare providers saw the benefit mainly in improving compliance and treatment outcome (e.g. 93% vs. 31% (p < 0.001) and 70% vs. 21% (p < 0.001) expected the app to reduce mistakes in taking medication and improve quality of life, respectively).
KW  - Primary sclerosing cholangitis
KW  - Primary biliary cholangitis
KW  - Autoimmune hepatitis
KW  - European reference networks
KW  - Mobile applications
KW  - Patient-reported outcome measures
Y1  - 2021
U6  - https://doi.org/10.1016/j.clinre.2021.101760
SN  - 2210-7401
SN  - 2210-741X
VL  - 45
IS  - 6
PB  - Elsevier Masson
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Koorn, Jelmer Jan
A1  - Lu, Xixi
A1  - Leopold, Henrik
A1  - Reijers, Hajo A.
T1  - From action to response to effect
BT  - mining statistical relations in work processes
JF  - Information systems : IS ; an international journal ; data bases
N2  - Process mining techniques are valuable to gain insights into and help improve (work) processes. Many of these techniques focus on the sequential order in which activities are performed. Few of these techniques consider the statistical relations within processes. In particular, existing techniques do not allow insights into how responses to an event (action) result in desired or undesired outcomes (effects). We propose and formalize the ARE miner, a novel technique that allows us to analyze and understand these action-response-effect patterns. We take a statistical approach to uncover potential dependency relations in these patterns. The goal of this research is to generate processes that are: (1) appropriately represented, and (2) effectively filtered to show meaningful relations. We evaluate the ARE miner in two ways. First, we use an artificial data set to demonstrate the effectiveness of the ARE miner compared to two traditional process-oriented approaches. Second, we apply the ARE miner to a real-world data set from a Dutch healthcare institution. We show that the ARE miner generates comprehensible representations that lead to informative insights into statistical relations between actions, responses, and effects.
KW  - Process discovery
KW  - Statistical process mining
KW  - Effect measurement
Y1  - 2022
U6  - https://doi.org/10.1016/j.is.2022.102035
SN  - 0306-4379
SN  - 0094-453X
VL  - 109
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Freitas da Cruz, Harry
A1  - Pfahringer, Boris
A1  - Martensen, Tom
A1  - Schneider, Frederic
A1  - Meyer, Alexander
A1  - Böttinger, Erwin
A1  - Schapranow, Matthieu-Patrick
T1  - Using interpretability approaches to update "black-box" clinical prediction models
BT  - an external validation study in nephrology
JF  - Artificial intelligence in medicine : AIM
N2  - Despite advances in machine learning-based clinical prediction models, only few of such models are actually deployed in clinical contexts. Among other reasons, this is due to a lack of validation studies. In this paper, we present and discuss the validation results of a machine learning model for the prediction of acute kidney injury in cardiac surgery patients initially developed on the MIMIC-III dataset when applied to an external cohort of an American research hospital. To help account for the performance differences observed, we utilized interpretability methods based on feature importance, which allowed experts to scrutinize model behavior both at the global and local level, making it possible to gain further insights into why it did not behave as expected on the validation cohort. The knowledge gleaned upon derivation can be potentially useful to assist model update during validation for more generalizable and simpler models. We argue that interpretability methods should be considered by practitioners as a further tool to help explain performance differences and inform model update in validation studies.
KW  - Clinical predictive modeling
KW  - Nephrology
KW  - Validation
KW  - Interpretability methods
Y1  - 2021
U6  - https://doi.org/10.1016/j.artmed.2020.101982
SN  - 0933-3657
SN  - 1873-2860
VL  - 111
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Aa, Han van der
A1  - Rebmann, Adrian
A1  - Leopold, Henrik
T1  - Natural language-based detection of semantic execution anomalies in event logs
JF  - Information systems : IS ; an international journal ; data bases
N2  - Anomaly detection in process mining aims to recognize outlying or unexpected behavior in event logs for purposes such as the removal of noise and identification of conformance violations. Existing techniques for this task are primarily frequency-based, arguing that behavior is anomalous because it is uncommon. However, such techniques ignore the semantics of recorded events and, therefore, do not take the meaning of potential anomalies into consideration. In this work, we overcome this caveat and focus on the detection of anomalies from a semantic perspective, arguing that anomalies can be recognized when process behavior does not make sense. To achieve this, we propose an approach that exploits the natural language associated with events. Our key idea is to detect anomalous process behavior by identifying semantically inconsistent execution patterns. To detect such patterns, we first automatically extract business objects and actions from the textual labels of events. We then compare these against a process-independent knowledge base. By populating this knowledge base with patterns from various kinds of resources, our approach can be used in a range of contexts and domains. We demonstrate the capability of our approach to successfully detect semantic execution anomalies through an evaluation based on a set of real-world and synthetic event logs and show the complementary nature of semantics-based anomaly detection to existing frequency-based techniques.
KW  - Process mining
KW  - Natural language processing
KW  - Anomaly detection
Y1  - 2021
U6  - https://doi.org/10.1016/j.is.2021.101824
SN  - 0306-4379
SN  - 1873-6076
VL  - 102
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Ulrich, Jens-Uwe
A1  - Lutfi, Ahmad
A1  - Rutzen, Kilian
A1  - Renard, Bernhard Y.
T1  - ReadBouncer
BT  - precise and scalable adaptive sampling for nanopore sequencing
JF  - Bioinformatics
N2  - Motivation: 
Nanopore sequencers allow targeted sequencing of interesting nucleotide sequences by rejecting other sequences from individual pores. This feature facilitates the enrichment of low-abundant sequences by depleting overrepresented ones in-silico. Existing tools for adaptive sampling either apply signal alignment, which cannot handle human-sized reference sequences, or apply read mapping in sequence space relying on fast graphical processing units (GPU) base callers for real-time read rejection. Using nanopore long-read mapping tools is also not optimal when mapping shorter reads as usually analyzed in adaptive sampling applications. 

Results: 
Here, we present a new approach for nanopore adaptive sampling that combines fast CPU and GPU base calling with read classification based on Interleaved Bloom Filters. ReadBouncer improves the potential enrichment of low-abundance sequences by its high read classification sensitivity and specificity, outperforming existing tools in the field. It robustly removes even reads belonging to large reference sequences while running on commodity hardware without GPUs, making adaptive sampling accessible for in-field researchers. ReadBouncer also provides a user-friendly interface and installer files for end-users without a bioinformatics background.
Y1  - 2022
U6  - https://doi.org/10.1093/bioinformatics/btac223
SN  - 1367-4803
SN  - 1460-2059
VL  - 38
IS  - SUPPL 1
SP  - 153
EP  - 160
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Richly, Keven
A1  - Schlosser, Rainer
A1  - Boissier, Martin
T1  - Budget-conscious fine-grained configuration optimization for spatio-temporal applications
JF  - Proceedings of the VLDB Endowment
N2  - Based on the performance requirements of modern spatio-temporal data mining applications, in-memory database systems are often used to store and process the data. To efficiently utilize the scarce DRAM capacities, modern database systems support various tuning possibilities to reduce the memory footprint (e.g., data compression) or increase performance (e.g., additional indexes). However, the selection of cost and performance balancing configurations is challenging due to the vast number of possible setups consisting of mutually dependent individual decisions. In this paper, we introduce a novel approach to jointly optimize the compression, sorting, indexing, and tiering configuration for spatio-temporal workloads. Further, we consider horizontal data partitioning, which enables the independent application of different tuning options on a fine-grained level. We propose different linear programming (LP) models addressing cost dependencies at different levels of accuracy to compute optimized tuning configurations for a given workload and memory budget. To yield maintainable and robust configurations, we extend our LP-based approach to incorporate reconfiguration costs as well as a worst-case optimization for potential workload scenarios. Further, we demonstrate on a real-world dataset that our models allow us to significantly reduce the memory footprint with equal performance or increase the performance with equal memory size compared to existing tuning heuristics.
Y1  - 2022
U6  - https://doi.org/10.14778/3565838.3565858
SN  - 2150-8097
VL  - 15
IS  - 13
SP  - 4079
EP  - 4092
PB  - Association for Computing Machinery (ACM)
CY  - [New York]
ER  - 
TY  - JOUR
A1  - Trilla, Irene
A1  - Drimalla, Hanna
A1  - Bajbouj, Malek
A1  - Dziobek, Isabel
T1  - The influence of reward on facial mimicry
BT  - no evidence for a significant effect of oxytocin
JF  - Frontiers in behavioral neuroscience
N2  - Recent findings suggest a role of oxytocin on the tendency to spontaneously mimic the emotional facial expressions of others. Oxytocin-related increases of facial mimicry, however, seem to be dependent on contextual factors. Given previous literature showing that people preferentially mimic emotional expressions of individuals associated with high (vs. low) rewards, we examined whether the reward value of the mimicked agent is one factor influencing the oxytocin effects on facial mimicry. To test this hypothesis, 60 male adults received 24 IU of either intranasal oxytocin or placebo in a double-blind, between-subject experiment. Next, the value of male neutral faces was manipulated using an associative learning task with monetary rewards. After the reward associations were learned, participants watched videos of the same faces displaying happy and angry expressions. Facial reactions to the emotional expressions were measured with electromyography. We found that participants judged as more pleasant the face identities associated with high reward values than with low reward values. However, happy expressions by low rewarding faces were more spontaneously mimicked than high rewarding faces. Contrary to our expectations, we did not find a significant direct effect of intranasal oxytocin on facial mimicry, nor on the reward-driven modulation of mimicry. Our results support the notion that mimicry is a complex process that depends on contextual factors, but failed to provide conclusive evidence of a role of oxytocin on the modulation of facial mimicry.
KW  - oxytocin
KW  - facial mimicry
KW  - reward
KW  - EMG
KW  - social modulation
KW  - null results
Y1  - 2020
U6  - https://doi.org/10.3389/fnbeh.2020.00088
SN  - 1662-5153
VL  - 14
PB  - Frontiers Media
CY  - Lausanne
ER  - 
TY  - JOUR
A1  - Limanowski, Jakub
A1  - Lopes, Pedro
A1  - Keck, Janis
A1  - Baudisch, Patrick
A1  - Friston, Karl
A1  - Blankenburg, Felix
T1  - Action-dependent processing of touch in the human parietal operculum and posterior insula
JF  - Cerebral Cortex
N2  - Somatosensory input generated by one's actions (i.e., self-initiated body movements) is generally attenuated. Conversely, externally caused somatosensory input is enhanced, for example, during active touch and the haptic exploration of objects. Here, we used functional magnetic resonance imaging (fMRI) to ask how the brain accomplishes this delicate weighting of self-generated versus externally caused somatosensory components. Finger movements were either self-generated by our participants or induced by functional electrical stimulation (FES) of the same muscles. During half of the trials, electrotactile impulses were administered when the (actively or passively) moving finger reached a predefined flexion threshold. fMRI revealed an interaction effect in the contralateral posterior insular cortex (pIC), which responded more strongly to touch during self-generated than during FES-induced movements. A network analysis via dynamic causal modeling revealed that connectivity from the secondary somatosensory cortex via the pIC to the supplementary motor area was generally attenuated during self-generated relative to FES-induced movements, yet specifically enhanced by touch received during self-generated, but not FES-induced movements. Together, these results suggest a crucial role of the parietal operculum and the posterior insula in differentiating self-generated from externally caused somatosensory information received from one's moving limb.
KW  - active touch
KW  - dynamic causal modeling
KW  - insula
KW  - parietal operculum
KW  - somatosensation
Y1  - 2019
U6  - https://doi.org/10.1093/cercor/bhz111
SN  - 1047-3211
SN  - 1460-2199
VL  - 30
IS  - 2
SP  - 607
EP  - 617
PB  - Oxford University Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Shi, Feng
A1  - Schirneck, Friedrich Martin
A1  - Friedrich, Tobias
A1  - Kötzing, Timo
A1  - Neumann, Frank
T1  - Reoptimization time analysis of evolutionary algorithms on linear functions under dynamic uniform constraints
JF  - Algorithmica : an international journal in computer science
N2  - Rigorous runtime analysis is a major approach towards understanding evolutionary computing techniques, and in this area linear pseudo-Boolean objective functions play a central role. Having an additional linear constraint is then equivalent to the NP-hard Knapsack problem; certain classes thereof have been studied in recent works. In this article, we present a dynamic model of optimizing linear functions under uniform constraints. Starting from an optimal solution with respect to a given constraint bound, we investigate the runtimes that different evolutionary algorithms need to recompute an optimal solution when the constraint bound changes by a certain amount. The classical (1+1) EA and several population-based algorithms are designed for that purpose, and are shown to recompute efficiently. Furthermore, a variant of the (1+(λ,λ)) GA for the dynamic optimization problem is studied, whose performance is better when the change of the constraint bound is small.
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-605295
SN  - 0178-4617
SN  - 1432-0541
VL  - 82
IS  - 10
SP  - 3117
EP  - 3123
PB  - Springer
CY  - New York
ER  - 
TY  - JOUR
A1  - Piro, Vitor C.
A1  - Dadi, Temesgen H.
A1  - Seiler, Enrico
A1  - Reinert, Knut
A1  - Renard, Bernhard Y.
T1  - ganon
BT  - precise metagenomics classification against large and up-to-date sets of reference sequences
JF  - Bioinformatics
N2  - Motivation:
The exponential growth of assembled genome sequences greatly benefits metagenomics studies. However, currently available methods struggle to manage the increasing amount of sequences and their frequent updates. Indexing the current RefSeq can take days and hundreds of GB of memory on large servers. Few methods address these issues thus far, and even though many can theoretically handle large amounts of references, time/memory requirements are prohibitive in practice. As a result, many studies that require sequence classification use often outdated and almost never truly up-to-date indices. 

Results: 
Motivated by those limitations, we created ganon, a k-mer-based read classification tool that uses Interleaved Bloom Filters in conjunction with a taxonomic clustering and a k-mer counting/filtering scheme. Ganon provides an efficient method for indexing references, keeping them updated. It requires <55 min to index the complete RefSeq of bacteria, archaea, fungi and viruses. The tool can further keep these indices up-to-date in a fraction of the time necessary to create them. Ganon makes it possible to query against very large reference sets and therefore it classifies significantly more reads and identifies more species than similar methods. When classifying a high-complexity CAMI challenge dataset against complete genomes from RefSeq, ganon shows strongly increased precision with equal or better sensitivity compared with state-of-the-art tools. With the same dataset against the complete RefSeq, ganon improved the F1-score by 65% at the genus level. It supports taxonomy- and assembly-level classification, multiple indices and hierarchical classification.
Y1  - 2020
U6  - https://doi.org/10.1093/bioinformatics/btaa458
SN  - 1367-4811
SN  - 1367-4803
VL  - 36
SP  - 12
EP  - 20
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Casel, Katrin
A1  - Dreier, Jan
A1  - Fernau, Henning
A1  - Gobbert, Moritz
A1  - Kuinke, Philipp
A1  - Villaamil, Fernando Sánchez
A1  - Schmid, Markus L.
A1  - van Leeuwen, Erik Jan
T1  - Complexity of independency and cliquy trees
JF  - Discrete applied mathematics
N2  - An independency (cliquy) tree of an n-vertex graph G is a spanning tree of G in which the set of leaves induces an independent set (clique). We study the problems of minimizing or maximizing the number of leaves of such trees, and fully characterize their parameterized complexity. We show that all four variants of deciding if an independency/cliquy tree with at least/most l leaves exists parameterized by l are either Para-NP- or W[1]-hard. We prove that minimizing the number of leaves of a cliquy tree parameterized by the number of internal vertices is Para-NP-hard too. However, we show that minimizing the number of leaves of an independency tree parameterized by the number k of internal vertices has an O*(4^k)-time algorithm and a 2k vertex kernel. Moreover, we prove that maximizing the number of leaves of an independency/cliquy tree parameterized by the number k of internal vertices both have an O*(18^k)-time algorithm and an O(k 2^k) vertex kernel, but no polynomial kernel unless the polynomial hierarchy collapses to the third level. Finally, we present an O(3^n · f(n))-time algorithm to find a spanning tree where the leaf set has a property that can be decided in f(n) time and has minimum or maximum size.
KW  - independency tree
KW  - cliquy tree
KW  - parameterized complexity
KW  - Kernelization
KW  - algorithms
KW  - exact algorithms
Y1  - 2018
U6  - https://doi.org/10.1016/j.dam.2018.08.011
SN  - 0166-218X
SN  - 1872-6771
VL  - 272
SP  - 2
EP  - 15
PB  - Elsevier
CY  - Amsterdam [u.a.]
ER  - 
TY  - JOUR
A1  - Casel, Katrin
A1  - Fischbeck, Philipp
A1  - Friedrich, Tobias
A1  - Göbel, Andreas
A1  - Lagodzinski, J. A. Gregor
T1  - Zeros and approximations of Holant polynomials on the complex plane
JF  - Computational complexity : CC
N2  - We present fully polynomial time approximation schemes for a broad class of Holant problems with complex edge weights, which we call Holant polynomials. We transform these problems into partition functions of abstract combinatorial structures known as polymers in statistical physics. Our method involves establishing zero-free regions for the partition functions of polymer models and using the most significant terms of the cluster expansion to approximate them. Results of our technique include new approximation and sampling algorithms for a diverse class of Holant polynomials in the low-temperature regime (i.e. small external field) and approximation algorithms for general Holant problems with small signature weights. Additionally, we give randomised approximation and sampling algorithms with faster running times for more restrictive classes. Finally, we improve the known zero-free regions for a perfect matching polynomial.
KW  - Holant problems
KW  - approximate counting
KW  - partition functions
KW  - graph polynomials
Y1  - 2022
U6  - https://doi.org/10.1007/s00037-022-00226-5
SN  - 1016-3328
SN  - 1420-8954
VL  - 31
IS  - 2
PB  - Springer
CY  - Basel
ER  - 
TY  - JOUR
A1  - Krestel, Ralf
A1  - Chikkamath, Renukswamy
A1  - Hewel, Christoph
A1  - Risch, Julian
T1  - A survey on deep learning for patent analysis
JF  - World patent information
N2  - Patent document collections are an immense source of knowledge for research and innovation communities worldwide. The rapid growth of the number of patent documents poses an enormous challenge for retrieving and analyzing information from this source in an effective manner. Based on deep learning methods for natural language processing, novel approaches have been developed in the field of patent analysis. The goal of these approaches is to reduce costs by automating tasks that previously only domain experts could solve. In this article, we provide a comprehensive survey of the application of deep learning for patent analysis. We summarize the state-of-the-art techniques and describe how they are applied to various tasks in the patent domain. In a detailed discussion, we categorize 40 papers based on the dataset, the representation, and the deep learning architecture that were used, as well as the patent analysis task that was targeted. With our survey, we aim to foster future research at the intersection of patent analysis and deep learning and we conclude by listing promising paths for future work.
KW  - deep learning
KW  - patent analysis
KW  - text mining
KW  - natural language processing
Y1  - 2021
U6  - https://doi.org/10.1016/j.wpi.2021.102035
SN  - 0172-2190
SN  - 1874-690X
VL  - 65
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Hagedorn, Christopher
A1  - Huegle, Johannes
A1  - Schlosser, Rainer
T1  - Understanding unforeseen production downtimes in manufacturing processes using log data-driven causal reasoning
JF  - Journal of intelligent manufacturing
N2  - In discrete manufacturing, the knowledge about causal relationships makes it possible to avoid unforeseen production downtimes by identifying their root causes. Learning causal structures from real-world settings remains challenging due to high-dimensional data, a mix of discrete and continuous variables, and requirements for preprocessing log data under the causal perspective. In our work, we address these challenges by proposing a process for causal reasoning based on raw machine log data from production monitoring. Within this process, we define a set of transformation rules to extract independent and identically distributed observations. Further, we incorporate a variable selection step to handle high dimensionality and a discretization step to include continuous variables. We enrich a commonly used causal structure learning algorithm with domain-related orientation rules, which provides a basis for causal reasoning. We demonstrate the process on a real-world dataset from a globally operating precision mechanical engineering company. The dataset contains over 40 million log data entries from production monitoring of a single machine. In this context, we determine the causal structures embedded in operational processes. Further, we examine causal effects to support machine operators in avoiding unforeseen production stops, i.e., by preventing machine operators from drawing false conclusions, based on experience, about the factors that impact unforeseen production stops.
KW  - Causal structure learning
KW  - Log data
KW  - Causal inference
KW  - Manufacturing industry
Y1  - 2022
U6  - https://doi.org/10.1007/s10845-022-01952-x
SN  - 0956-5515
SN  - 1572-8145
VL  - 33
IS  - 7
SP  - 2027
EP  - 2043
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - JOUR
A1  - Kruse, Sebastian
A1  - Kaoudi, Zoi
A1  - Contreras-Rojas, Bertty
A1  - Chawla, Sanjay
A1  - Naumann, Felix
A1  - Quiane-Ruiz, Jorge-Arnulfo
T1  - RHEEMix in the data jungle
BT  - a cost-based optimizer for cross-platform systems
JF  - The VLDB Journal
N2  - Data analytics are moving beyond the limits of a single platform. In this paper, we present the cost-based optimizer of Rheem, an open-source cross-platform system that copes with these new requirements. The optimizer allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to determine efficient data movement plans among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. We show that our optimizer can perform tasks more than one order of magnitude faster when using multiple platforms than when using a single platform.
KW  - Cross-platform
KW  - Polystore
KW  - Query optimization
KW  - Data processing
Y1  - 2020
U6  - https://doi.org/10.1007/s00778-020-00612-x
SN  - 1066-8888
SN  - 0949-877X
VL  - 29
IS  - 6
SP  - 1287
EP  - 1310
PB  - Springer
CY  - Berlin
ER  - 
TY  - JOUR
A1  - van der Aa, Han
A1  - Leopold, Henrik
A1  - Weidlich, Matthias
T1  - Partial order resolution of event logs for process conformance checking
JF  - Decision support systems : DSS
N2  - While supporting the execution of business processes, information systems record event logs. Conformance checking relies on these logs to analyze whether the recorded behavior of a process conforms to the behavior of a normative specification. A key assumption of existing conformance checking techniques, however, is that all events are associated with timestamps that allow to infer a total order of events per process instance. Unfortunately, this assumption is often violated in practice. Due to synchronization issues, manual event recordings, or data corruption, events are only partially ordered. In this paper, we put forward the problem of partial order resolution of event logs to close this gap. It refers to the construction of a probability distribution over all possible total orders of events of an instance. To cope with the order uncertainty in real-world data, we present several estimators for this task, incorporating different notions of behavioral abstraction. Moreover, to reduce the runtime of conformance checking based on partial order resolution, we introduce an approximation method that comes with a bounded error in terms of accuracy. Our experiments with real-world and synthetic data reveal that our approach improves accuracy over the state-of-the-art considerably.
KW  - process mining
KW  - conformance checking
KW  - partial order resolution
KW  - data uncertainty
Y1  - 2020
U6  - https://doi.org/10.1016/j.dss.2020.113347
SN  - 0167-9236
SN  - 1873-5797
VL  - 136
PB  - Elsevier
CY  - Amsterdam [et al.]
ER  - 
TY  - JOUR
A1  - Galka, Andreas
A1  - Moontaha, Sidratul
A1  - Siniatchkin, Michael
T1  - Constrained expectation maximisation algorithm for estimating ARMA models in state space representation
JF  - EURASIP journal on advances in signal processing
N2  - This paper discusses the fitting of linear state space models to given multivariate time series in the presence of constraints imposed on the four main parameter matrices of these models. Constraints arise partly from the assumption that the models have a block-diagonal structure, with each block corresponding to an ARMA process, that allows the reconstruction of independent source components from linear mixtures, and partly from the need to keep models identifiable. The first stage of parameter fitting is performed by the expectation maximisation (EM) algorithm. Due to the identifiability constraint, a subset of the diagonal elements of the dynamical noise covariance matrix needs to be constrained to fixed values (usually unity). For this kind of constraints, so far, no closed-form update rules were available. We present new update rules for this situation, both for updating the dynamical noise covariance matrix directly and for updating a matrix square-root of this matrix. The practical applicability of the proposed algorithm is demonstrated by a low-dimensional simulation example. The behaviour of the EM algorithm, as observed in this example, illustrates the well-known fact that in practical applications, the EM algorithm should be combined with a different algorithm for numerical optimisation, such as a quasi-Newton algorithm.
KW  - Kalman filtering
KW  - state space modelling
KW  - expectation maximisation algorithm
Y1  - 2020
U6  - https://doi.org/10.1186/s13634-020-00678-3
SN  - 1687-6180
VL  - 2020
IS  - 1
PB  - Springer
CY  - Heidelberg
ER  - 
TY  - JOUR
A1  - Serth, Sebastian
A1  - Staubitz, Thomas
A1  - van Elten, Martin
A1  - Meinel, Christoph
ED  - Gamage, Dilrukshi
T1  - Measuring the effects of course modularizations in online courses for life-long learners
JF  - Frontiers in Education
N2  - Many participants in Massive Open Online Courses are full-time employees seeking greater flexibility in their time commitment and the available learning paths. We recently addressed these requirements by splitting up our 6-week courses into three 2-week modules followed by a separate exam. Modularizing courses offers many advantages: Shorter modules are more sustainable and can be combined, reused, and incorporated into learning paths more easily. Time flexibility for learners is also improved as exams can now be offered multiple times per year, while the learning content is available independently. In this article, we answer the question of which impact this modularization has on key learning metrics, such as course completion rates, learning success, and no-show rates. Furthermore, we investigate the influence of longer breaks between modules on these metrics. According to our analysis, course modules facilitate more selective learning behaviors that encourage learners to focus on topics they are the most interested in. At the same time, participation in overarching exams across all modules seems to be less appealing compared to an integrated exam of a 6-week course. While breaks between the modules increase the distinctive appearance of individual modules, a break before the final exam further reduces initial interest in the exams. We further reveal that participation in self-paced courses as a preparation for the final exam is unlikely to attract new learners to the course offerings, even though learners' performance is comparable to instructor-paced courses. The results of our long-term study on course modularization provide a solid foundation for future research and enable educators to make informed decisions about the design of their courses.
KW  - Massive Open Online Course (MOOC)
KW  - course design
KW  - modularization
KW  - learning path
KW  - flexibility
KW  - e-learning
KW  - assignments
KW  - self-paced learning
Y1  - 2022
U6  - https://doi.org/10.3389/feduc.2022.1008545
SN  - 2504-284X
VL  - 7
PB  - Frontiers
CY  - Lausanne, Switzerland
ER  - 
TY  - JOUR
A1  - Zenner, Alexander M.
A1  - Böttinger, Erwin
A1  - Konigorski, Stefan
T1  - StudyMe
BT  - a new mobile app for user-centric N-of-1 trials
JF  - Trials
N2  - N-of-1 trials are multi-crossover self-experiments that allow individuals to systematically evaluate the effect of interventions on their personal health goals. Although several tools for N-of-1 trials exist, there is a gap in supporting non-experts in conducting their own user-centric trials. In this study, we present StudyMe, an open-source mobile application that is freely available from https://play.google.com/store/apps/details?id=health.studyu.me and offers users flexibility and guidance in configuring every component of their trials. We also present research that informed the development of StudyMe, focusing on trial creation. Through an initial survey with 272 participants, we learned that individuals are interested in a variety of personal health aspects and have unique ideas on how to improve them. In an iterative, user-centered development process with intermediate user tests, we developed StudyMe that features an educational part to communicate N-of-1 trial concepts. A final empirical evaluation of StudyMe showed that all participants were able to create their own trials successfully using StudyMe and the app achieved a very good usability rating. Our findings suggest that StudyMe provides a significant step towards enabling individuals to apply a systematic science-oriented approach to personalize health-related interventions and behavior modifications in their everyday lives.
Y1  - 2022
U6  - https://doi.org/10.1186/s13063-022-06893-7
SN  - 1745-6215
VL  - 23
PB  - BioMed Central
CY  - London
ER  - 
TY  - JOUR
A1  - Monti, Remo
A1  - Rautenstrauch, Pia
A1  - Ghanbari, Mahsa
A1  - Rani James, Alva
A1  - Kirchler, Matthias
A1  - Ohler, Uwe
A1  - Konigorski, Stefan
A1  - Lippert, Christoph
T1  - Identifying interpretable gene-biomarker associations with functionally informed kernel-based tests in 190,000 exomes
JF  - Nature Communications
N2  - Here we present an exome-wide rare genetic variant association study for 30 blood biomarkers in 191,971 individuals in the UK Biobank. We compare gene-based association tests for separate functional variant categories to increase interpretability and identify 193 significant gene-biomarker associations. Genes associated with biomarkers were ~4.5-fold enriched for conferring Mendelian disorders. In addition to performing weighted gene-based variant collapsing tests, we design and apply variant-category-specific kernel-based tests that integrate quantitative functional variant effect predictions for missense variants, splicing and the binding of RNA-binding proteins. For these tests, we present a computationally efficient combination of the likelihood-ratio and score tests that found 36% more associations than the score test alone while also controlling the type-1 error. Kernel-based tests identified 13% more associations than their gene-based collapsing counterparts and had advantages in the presence of gain of function missense variants. We introduce local collapsing by amino acid position for missense variants and use it to interpret associations and identify potential novel gain of function variants in PIEZO1. Our results show the benefits of investigating different functional mechanisms when performing rare-variant association tests, and demonstrate pervasive rare-variant contribution to biomarker variability.
Y1  - 2022
U6  - https://doi.org/10.1038/s41467-022-32864-2
SN  - 2041-1723
VL  - 13
PB  - Nature Publishing Group UK
CY  - London
ER  - 
TY  - JOUR
A1  - Fehr, Jana
A1  - Jaramillo-Gutierrez, Giovanna
A1  - Oala, Luis
A1  - Gröschel, Matthias I.
A1  - Bierwirth, Manuel
A1  - Balachandran, Pradeep
A1  - Werneck-Leite, Alixandro
A1  - Lippert, Christoph
T1  - Piloting a Survey-Based Assessment of Transparency and Trustworthiness with Three Medical AI Tools
JF  - Healthcare
N2  - Artificial intelligence (AI) offers the potential to support healthcare delivery, but poorly trained or validated algorithms bear risks of harm. Ethical guidelines stated transparency about model development and validation as a requirement for trustworthy AI. Abundant guidance exists to provide transparency through reporting, but poorly reported medical AI tools are common. To close this transparency gap, we developed and piloted a framework to quantify the transparency of medical AI tools with three use cases. Our framework comprises a survey to report on the intended use, training and validation data and processes, ethical considerations, and deployment recommendations. The transparency of each response was scored with either 0, 0.5, or 1 to reflect if the requested information was not, partially, or fully provided. Additionally, we assessed on an analogous three-point scale if the provided responses fulfilled the transparency requirement for a set of trustworthiness criteria from ethical guidelines. The degree of transparency and trustworthiness was calculated on a scale from 0% to 100%. Our assessment of three medical AI use cases pin-pointed reporting gaps and resulted in transparency scores of 67% for two use cases and one with 59%. We report anecdotal evidence that business constraints and limited information from external datasets were major obstacles to providing transparency for the three use cases. The observed transparency gaps also lowered the degree of trustworthiness, indicating compliance gaps with ethical guidelines. All three pilot use cases faced challenges to provide transparency about medical AI tools, but more studies are needed to investigate those in the wider medical AI sector. Applying this framework for an external assessment of transparency may be infeasible if business constraints prevent the disclosure of information. New strategies may be necessary to enable audits of medical AI tools while preserving business secrets.
KW  - artificial intelligence for health
KW  - quality assessment
KW  - transparency
KW  - trustworthiness
Y1  - 2022
U6  - https://doi.org/10.3390/healthcare10101923
SN  - 2227-9032
VL  - 10
IS  - 10
PB  - MDPI
CY  - Basel, Switzerland
ER  - 
TY  - JOUR
A1  - Ziegler, Joceline
A1  - Pfitzner, Bjarne
A1  - Schulz, Heinrich
A1  - Saalbach, Axel
A1  - Arnrich, Bert
T1  - Defending against Reconstruction Attacks through Differentially Private Federated Learning for Classification of Heterogeneous Chest X-ray Data
JF  - Sensors
N2  - Privacy regulations and the physical distribution of heterogeneous data are often primary concerns for the development of deep learning models in a medical context. This paper evaluates the feasibility of differentially private federated learning for chest X-ray classification as a defense against data privacy attacks. To the best of our knowledge, we are the first to directly compare the impact of differentially private training on two different neural network architectures, DenseNet121 and ResNet50. Extending the federated learning environments previously analyzed in terms of privacy, we simulated a heterogeneous and imbalanced federated setting by distributing images from the public CheXpert and Mendeley chest X-ray datasets unevenly among 36 clients. Both non-private baseline models achieved an area under the receiver operating characteristic curve (AUC) of 0.94 on the binary classification task of detecting the presence of a medical finding. We demonstrate that both model architectures are vulnerable to privacy violation by applying image reconstruction attacks to local model updates from individual clients. The attack was particularly successful during later training stages. To mitigate the risk of a privacy breach, we integrated Rényi differential privacy with a Gaussian noise mechanism into local model training. We evaluate model performance and attack vulnerability for privacy budgets ε ∈ {1, 3, 6, 10}. The DenseNet121 achieved the best utility-privacy trade-off with an AUC of 0.94 for ε = 6. Model performance deteriorated slightly for individual clients compared to the non-private baseline. The ResNet50 only reached an AUC of 0.76 in the same privacy setting. Its performance was inferior to that of the DenseNet121 for all considered privacy constraints, suggesting that the DenseNet121 architecture is more robust to differentially private training.
KW  - federated learning
KW  - privacy and security
KW  - privacy attack
KW  - X-ray
Y1  - 2022
U6  - https://doi.org/10.3390/s22145195
SN  - 1424-8220
VL  - 22
PB  - MDPI
CY  - Basel, Switzerland
IS  - 14
ER  - 
TY  - JOUR
A1  - Hecker, Pascal
A1  - Steckhan, Nico
A1  - Eyben, Florian
A1  - Schuller, Björn Wolfgang
A1  - Arnrich, Bert
T1  - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends
JF  - Frontiers in Digital Health
N2  - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance.
KW  - neurological disorders
KW  - voice
KW  - speech
KW  - everyday life
KW  - multiple modalities
KW  - machine learning
KW  - disorder recognition
Y1  - 2022
U6  - https://doi.org/10.3389/fdgth.2022.842301
SN  - 2673-253X
PB  - Frontiers Media SA
CY  - Lausanne, Switzerland
ER  - 
TY  - JOUR
A1  - Doerr, Benjamin
A1  - Kötzing, Timo
T1  - Multiplicative Up-Drift
JF  - Algorithmica
N2  - Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, however, only for the situation that the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a (1+delta)-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on 1/delta (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in delta (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large delta and show stronger results for this setting.
KW  - drift theory
KW  - evolutionary computation
KW  - stochastic process
Y1  - 2020
U6  - https://doi.org/10.1007/s00453-020-00775-7
SN  - 0178-4617
SN  - 1432-0541
VL  - 83
IS  - 10
SP  - 3017
EP  - 3058
PB  - Springer
CY  - New York
ER  - 
TY  - JOUR
A1  - Van Hout, Cristopher V.
A1  - Tachmazidou, Ioanna
A1  - Backman, Joshua D.
A1  - Hoffman, Joshua D.
A1  - Liu, Daren
A1  - Pandey, Ashutosh K.
A1  - Gonzaga-Jauregui, Claudia
A1  - Khalid, Shareef
A1  - Ye, Bin
A1  - Banerjee, Nilanjana
A1  - Li, Alexander H.
A1  - O'Dushlaine, Colm
A1  - Marcketta, Anthony
A1  - Staples, Jeffrey
A1  - Schurmann, Claudia
A1  - Hawes, Alicia
A1  - Maxwell, Evan
A1  - Barnard, Leland
A1  - Lopez, Alexander
A1  - Penn, John
A1  - Habegger, Lukas
A1  - Blumenfeld, Andrew L.
A1  - Bai, Xiaodong
A1  - O'Keeffe, Sean
A1  - Yadav, Ashish
A1  - Praveen, Kavita
A1  - Jones, Marcus
A1  - Salerno, William J.
A1  - Chung, Wendy K.
A1  - Surakka, Ida
A1  - Willer, Cristen J.
A1  - Hveem, Kristian
A1  - Leader, Joseph B.
A1  - Carey, David J.
A1  - Ledbetter, David H.
A1  - Cardon, Lon
A1  - Yancopoulos, George D.
A1  - Economides, Aris
A1  - Coppola, Giovanni
A1  - Shuldiner, Alan R.
A1  - Balasubramanian, Suganthi
A1  - Cantor, Michael
A1  - Nelson, Matthew R.
A1  - Whittaker, John
A1  - Reid, Jeffrey G.
A1  - Marchini, Jonathan
A1  - Overton, John D.
A1  - Scott, Robert A.
A1  - Abecasis, Goncalo R.
A1  - Yerges-Armstrong, Laura M.
A1  - Baras, Aris
T1  - Exome sequencing and characterization of 49,960 individuals in the UK Biobank
JF  - Nature : the international weekly journal of science
N2  - The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world(1). Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.
KW  - clinical exome
KW  - breast-cancer
KW  - mutations
KW  - recommendations
KW  - gene
KW  - metaanalysis
KW  - variants
KW  - BRCA1
KW  - risk
KW  - susceptibility
Y1  - 2020
U6  - https://doi.org/10.1038/s41586-020-2853-0
SN  - 0028-0836
SN  - 1476-4687
VL  - 586
IS  - 7831
SP  - 749
EP  - 756
PB  - Macmillan Publishers Limited
CY  - London
ER  - 
TY  - JOUR
A1  - Rezaei, Mina
A1  - Yang, Haojin
A1  - Meinel, Christoph
T1  - Recurrent generative adversarial network for learning imbalanced medical image semantic segmentation
JF  - Multimedia tools and applications : an international journal
N2  - We propose a new recurrent generative adversarial architecture named RNN-GAN to mitigate the imbalanced data problem in medical image semantic segmentation, where the number of pixels belonging to the desired object is significantly lower than the number belonging to the background. A model trained with imbalanced data tends to be biased towards healthy data, which is not desired in clinical applications, and outputs predicted by such networks have high precision and low recall. To mitigate the impact of imbalanced training data, we train RNN-GAN with a proposed complementary segmentation mask in addition to ordinary segmentation masks. The RNN-GAN consists of two components: a generator and a discriminator. The generator is trained on the sequence of medical images to learn the corresponding segmentation label map plus the proposed complementary label, both at a pixel level, while the discriminator is trained to distinguish whether a segmentation image comes from the ground truth or from the generator network. Both generator and discriminator are substituted with bidirectional LSTM units to enhance temporal consistency and obtain inter- and intra-slice representations of the features. We show evidence that the proposed framework is applicable to different types of medical images of varied sizes. In our experiments on ACDC-2017, HVSMR-2016, and LiTS-2017 benchmarks we find consistently improved results, demonstrating the efficacy of our approach.
KW  - Imbalanced medical image semantic segmentation
KW  - Recurrent generative adversarial network
Y1  - 2019
U6  - https://doi.org/10.1007/s11042-019-7305-1
SN  - 1380-7501
SN  - 1573-7721
VL  - 79
IS  - 21-22
SP  - 15329
EP  - 15348
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - JOUR
A1  - Jiang, Lan
A1  - Naumann, Felix
T1  - Holistic primary key and foreign key detection
JF  - Journal of intelligent information systems : JIIS
N2  - Primary keys (PKs) and foreign keys (FKs) are important elements of relational schemata in various applications, such as query optimization and data integration. However, in many cases, these constraints are unknown or not documented. Detecting them manually is time-consuming and even infeasible in large-scale datasets. We study the problem of discovering primary keys and foreign keys automatically and propose an algorithm to detect both, namely Holistic Primary Key and Foreign Key Detection (HoPF). PKs and FKs are subsets of the sets of unique column combinations (UCCs) and inclusion dependencies (INDs), respectively, for which efficient discovery algorithms are known. Using score functions, our approach is able to effectively extract the true PKs and FKs from the vast sets of valid UCCs and INDs. Several pruning rules are employed to speed up the procedure. We evaluate precision and recall on three benchmarks and two real-world datasets. The results show that our method is able to retrieve on average 88% of all primary keys, and 91% of all foreign keys. We compare the performance of HoPF with two baseline approaches that both assume the existence of primary keys.
KW  - Data profiling application
KW  - Primary key
KW  - Foreign key
KW  - Database management
Y1  - 2019
U6  - https://doi.org/10.1007/s10844-019-00562-z
SN  - 0925-9902
SN  - 1573-7675
VL  - 54
IS  - 3
SP  - 439
EP  - 461
PB  - Springer
CY  - Dordrecht
ER  -