004 Data processing; Computer science
Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients' anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.
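The idea of sharing models instead of raw data can be illustrated with a minimal sketch of federated averaging. This is not taken from the review itself: the linear model, the learning rate, the two example sites and all function names are assumptions chosen for illustration. Each site fits the model on its own data, and only the resulting parameters are sent to the server.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave their site


def local_update(w: float, b: float, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Tuple[float, float]:
    """Run a few epochs of full-batch gradient descent on one site's private data."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Tuple[float, float]],
                      sizes: List[int]) -> Tuple[float, float]:
    """Server step: average the site models, weighted by local sample count."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return w, b


def train(sites: List[Dataset], rounds: int = 300) -> Tuple[float, float]:
    """Alternate local training and server-side averaging; only (w, b) is exchanged."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        updates = [local_update(w, b, data) for data in sites]
        w, b = federated_average(updates, [len(d) for d in sites])
    return w, b


# Hypothetical example: two sites observe y = 2x + 1 on different x ranges.
SITE_A: Dataset = [(x, 2 * x + 1) for x in (0.0, 0.2, 0.4)]
SITE_B: Dataset = [(x, 2 * x + 1) for x in (0.6, 0.8, 1.0, 1.2)]
```

Calling `train([SITE_A, SITE_B])` recovers parameters close to the shared relationship (w near 2, b near 1) although neither site ever sees the other's data points. A real deployment would add secure aggregation, handling of non-identically distributed data, and protection against information leaking through the model updates themselves.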
Cyber warfare is a timely and relevant issue and one of the most controversial in international humanitarian law (IHL). The aim of IHL is to set rules and limits on the means and methods of warfare. In this context, a key question arises: Does digital warfare have rules or limits, and if so, how are these applicable? Traditional principles, developed over a long period, are facing a new dimension of challenges due to the rise of cyber warfare. This paper argues that to overcome this new issue, it is critical that new humanity-oriented approaches be developed with regard to cyber warfare. The challenge is to establish a legal regime for cyber-attacks that successfully addresses human rights norms and standards. While clarifying this from a legal perspective, the authors can redesign the sensitive equilibrium between humanity and military necessity, weighing the humanitarian aims of IHL and the protection of civilians (in combination with international human rights law and other relevant legal regimes) in a different manner than before.
Learning from failure
(2022)
Regression testing is a widespread practice in today's software industry to ensure software product quality. Developers derive a set of test cases, and execute them frequently to ensure that their change did not adversely affect existing functionality. As the software product and its test suite grow, the time to feedback during regression test sessions increases, and impedes programmer productivity: developers wait longer for tests to complete, and delays in fault detection render fault removal increasingly difficult.
Test case prioritization addresses the problem of long feedback loops by reordering test cases, such that test cases of high failure probability run first, and test case failures become actionable early in the testing process. We ask, given test execution schedules reconstructed from publicly available data, to what extent their fault detection efficiency can be improved, and which technique yields the most efficient test schedules with respect to APFD.
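For readers unfamiliar with the metric: APFD (average percentage of faults detected) is commonly defined for a schedule of n tests and m faults as 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where TF_i is the 1-based position of the first test that reveals fault i. The following sketch computes it under that standard definition. The mapping from faults to detecting tests and all names are assumptions for illustration; the thesis may derive faults from test failures differently.

```python
from typing import Dict, List, Set


def apfd(schedule: List[str], faults: Dict[str, Set[str]]) -> float:
    """APFD of a test schedule.

    schedule: test names in execution order.
    faults:   hypothetical mapping from a fault id to the tests that detect it;
              every detecting test is assumed to appear in the schedule.
    """
    n, m = len(schedule), len(faults)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one fault")
    position = {test: i for i, test in enumerate(schedule, start=1)}
    # Position of the first test in the schedule that reveals each fault.
    first_hits = [min(position[t] for t in detecting) for detecting in faults.values()]
    return 1.0 - sum(first_hits) / (n * m) + 1.0 / (2 * n)


# Hypothetical session: both faults are only detected by late tests.
FAULTS = {"f1": {"t5"}, "f2": {"t4", "t5"}}
ORIGINAL = ["t1", "t2", "t3", "t4", "t5"]
PRIORITIZED = ["t5", "t4", "t1", "t2", "t3"]
```

In this toy session, `apfd(ORIGINAL, FAULTS)` is about 0.2, while `apfd(PRIORITIZED, FAULTS)` is about 0.9, which shows how moving failing tests to the front raises the score.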
To this end, we recover 6200 regression test sessions from the build log files of Travis CI, a popular continuous integration service, and gather 62000 accompanying changelists. We evaluate the efficiency of current test schedules, and examine the prioritization results of state-of-the-art lightweight, history-based heuristics. We propose and evaluate a novel set of prioritization algorithms, which connect software changes and test failures in a matrix-like data structure.
Our studies indicate that the optimization potential is substantial, because the existing test plans score only 30% APFD. The predictive power of past test failures proves to be outstanding: simple heuristics, such as repeating tests with failures in recent sessions, result in efficiency scores of 95% APFD. The best-performing matrix-based heuristic achieves a similar score of 92.5% APFD. In contrast to prior approaches, we argue that matrix-based techniques are useful beyond the scope of effective prioritization, and enable a number of use cases involving software maintenance.
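A history-based heuristic of the kind mentioned above, running tests that failed in recent sessions first, might look roughly like the sketch below. It is one possible reading of "repeating tests with failures in recent sessions", not the exact heuristic evaluated in the thesis; the function name and the data layout are assumptions.

```python
from typing import Dict, List, Set


def prioritize_by_recent_failures(tests: List[str],
                                  history: List[Set[str]]) -> List[str]:
    """Order tests so that the most recently failing ones run first.

    tests:   the current schedule.
    history: failing test names per past session, oldest session first
             (a hypothetical layout chosen for this sketch).
    """
    last_failure: Dict[str, int] = {}
    for session_index, failed in enumerate(history):
        for test in failed:
            last_failure[test] = session_index  # later sessions overwrite earlier ones
    # sorted() is stable: tests with equal recency, and tests that never
    # failed (recency -1), keep their original relative order.
    return sorted(tests, key=lambda t: -last_failure.get(t, -1))


# Hypothetical history of three sessions.
HISTORY = [{"c"}, {"b"}, {"c", "d"}]
ORDER = prioritize_by_recent_failures(["a", "b", "c", "d"], HISTORY)
# ORDER is ["c", "d", "b", "a"]
```

The heuristic needs nothing beyond past test outcomes, which is what makes it lightweight compared with techniques that analyze the changed code.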
We validate our findings from continuous integration processes by extending a continuous testing tool within development environments with means of test prioritization, and pose further research questions. We think that our findings are suited to propel adoption of (continuous) testing practices, and that programmers' toolboxes should contain test prioritization as an essential productivity tool.
Several numerical tools designed to overcome the challenges of smoothing in a non-linear and non-Gaussian setting are investigated for a class of particle smoothers. The considered family of smoothers is induced by the class of linear ensemble transform filters, which contains classical filters such as the stochastic ensemble Kalman filter, the ensemble square root filter, and the recently introduced nonlinear ensemble transform filter. Furthermore, the ensemble transform particle smoother is introduced and particularly highlighted, as it is consistent in the particle limit and does not require assumptions with respect to the family of the posterior distribution. The linear update pattern of the considered class of linear ensemble transform smoothers allows one to implement important supplementary techniques such as adaptive spread corrections, hybrid formulations, and localization in order to facilitate their application to complex estimation problems. These additional features are derived and numerically investigated for a sequence of increasingly challenging test problems.
Argument mining on Twitter
(2021)
In the last decade, the field of argument mining has grown notably. However, only relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide what is, to our knowledge, the first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the structure of arguments in the context of tweet corpus annotation, and we review current progress in the task of detecting argument components and their relations in tweets. We also survey the intersection of argument mining and stance detection, before we conclude with an outlook.
N-of-1 trials are the gold standard study design to evaluate individual treatment effects and derive personalized treatment strategies. Digital tools have the potential to initiate a new era of N-of-1 trials in terms of scale and scope, but fully functional platforms are not yet available. Here, we present the open source StudyU platform, which includes the StudyU Designer and StudyU app. With the StudyU Designer, scientists are given a collaborative web application to digitally specify, publish, and conduct N-of-1 trials. The StudyU app is a smartphone app with innovative user-centric elements for participants to partake in trials published through the StudyU Designer to assess the effects of different interventions on their health. Thereby, the StudyU platform allows clinicians and researchers worldwide to easily design and conduct digital N-of-1 trials in a safe manner. We envision that StudyU can change the landscape of personalized treatments both for patients and healthy individuals, democratize and personalize evidence generation for self-optimization and medicine, and be integrated into clinical practice.
Multiplicative Up-Drift
(2020)
Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, but only in situations where the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a (1+delta)-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on 1/delta (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in delta (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large delta and show stronger results for this setting.
First-class concepts
(2022)
Ideally, programs are partitioned into independently maintainable and understandable modules. As a system grows, its architecture gradually loses the capability to accommodate new concepts in a modular way. While refactoring is expensive and not always possible, and the programming language might lack dedicated primary language constructs to express certain cross-cutting concerns, programmers are still able to explain and delineate convoluted concepts through secondary means: code comments, use of whitespace and arrangement of code, documentation, or communicating tacit knowledge.
Secondary constructs are easy to change and provide high flexibility in communicating cross-cutting concerns and other concepts among programmers. However, such secondary constructs usually have no reified representation that can be explored and manipulated as first-class entities through the programming environment.
In this exploratory work, we discuss novel ways to express a wide range of concepts, including cross-cutting concerns, patterns, and lifecycle artifacts independently of the dominant decomposition imposed by an existing architecture. We propose the representation of concepts as first-class objects inside the programming environment that retain the capability to change as easily as code comments. We explore new tools that allow programmers to view, navigate, and change programs based on conceptual perspectives. In a small case study, we demonstrate how such views can be created and how the programming experience changes from draining programmers' attention by stretching it across multiple modules toward focusing it on cohesively presented concepts. Our designs are geared toward facilitating multiple secondary perspectives on a system to co-exist in symbiosis with the original architecture, hence making it easier to explore, understand, and explain complex contexts and narratives that are hard or impossible to express using primary modularity constructs.
In this paper, we examine the conditioning of the discretization of the Helmholtz problem. Although the discrete Helmholtz problem has been studied from different perspectives, to the best of our knowledge, there is no conditioning analysis for it. We aim to fill this gap in the literature. We propose a novel method in 1D to observe the near-zero eigenvalues of a symmetric indefinite matrix. The standard classification of ill-conditioning based on the matrix condition number does not hold for the discrete Helmholtz problem. We relate the ill-conditioning of the discretization of the Helmholtz problem to the condition number of the matrix. We carry out analytical conditioning analysis in 1D and extend our observations to 2D with numerical observations. We examine several discretizations. We find different regions in which the condition number of the problem shows different characteristics. We also explain the general behavior of the solutions in these regions.
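To make the notions of near-zero eigenvalues and indefiniteness concrete, here is a minimal sketch for one common 1D setting. It is an illustration under stated assumptions, not the method proposed in the paper: the problem is taken as -u'' - k^2 u = f on (0, 1) with homogeneous Dirichlet boundary conditions, discretized by second-order central differences on n interior points. For that matrix the eigenvalues are known in closed form, so no linear algebra library is needed. All function names are hypothetical.

```python
import math
from typing import List


def helmholtz_eigenvalues(n: int, k: float) -> List[float]:
    """Eigenvalues of (1/h^2) * tridiag(-1, 2, -1) - k^2 * I with h = 1 / (n + 1).

    Closed form: lambda_j = (4 / h^2) * sin^2(j * pi * h / 2) - k^2, j = 1..n.
    """
    h = 1.0 / (n + 1)
    return [4.0 / h**2 * math.sin(j * math.pi * h / 2.0) ** 2 - k**2
            for j in range(1, n + 1)]


def condition_number(eigenvalues: List[float]) -> float:
    """2-norm condition number of a symmetric matrix: max |lambda| / min |lambda|."""
    magnitudes = [abs(e) for e in eigenvalues]
    smallest = min(magnitudes)
    return math.inf if smallest == 0.0 else max(magnitudes) / smallest


def count_negative(eigenvalues: List[float]) -> int:
    """Number of negative eigenvalues; a nonzero count means the matrix is indefinite."""
    return sum(1 for e in eigenvalues if e < 0.0)
```

With `n = 99`, the case `k = 0` gives a positive definite matrix, `k = 10` gives three negative eigenvalues, and `k = math.pi` places an eigenvalue very close to zero, so `condition_number` grows by several orders of magnitude compared with `k = 0`.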