004 Datenverarbeitung; Informatik
Refine
Year of publication
Document Type
- Article (336) (remove)
Keywords
- Informatik (16)
- Didaktik (13)
- Ausbildung (12)
- Hochschuldidaktik (12)
- Big Data (4)
- Computer Science Education (4)
- Competence Measurement (3)
- Data profiling (3)
- Informatics Education (3)
- Secondary Education (3)
Institute
- Institut für Informatik und Computational Science (150)
- Hasso-Plattner-Institut für Digital Engineering GmbH (48)
- Hasso-Plattner-Institut für Digital Engineering gGmbH (41)
- Extern (34)
- Institut für Mathematik (16)
- Fachgruppe Betriebswirtschaftslehre (12)
- Bürgerliches Recht (10)
- Wirtschaftswissenschaften (8)
- Institut für Biochemie und Biologie (4)
- Institut für Physik und Astronomie (4)
M-rate 0L systems are interactionless Lindenmayer systems together with a function assigning to every string a set of multisets of productions that may be applied simultaneously to the string. Some questions that have been left open in the forerunner papers are examined, and the computational power of deterministic M-rate 0L systems is investigated, where also tabled and extended variants are taken into consideration.
This paper shows that the law, in subtle ways, may set hitherto unrecognized incentives for the adoption of explainable machine learning applications. In doing so, we make two novel contributions. First, on the legal side, we show that to avoid liability, professional actors, such as doctors and managers, may soon be legally compelled to use explainable ML models. We argue that the importance of explainability reaches far beyond data protection law, and crucially influences questions of contractual and tort liability for the use of ML models. To this effect, we conduct two legal case studies, in medical and corporate merger applications of ML. As a second contribution, we discuss the (legally required) trade-off between accuracy and explainability and demonstrate the effect in a technical case study in the context of spam classification.
Author summary <br /> The use of orally inhaled drugs for treating lung diseases is appealing since they have the potential for lung selectivity, i.e. high exposure at the site of action -the lung- without excessive side effects. However, the degree of lung selectivity depends on a large number of factors, including physiochemical properties of drug molecules, patient disease state, and inhalation devices. To predict the impact of these factors on drug exposure and thereby to understand the characteristics of an optimal drug for inhalation, we develop a predictive mathematical framework (a "pharmacokinetic model"). In contrast to previous approaches, our model allows combining knowledge from different sources appropriately and its predictions were able to adequately predict different sets of clinical data. Finally, we compare the impact of different factors and find that the most important factors are the size of the inhaled particles, the affinity of the drug to the lung tissue, as well as the rate of drug dissolution in the lung. In contrast to the common belief, the solubility of a drug in the lining fluids is not found to be relevant. These findings are important to understand how inhaled drugs should be designed to achieve best treatment results in patients. <br /> The fate of orally inhaled drugs is determined by pulmonary pharmacokinetic processes such as particle deposition, pulmonary drug dissolution, and mucociliary clearance. Even though each single process has been systematically investigated, a quantitative understanding on the interaction of processes remains limited and therefore identifying optimal drug and formulation characteristics for orally inhaled drugs is still challenging. To investigate this complex interplay, the pulmonary processes can be integrated into mathematical models. However, existing modeling attempts considerably simplify these processes or are not systematically evaluated against (clinical) data. In this work, we developed a mathematical framework based on physiologically-structured population equations to integrate all relevant pulmonary processes mechanistically. A tailored numerical resolution strategy was chosen and the mechanistic model was evaluated systematically against data from different clinical studies. Without adapting the mechanistic model or estimating kinetic parameters based on individual study data, the developed model was able to predict simultaneously (i) lung retention profiles of inhaled insoluble particles, (ii) particle size-dependent pharmacokinetics of inhaled monodisperse particles, (iii) pharmacokinetic differences between inhaled fluticasone propionate and budesonide, as well as (iv) pharmacokinetic differences between healthy volunteers and asthmatic patients. Finally, to identify the most impactful optimization criteria for orally inhaled drugs, the developed mechanistic model was applied to investigate the impact of input parameters on both the pulmonary and systemic exposure. Interestingly, the solubility of the inhaled drug did not have any relevant impact on the local and systemic pharmacokinetics. Instead, the pulmonary dissolution rate, the particle size, the tissue affinity, and the systemic clearance were the most impactful potential optimization parameters. In the future, the developed prediction framework should be considered a powerful tool for identifying optimal drug and formulation characteristics.
There is an increasing interest in fusing data from heterogeneous sources. Combining data sources increases the utility of existing datasets, generating new information and creating services of higher quality. A central issue in working with heterogeneous sources is data migration: In order to share and process data in different engines, resource intensive and complex movements and transformations between computing engines, services, and stores are necessary.
Muses is a distributed, high-performance data migration engine that is able to interconnect distributed data stores by forwarding, transforming, repartitioning, or broadcasting data among distributed engines' instances in a resource-, cost-, and performance-adaptive manner. As such, it performs seamless information sharing across all participating resources in a standard, modular manner. We show an overall improvement of 30 % for pipelining jobs across multiple engines, even when we count the overhead of Muses in the execution time. This performance gain implies that Muses can be used to optimise large pipelines that leverage multiple engines.
Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients' anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.
Cyber warfare is a timely and relevant issue and one of the most controversial in international humanitarian law (IHL). The aim of IHL is to set rules and limits in terms of means and methods of warfare. In this context, a key question arises: Has digital warfare rules or limits, and if so, how are these applicable? Traditional principles, developed over a long period, are facing a new dimension of challenges due to the rise of cyber warfare. This paper argues that to overcome this new issue, it is critical that new humanity-oriented approaches is developed with regard to cyber warfare. The challenge is to establish a legal regime for cyber-attacks, successfully addressing human rights norms and standards. While clarifying this from a legal perspective, the authors can redesign the sensitive equilibrium between humanity and military necessity, weighing the humanitarian aims of IHL and the protection of civilians-in combination with international human rights law and other relevant legal regimes-in a different manner than before.
Several numerical tools designed to overcome the challenges of smoothing in a non-linear and non-Gaussian setting are investigated for a class of particle smoothers. The considered family of smoothers is induced by the class of linear ensemble transform filters which contains classical filters such as the stochastic ensemble Kalman filter, the ensemble square root filter, and the recently introduced nonlinear ensemble transform filter. Further the ensemble transform particle smoother is introduced and particularly highlighted as it is consistent in the particle limit and does not require assumptions with respect to the family of the posterior distribution. The linear update pattern of the considered class of linear ensemble transform smoothers allows one to implement important supplementary techniques such as adaptive spread corrections, hybrid formulations, and localization in order to facilitate their application to complex estimation problems. These additional features are derived and numerically investigated for a sequence of increasingly challenging test problems.
Argument mining on twitter
(2021)
In the last decade, the field of argument mining has grown notably. However, only relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide the, to our knowledge, first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the structure of arguments in the context of tweet corpus annotation, and we review current progress in the task of detecting argument components and their relations in tweets. We also survey the intersection of argument mining and stance detection, before we conclude with an outlook.
Multiplicative Up-Drift
(2020)
Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, however, only for the situation that the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a (1+delta)-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on 1/delta} (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in delta (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large delta and show stronger results for this setting.
First-class concepts
(2022)
Ideally, programs are partitioned into independently maintainable and understandable modules. As a system grows, its architecture gradually loses the capability to accommodate new concepts in a modular way. While refactoring is expensive and not always possible, and the programming language might lack dedicated primary language constructs to express certain cross-cutting concerns, programmers are still able to explain and delineate convoluted concepts through secondary means: code comments, use of whitespace and arrangement of code, documentation, or communicating tacit knowledge. <br /> Secondary constructs are easy to change and provide high flexibility in communicating cross-cutting concerns and other concepts among programmers. However, such secondary constructs usually have no reified representation that can be explored and manipulated as first-class entities through the programming environment. <br /> In this exploratory work, we discuss novel ways to express a wide range of concepts, including cross-cutting concerns, patterns, and lifecycle artifacts independently of the dominant decomposition imposed by an existing architecture. We propose the representation of concepts as first-class objects inside the programming environment that retain the capability to change as easily as code comments. We explore new tools that allow programmers to view, navigate, and change programs based on conceptual perspectives. In a small case study, we demonstrate how such views can be created and how the programming experience changes from draining programmers' attention by stretching it across multiple modules toward focusing it on cohesively presented concepts. Our designs are geared toward facilitating multiple secondary perspectives on a system to co-exist in symbiosis with the original architecture, hence making it easier to explore, understand, and explain complex contexts and narratives that are hard or impossible to express using primary modularity constructs.