Refine
Year of publication
Document Type
- Article (573)
- Doctoral Thesis (129)
- Monograph/Edited Volume (82)
- Other (26)
- Postprint (10)
- Conference Proceeding (8)
- Preprint (5)
- Part of a Book (4)
- Master's Thesis (3)
- Bachelor Thesis (1)
Language
- English (842) (remove)
Keywords
- answer set programming (12)
- Answer Set Programming (10)
- Answer set programming (10)
- Machine Learning (7)
- Maschinelles Lernen (7)
- Antwortmengenprogrammierung (5)
- Computer Science Education (5)
- Internet of Things (4)
- MQTT (4)
- Optimization (4)
Institute
- Institut für Informatik und Computational Science (842) (remove)
Accurately solving classification problems nowadays is likely to be the most relevant machine learning task. Binary classification separating two classes only is algorithmically simpler but has fewer potential applications as many real-world problems are multi-class. On the reverse, separating only a subset of classes simplifies the classification task. Even though existing multi-class machine learning algorithms are very flexible regarding the number of classes, they assume that the target set Y is fixed and cannot be restricted once the training is finished. On the other hand, existing state-of-the-art production environments are becoming increasingly interconnected with the advance of Industry 4.0 and related technologies such that additional information can simplify the respective classification problems. In light of this, the main aim of this thesis is to introduce dynamic classification that generalizes multi-class classification such that the target class set can be restricted arbitrarily to a non-empty class subset M of Y at any time between two consecutive predictions.
This task is solved by a combination of two algorithmic approaches. First, classifier calibration, which transforms predictions into posterior probability estimates that are intended to be well calibrated. The analysis provided focuses on monotonic calibration and in particular corrects wrong statements that appeared in the literature. It also reveals that bin-based evaluation metrics, which became popular in recent years, are unjustified and should not be used at all. Next, the validity of Platt scaling, which is the most relevant parametric calibration approach, is analyzed in depth. In particular, its optimality for classifier predictions distributed according to four different families of probability distributions as well its equivalence with Beta calibration up to a sigmoidal preprocessing are proven. For non-monotonic calibration, extended variants on kernel density estimation and the ensemble method EKDE are introduced. Finally, the calibration techniques are evaluated using a simulation study with complete information as well as on a selection of 46 real-world data sets.
Building on this, classifier calibration is applied as part of decomposition-based classification that aims to reduce multi-class problems to simpler (usually binary) prediction tasks. For the involved fusing step performed at prediction time, a new approach based on evidence theory is presented that uses classifier calibration to model mass functions. This allows the analysis of decomposition-based classification against a strictly formal background and to prove closed-form equations for the overall combinations. Furthermore, the same formalism leads to a consistent integration of dynamic class information, yielding a theoretically justified and computationally tractable dynamic classification model. The insights gained from this modeling are combined with pairwise coupling, which is one of the most relevant reduction-based classification approaches, such that all individual predictions are combined with a weight. This not only generalizes existing works on pairwise coupling but also enables the integration of dynamic class information.
Lastly, a thorough empirical study is performed that compares all newly introduced approaches to existing state-of-the-art techniques. For this, evaluation metrics for dynamic classification are introduced that depend on corresponding sampling strategies. Thereafter, these are applied during a three-part evaluation. First, support vector machines and random forests are applied on 26 data sets from the UCI Machine Learning Repository. Second, two state-of-the-art deep neural networks are evaluated on five benchmark data sets from a relatively recent reference work. Here, computationally feasible strategies to apply the presented algorithms in combination with large-scale models are particularly relevant because a naive application is computationally intractable. Finally, reference data from a real-world process allowing the inclusion of dynamic class information are collected and evaluated. The results show that in combination with support vector machines and random forests, pairwise coupling approaches yield the best results, while in combination with deep neural networks, differences between the different approaches are mostly small to negligible. Most importantly, all results empirically confirm that dynamic classification succeeds in improving the respective prediction accuracies. Therefore, it is crucial to pass dynamic class information in respective applications, which requires an appropriate digital infrastructure.
Answer Set Programming (ASP) is a paradigm for modeling and solving problems for knowledge representation and reasoning. There are plenty of results dedicated to studying the hardness of (fragments of) ASP. So far, these studies resulted in characterizations in terms of computational complexity as well as in fine-grained insights presented in form of dichotomy-style results, lower bounds when translating to other formalisms like propositional satisfiability (SAT), and even detailed parameterized complexity landscapes. A generic parameter in parameterized complexity originating from graph theory is the socalled treewidth, which in a sense captures structural density of a program. Recently, there was an increase in the number of treewidth-based solvers related to SAT. While there are translations from (normal) ASP to SAT, no reduction that preserves treewidth or at least keeps track of the treewidth increase is known. In this paper we propose a novel reduction from normal ASP to SAT that is aware of the treewidth, and guarantees that a slight increase of treewidth is indeed sufficient. Further, we show a new result establishing that, when considering treewidth, already the fragment of normal ASP is slightly harder than SAT (under reasonable assumptions in computational complexity). This also confirms that our reduction probably cannot be significantly improved and that the slight increase of treewidth is unavoidable. Finally, we present an empirical study of our novel reduction from normal ASP to SAT, where we compare treewidth upper bounds that are obtained via known decomposition heuristics. Overall, our reduction works better with these heuristics than existing translations. (c) 2021 Elsevier B.V. All rights reserved.
M-rate 0L systems are interactionless Lindenmayer systems together with a function assigning to every string a set of multisets of productions that may be applied simultaneously to the string. Some questions that have been left open in the forerunner papers are examined, and the computational power of deterministic M-rate 0L systems is investigated, where also tabled and extended variants are taken into consideration.
In this paper, an asynchronous design for soft error detection and correction in combinational and sequential circuits is presented. The proposed architecture is called Asynchronous Full Error Detection and Correction (AFEDC). A custom design flow with integrated commercial EDA tools generates the AFEDC using the asynchronous bundled-data design style. The AFEDC relies on an Error Detection Circuit (EDC) for protecting the combinational logic and fault-tolerant latches for protecting the sequential logic. The EDC can be implemented using different detection methods. For this work, two boundary variants are considered, the Full Duplication with Comparison (FDC) and the Partial Duplication with Parity Prediction (PDPP). The AFEDC architecture can handle single events and timing faults of arbitrarily long duration as well as the synchronous FEDC, but additionally can address known metastability issues of the FEDC and other similar synchronous architectures and provide a more practical solution for handling the error recovery process. Two case studies are developed, a carry look-ahead adder and a pipelined non-restoring array divider. Results show that the AFEDC provides equivalent fault coverage when compared to the FEDC while reducing area, ranging from 9.6% to 17.6%, and increasing energy efficiency, which can be up to 6.5%.
Eclingo
(2020)
We describe eclingo, a solver for epistemic logic programs under Gelfond 1991 semantics built upon the Answer Set Programming system clingo. The input language of eclingo uses the syntax extension capabilities of clingo to define subjective literals that, as usual in epistemic logic programs, allow for checking the truth of a regular literal in all or in some of the answer sets of a program. The eclingo solving process follows a guess and check strategy. It first generates potential truth values for subjective literals and, in a second step, it checks the obtained result with respect to the cautious and brave consequences of the program. This process is implemented using the multi-shot functionalities of clingo. We have also implemented some optimisations, aiming at reducing the search space and, therefore, increasing eclingo 's efficiency in some scenarios. Finally, we compare the efficiency of eclingo with two state-of-the-art solvers for epistemic logic programs on a pair of benchmark scenarios and show that eclingo generally outperforms their obtained results.
Many Android applications embed webpages via WebView components and execute JavaScript code within Android. Hybrid applications leverage dedicated APIs to load a resource and render it in a WebView. Furthermore, Android objects can be shared with the JavaScript world. However, bridging the interfaces of the Android and JavaScript world might also incur severe security threats: Potentially untrusted webpages and their JavaScript might interfere with the Android environment and its access to native features.
No general analysis is currently available to assess the implications of such hybrid apps bridging the two worlds. To understand the semantics and effects of hybrid apps, we perform a large-scale study on the usage of the hybridization APIs in the wild. We analyze and categorize the parameters to hybridization APIs for 7,500 randomly selected and the 196 most popular applications from the Google Playstore as well as 1000 malware samples. Our results advance the general understanding of hybrid applications, as well as implications for potential program analyses, and the current security situation: We discovered thousands of flows of sensitive data from Android to JavaScript, the vast majority of which could flow to potentially untrustworthy code. Our analysis identified numerous web pages embedding vulnerabilities, which we exemplarily exploited. Additionally, we discovered a multitude of applications in which potentially untrusted JavaScript code may interfere with (trusted) Android objects, both in benign and malign applications.
A triple modular redundancy (TMR) based design technique for double cell upsets (DCUs) mitigation is investigated in this paper. This technique adds three extra self-voter circuits into a traditional TMR structure to enable the enhanced error correction capability. Fault-injection simulations show that the soft error rate (SER) of the proposed technique is lower than 3% of that of TMR. The implementation of this proposed technique is compatible with the automatic digital design flow, and its applicability and performance are evaluated on an FIFO circuit.
As a result of CMOS scaling, radiation-induced Single-Event Effects (SEEs) in electronic circuits became a critical reliability issue for modern Integrated Circuits (ICs) operating under harsh radiation conditions. SEEs can be triggered in combinational or sequential logic by the impact of high-energy particles, leading to destructive or non-destructive faults, resulting in data corruption or even system failure. Typically, the SEE mitigation methods are deployed statically in processing architectures based on the worst-case radiation conditions, which is most of the time unnecessary and results in a resource overhead. Moreover, the space radiation conditions are dynamically changing, especially during Solar Particle Events (SPEs). The intensity of space radiation can differ over five orders of magnitude within a few hours or days, resulting in several orders of magnitude fault probability variation in ICs during SPEs. This thesis introduces a comprehensive approach for designing a self-adaptive fault resilient multiprocessing system to overcome the static mitigation overhead issue. This work mainly addresses the following topics: (1) Design of on-chip radiation particle monitor for real-time radiation environment detection, (2) Investigation of space environment predictor, as support for solar particle events forecast, (3) Dynamic mode configuration in the resilient multiprocessing system. Therefore, according to detected and predicted in-flight space radiation conditions, the target system can be configured to use no mitigation or low-overhead mitigation during non-critical periods of time. The redundant resources can be used to improve system performance or save power. On the other hand, during increased radiation activity periods, such as SPEs, the mitigation methods can be dynamically configured appropriately depending on the real-time space radiation environment, resulting in higher system reliability. Thus, a dynamic trade-off in the target system between reliability, performance and power consumption in real-time can be achieved. All results of this work are evaluated in a highly reliable quad-core multiprocessing system that allows the self-adaptive setting of optimal radiation mitigation mechanisms during run-time. Proposed methods can serve as a basis for establishing a comprehensive self-adaptive resilient system design process. Successful implementation of the proposed design in the quad-core multiprocessor shows its application perspective also in the other designs.
The highly structured nature of the educational sector demands effective policy mechanisms close to the needs of the field. That is why evidence-based policy making, endorsed by the European Commission under Erasmus+ Key Action 3, aims to make an alignment between the domains of policy and practice. Against this background, this article addresses two issues: First, that there is a vertical gap in the translation of higher-level policies to local strategies and regulations. Second, that there is a horizontal gap between educational domains regarding the policy awareness of individual players. This was analyzed in quantitative and qualitative studies with domain experts from the fields of virtual mobility and teacher training. From our findings, we argue that the combination of both gaps puts the academic bridge from secondary to tertiary education at risk, including the associated knowledge proficiency levels. We discuss the role of digitalization in the academic bridge by asking the question: which value does the involved stakeholders expect from educational policies? As a theoretical basis, we rely on the model of value co-creation for and by stakeholders. We describe the used instruments along with the obtained results and proposed benefits. Moreover, we reflect on the methodology applied, and we finally derive recommendations for future academic bridge policies.
The highly structured nature of the educational sector demands effective policy mechanisms close to the needs of the field. That is why evidence-based policy making, endorsed by the European Commission under Erasmus+ Key Action 3, aims to make an alignment between the domains of policy and practice. Against this background, this article addresses two issues: First, that there is a vertical gap in the translation of higher-level policies to local strategies and regulations. Second, that there is a horizontal gap between educational domains regarding the policy awareness of individual players. This was analyzed in quantitative and qualitative studies with domain experts from the fields of virtual mobility and teacher training. From our findings, we argue that the combination of both gaps puts the academic bridge from secondary to tertiary education at risk, including the associated knowledge proficiency levels. We discuss the role of digitalization in the academic bridge by asking the question: which value does the involved stakeholders expect from educational policies? As a theoretical basis, we rely on the model of value co-creation for and by stakeholders. We describe the used instruments along with the obtained results and proposed benefits. Moreover, we reflect on the methodology applied, and we finally derive recommendations for future academic bridge policies.
Answer Set Programming (ASP) allows us to address knowledge-intensive search and optimization problems in a declarative way due to its integrated modeling, grounding, and solving workflow. A problem is modeled using a rule based language and then grounded and solved. Solving results in a set of stable models that correspond to solutions of the modeled problem. In this thesis, we present the design and implementation of the clingo system---perhaps, the most
widely used ASP system. It features a rich modeling language originating from the field of knowledge representation and reasoning, efficient grounding algorithms based on database evaluation techniques, and high performance solving algorithms based on Boolean satisfiability (SAT) solving technology.
The contributions of this thesis lie in the design of the modeling language, the design and implementation of the grounding algorithms, and the design and implementation of an Application Programmable Interface (API) facilitating the use of ASP in real world applications and the implementation of complex forms of reasoning beyond the traditional ASP workflow.
Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models’ accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases.
Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models’ accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases.
The notion of coherence relations is quite widely accepted in general, but concrete proposals differ considerably on the questions of how they should be motivated, which relations are to be assumed, and how they should be defined. This paper takes a "bottom-up" perspective by assessing the contribution made by linguistic signals (connectives), using insights from the relevant literature as well as verification by practical text annotation. We work primarily with the German language here and focus on the realm of contrast. Thus, we suggest a new inventory of contrastive connective functions and discuss their relationship to contrastive coherence relations that have been proposed in earlier work.
Handling manufacturing and aging faults with software-based techniques in tiny embedded systems
(2017)
Non-volatile memory area occupies a large portion of the area of a chip in an embedded system. Such memories are prone to manufacturing faults, retention faults, and aging faults. The paper presents a single software based technique that allows for handling all of these fault types in tiny embedded systems without the need for hardware support. This is beneficial for low-cost embedded systems with simple memory architectures. A software infrastructure and a flow are presented that demonstrate how the presented technique is used in general for fault handling right after manufacturing and in-the-field. Moreover, a full implementation is presented for a MSP430 microcontroller, along with a discussion of the performance, overhead, and reliability impacts.
This paper describes architectural extensions for a dynamically scheduled processor, so that it can be used in three different operation modes, ranging from high-performance, to high-reliability. With minor hardware-extensions of the control path, the resources of the superscalar data-path can be used either for high-performance execution, fail-safe-operation, or fault-tolerant-operation. This makes the processor-architecture a very good candidate for applications with dynamically changing reliability requirements, e.g. for automotive applications. The paper reports the hardware-overhead for the extensions, and investigates the performance penalties introduced by the fail-safe and fault-tolerant mode. Furthermore, a comprehensive fault simulation was carried out in order to investigate the fault-coverage of the proposed approach.
Answer Set Programming (ASP) is a successful rule-based formalism for modeling and solving knowledge-intense combinatorial (optimization) problems. Despite its success in both academic and industry, open challenges like automatic source code optimization, and software engineering remains. This is because a problem encoded into an ASP might not have the desired solving performance compared to an equivalent representation. Motivated by these two challenges, this paper has three main contributions. First, we propose a developing process towards a methodology to implement ASP programs, being faithful to existing methods. Second, we present ASP encodings that serve as the basis from the developing process. Third, we demonstrate the use of ASP to reverse the standard solving process. That is, knowing answer sets in advance, and desired strong equivalent properties, “we” exhaustively reconstruct ASP programs if they exist. This paper was originally motivated by the search of propositional formulas (if they exist) that represent the semantics of a new aggregate operator. Particularly, a parity aggregate. This aggregate comes as an improvement from the already existing parity (xor) constraints from xorro, where lacks expressiveness, even though these constraints fit perfectly for reasoning modes like sampling or model counting. To this end, this extended version covers the fundaments from parity constraints as well as the xorro system. Hence, we delve a little more in the examples and the proposed methodology over parity constraints. Finally, we discuss our results by showing the only representation available, that satisfies different properties from the classical logic xor operator, which is also consistent with the semantics of parity constraints from xorro.
Continuous verification of network security compliance is an accepted need. Especially, the analysis of stateful packet filters plays a central role for network security in practice. But the few existing tools which support the analysis of stateful packet filters are based on general applicable formal methods like Satifiability Modulo Theories (SMT) or theorem prover and show runtimes in the order of minutes to hours making them unsuitable for continuous compliance verification. In this work, we address these challenges and present the concept of state shell interweaving to transform a stateful firewall rule set into a stateless rule set. This allows us to reuse any fast domain specific engine from the field of data plane verification tools leveraging smart, very fast, and domain specialized data structures and algorithms including Header Space Analysis (HSA). First, we introduce the formal language FPL that enables a high-level human-understandable specification of the desired state of network security. Second, we demonstrate the instantiation of a compliance process using a verification framework that analyzes the configuration of complex networks and devices - including stateful firewalls - for compliance with FPL policies. Our evaluation results show the scalability of the presented approach for the well known Internet2 and Stanford benchmarks as well as for large firewall rule sets where it outscales state-of-the-art tools by a factor of over 41.
The Internet can be considered as the most important infrastructure for modern society and businesses. A loss of Internet connectivity has strong negative financial impacts for businesses and economies. Therefore, assessing Internet connectivity, in particular beyond their own premises and area of direct control, is of growing importance in the face of potential failures, accidents, and malicious attacks. This paper presents CORIA, a software framework for an easy analysis of connectivity risks based on large network graphs. It provides researchers, risk analysts, network managers and security consultants with a tool to assess an organization's connectivity and paths options through the Internet backbone, including a user-friendly and insightful visual representation of results. CORIA is flexibly extensible in terms of novel data sets, graph metrics, and risk scores that enable further use cases. The performance of CORIA is evaluated by several experiments on the Internet graph and further randomly generated networks.
Supervised machine learning to assess methane emissions of a dairy building with natural ventilation
(2020)
A reliable quantification of greenhouse gas emissions is a basis for the development of adequate mitigation measures. Protocols for emission measurements and data analysis approaches to extrapolate to accurate annual emission values are a substantial prerequisite in this context. We systematically analyzed the benefit of supervised machine learning methods to project methane emissions from a naturally ventilated cattle building with a concrete solid floor and manure scraper located in Northern Germany. We took into account approximately 40 weeks of hourly emission measurements and compared model predictions using eight regression approaches, 27 different sampling scenarios and four measures of model accuracy. Data normalization was applied based on median and quartile range. A correlation analysis was performed to evaluate the influence of individual features. This indicated only a very weak linear relation between the methane emission and features that are typically used to predict methane emission values of naturally ventilated barns. It further highlighted the added value of including day-time and squared ambient temperature as features. The error of the predicted emission values was in general below 10%. The results from Gaussian processes, ordinary multilinear regression and neural networks were least robust. More robust results were obtained with multilinear regression with regularization, support vector machines and particularly the ensemble methods gradient boosting and random forest. The latter had the added value to be rather insensitive against the normalization procedure. In the case of multilinear regression, also the removal of not significantly linearly related variables (i.e., keeping only the day-time component) led to robust modeling results. We concluded that measurement protocols with 7 days and six measurement periods can be considered sufficient to model methane emissions from the dairy barn with solid floor with manure scraper, particularly when periods are distributed over the year with a preference for transition periods. Features should be normalized according to median and quartile range and must be carefully selected depending on the modeling approach.