Refine
Year of publication
Document Type
- Article (573)
- Doctoral Thesis (129)
- Monograph/Edited Volume (82)
- Other (26)
- Postprint (10)
- Conference Proceeding (8)
- Preprint (5)
- Part of a Book (4)
- Master's Thesis (3)
- Bachelor Thesis (1)
Language
- English (842) (remove)
Keywords
- answer set programming (12)
- Answer Set Programming (10)
- Answer set programming (10)
- Machine Learning (7)
- Maschinelles Lernen (7)
- Antwortmengenprogrammierung (5)
- Computer Science Education (5)
- Internet of Things (4)
- MQTT (4)
- Optimization (4)
Institute
- Institut für Informatik und Computational Science (842) (remove)
TrainTrap
(2020)
Companies develop process models to explicitly describe their business operations. In the same time, business operations, business processes, must adhere to various types of compliance requirements. Regulations, e.g., Sarbanes Oxley Act of 2002, internal policies, best practices are just a few sources of compliance requirements. In some cases, non-adherence to compliance requirements makes the organization subject to legal punishment. In other cases, non-adherence to compliance leads to loss of competitive advantage and thus loss of market share. Unlike the classical domain-independent behavioral correctness of business processes, compliance requirements are domain-specific. Moreover, compliance requirements change over time. New requirements might appear due to change in laws and adoption of new policies. Compliance requirements are offered or enforced by different entities that have different objectives behind these requirements. Finally, compliance requirements might affect different aspects of business processes, e.g., control flow and data flow. As a result, it is infeasible to hard-code compliance checks in tools. Rather, a repeatable process of modeling compliance rules and checking them against business processes automatically is needed. This thesis provides a formal approach to support process design-time compliance checking. Using visual patterns, it is possible to model compliance requirements concerning control flow, data flow and conditional flow rules. Each pattern is mapped into a temporal logic formula. The thesis addresses the problem of consistency checking among various compliance requirements, as they might stem from divergent sources. Also, the thesis contributes to automatically check compliance requirements against process models using model checking. We show that extra domain knowledge, other than expressed in compliance rules, is needed to reach correct decisions. In case of violations, we are able to provide a useful feedback to the user. The feedback is in the form of parts of the process model whose execution causes the violation. In some cases, our approach is capable of providing automated remedy of the violation.
Recent philosophical analyses of the epistemic dimension of images in the sciences show a certain trend in acknowledging potential roles of these images beyond their merely decorative or pedagogical functions. We argue, however, that this new debate has yet paid little attention to a special type of pictures, we call ‘visual metaphor’, and its versatile heuristic potential in organizing data, supporting communication, and guiding research, modeling, and theory formation. Based on a case study of Conrad Hal Waddington’s epigenetic landscape images in biology, we develop a descriptive framework applicable to heuristic roles of various visual metaphors in the sciences.
Nowadays, business processes are increasingly supported by IT services that produce massive amounts of event data during the execution of a process. These event data can be used to analyze the process using process mining techniques to discover the real process, measure conformance to a given process model, or to enhance existing models with performance information. Mapping the produced events to activities of a given process model is essential for conformance checking, annotation and understanding of process mining results. In order to accomplish this mapping with low manual effort, we developed a semi-automatic approach that maps events to activities using insights from behavioral analysis and label analysis. The approach extracts Declare constraints from both the log and the model to build matching constraints to efficiently reduce the number of possible mappings. These mappings are further reduced using techniques from natural language processing, which allow for a matching based on labels and external knowledge sources. The evaluation with synthetic and real-life data demonstrates the effectiveness of the approach and its robustness toward non-conforming execution logs.
While the maturity of process mining algorithms increases and more process mining tools enter the market, process mining projects still face the problem of different levels of abstraction when comparing events with modeled business activities. Current approaches for event log abstraction try to abstract from the events in an automated way that does not capture the required domain knowledge to fit business activities. This can lead to misinterpretation of discovered process models. We developed an approach that aims to abstract an event log to the same abstraction level that is needed by the business. We use domain knowledge extracted from existing process documentation to semi-automatically match events and activities. Our abstraction approach is able to deal with n:m relations between events and activities and also supports concurrency. We evaluated our approach in two case studies with a German IT outsourcing company. (C) 2014 Elsevier Ltd. All rights reserved.
THIS INSTALLMENT OF Research for Practice provides curated reading guides to technology for underserved communities and to new developments in personal fabrication. First, Tawanna Dillahunt describes design considerations and technology for underserved and impoverished communities. Designing for the more than 1.6 billion impoverished individuals worldwide requires special consideration of community needs, constraints, and context. Her selections span protocols for poor-quality communication networks, community-driven content generation, and resource and public service discovery. Second, Stefanie Mueller and Patrick Baudisch provide an overview of recent advances in personal fabrication (for example, 3D printers).
Autonomy is an emerging paradigm for the design and implementation of managed services and systems. Self-managed aspects frequently concern the communication of systems with their environment. Self-management subsystems are critical, they should thus be designed and implemented as high-assurance components. Here, we propose to use GEAR, a game-based model checker for the full modal mu-calculus, and derived, more user-oriented logics, as a user friendly tool that can offer automatic proofs of critical properties of such systems. Designers and engineers can interactively investigate automatically generated winning strategies resulting from the games, this way exploring the connection between the property, the system, and the proof. The benefits of the approach are illustrated on a case study that concerns the ExoMars Rover.
teaspoon
(2018)
Answer Set Programming (ASP) is an approach to declarative problem solving, combining a rich yet simple modeling language with high performance solving capacities. We here develop an ASP-based approach to curriculum-based course timetabling (CB-CTT), one of the most widely studied course timetabling problems. The resulting teaspoon system reads a CB-CTT instance of a standard input format and converts it into a set of ASP facts. In turn, these facts are combined with a first-order encoding for CB-CTT solving, which can subsequently be solved by any off-the-shelf ASP systems. We establish the competitiveness of our approach by empirically contrasting it to the best known bounds obtained so far via dedicated implementations. Furthermore, we extend the teaspoon system to multi-objective course timetabling and consider minimal perturbation problems.
The course timetabling problem can be generally defined as the task of assigning a number of lectures to a limited set of timeslots and rooms, subject to a given set of hard and soft constraints. The modeling language for course timetabling is required to be expressive enough to specify a wide variety of soft constraints and objective functions. Furthermore, the resulting encoding is required to be extensible for capturing new constraints and for switching them between hard and soft, and to be flexible enough to deal with different formulations. In this paper, we propose to make effective use of ASP as a modeling language for course timetabling. We show that our ASP-based approach can naturally satisfy the above requirements, through an ASP encoding of the curriculum-based course timetabling problem proposed in the third track of the second international timetabling competition (ITC-2007). Our encoding is compact and human-readable, since each constraint is individually expressed by either one or two rules. Each hard constraint is expressed by using integrity constraints and aggregates of ASP. Each soft constraint S is expressed by rules in which the head is the form of penalty (S, V, C), and a violation V and its penalty cost C are detected and calculated respectively in the body. We carried out experiments on four different benchmark sets with five different formulations. We succeeded either in improving the bounds or producing the same bounds for many combinations of problem instances and formulations, compared with the previous best known bounds.
Programs are often subjected to significant optimizing and parallelizing transformations based on extensive dependence analysis. Formal validation of such transformations needs modelling paradigms which can capture both control and data dependences in the program vividly. Being value-based with an inherent scope of capturing parallelism, the untimed coloured Petri net (CPN) models, reported in the literature, fit the bill well; accordingly, they are likely to be more convenient as the intermediate representations (IRs) of both the source and the transformed codes for translation validation than strictly sequential variable-based IRs like sequential control flow graphs (CFGs). In this work, an efficient path-based equivalence checking method for CPN models of programs on integers is presented. Extensive experimentation has been carried out on several sequential and parallel examples. Complexity and correctness issues have been treated rigorously for the method.
Regardless of what is intended by government curriculum
specifications and advised by educational experts, the competencies
taught and learned in and out of classrooms can vary considerably.
In this paper, we discuss in particular how we can investigate the
perceptions that individual teachers have of competencies in ICT,
and how these and other factors may influence students’ learning. We
report case study research which identifies contradictions within the
teaching of ICT competencies as an activity system, highlighting issues
concerning the object of the curriculum, the roles of the participants and
the school cultures. In a particular case, contradictions in the learning
objectives between higher order skills and the use of application tools
have been resolved by a change in the teacher’s perceptions which
have not led to changes in other aspects of the activity system. We look
forward to further investigation of the effects of these contradictions in
other case studies and on forthcoming curriculum change.
Large-scale literature mining to assess the relation between anti-cancer drugs and cancer types
(2021)
Background:
There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually.
Methods:
In order to cope with the large amount of literature we applied an automated text mining approach to assess the relations between 30 most frequent cancer types and 270 anti-cancer drugs. We applied two different approaches, a classical text mining based on named entity recognition and an AI-based approach employing word embeddings. The consistency of literature mining results was validated with 3 independent methods: first, using data from FDA approvals, second, using experimentally measured IC-50 cell line data and third, using clinical patient survival data.
Results:
We demonstrated that the automated text mining was able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed a good correspondence between the results from literature mining and independent confirmatory approaches. The relation between most frequent cancer types and drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base using the following link: .
Conclusions:
Our approach is able to assess the relations between compounds and cancer types in an automated manner. Both, cancer types and compounds could be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions, for example the identification of novel indication areas for known drugs.
Computational methods for the design of effective therapies against drug resistant HIV strains
(2005)
The development of drug resistance is a major obstacle to successful treatment of HIV infection. The extraordinary replication dynamics of HIV facilitates its escape from selective pressure exerted by the human immune system and by combination drug therapy. We have developed several computational methods whose combined use can support the design of optimal antiretroviral therapies based on viral genomic data
We introduce and investigate input-revolving finite automata, which are (nondeterministic) finite state automata with the additional ability to shift the remaining part of the input. Three different modes of shifting are considered, namely revolving to the left, revolving to the right, and circular-interchanging. We investigate the computational capacities of these three types of automata and their deterministic variants, comparing any of the six classes of automata with each other and with further classes of well-known automata. In particular, it is shown that nondeterminism is better than determinism, that is, for all three modes of shifting there is a language accepted by the nondeterministic model but not accepted by any deterministic automaton of the same type. Concerning the closure properties most of the deterministic language families studied are not closed under standard operations. For example, we show that the family of languages accepted by deterministic right-revolving finite automata is an anti-AFL which is not closed under reversal and intersection.
Graded paraconsistency
(2000)
Significant inferences
(2000)
Circumscribing inconsistency
(1997)
Compressions and extensions
(1998)
One of the main problems in machine learning is to train a predictive model from training data and to make predictions on test data. Most predictive models are constructed under the assumption that the training data is governed by the exact same distribution which the model will later be exposed to. In practice, control over the data collection process is often imperfect. A typical scenario is when labels are collected by questionnaires and one does not have access to the test population. For example, parts of the test population are underrepresented in the survey, out of reach, or do not return the questionnaire. In many applications training data from the test distribution are scarce because they are difficult to obtain or very expensive. Data from auxiliary sources drawn from similar distributions are often cheaply available. This thesis centers around learning under differing training and test distributions and covers several problem settings with different assumptions on the relationship between training and test distributions-including multi-task learning and learning under covariate shift and sample selection bias. Several new models are derived that directly characterize the divergence between training and test distributions, without the intermediate step of estimating training and test distributions separately. The integral part of these models are rescaling weights that match the rescaled or resampled training distribution to the test distribution. Integrated models are studied where only one optimization problem needs to be solved for learning under differing distributions. With a two-step approximation to the integrated models almost any supervised learning algorithm can be adopted to biased training data. In case studies on spam filtering, HIV therapy screening, targeted advertising, and other applications the performance of the new models is compared to state-of-the-art reference methods.
We address classification problems for which the training instances are governed by an input distribution that is allowed to differ arbitrarily from the test distribution-problems also referred to as classification under covariate shift. We derive a solution that is purely discriminative: neither training nor test distribution are modeled explicitly. The problem of learning under covariate shift can be written as an integrated optimization problem. Instantiating the general optimization problem leads to a kernel logistic regression and an exponential model classifier for covariate shift. The optimization problem is convex under certain conditions; our findings also clarify the relationship to the known kernel mean matching procedure. We report on experiments on problems of spam filtering, text classification, and landmine detection.
We address classification problems for which the training instances are governed by an input distribution that is allowed to differ arbitrarily from the test distribution-problems also referred to as classification under covariate shift. We derive a solution that is purely discriminative: neither training nor test distribution are modeled explicitly. The problem of learning under covariate shift can be written as an integrated optimization problem. Instantiating the general optimization problem leads to a kernel logistic regression and an exponential model classifier for covariate shift. The optimization problem is convex under certain conditions; our findings also clarify the relationship to the known kernel mean matching procedure. We report on experiments on problems of spam filtering, text classification, and landmine detection.
Through the use of next generation sequencing (NGS) technology, a lot of newly sequenced organisms are now available. Annotating those genes is one of the most challenging tasks in sequence biology. Here, we present an automated workflow to find homologue proteins, annotate sequences according to function and create a three-dimensional model.
The Berlin Brain-Computer Interface (BBCI) project develops a noninvasive BCI system whose key features are 1) the use of well-established motor competences as control paradigms, 2) high-dimensional features from 128-channel electroencephalogram (EEG), and 3) advanced machine learning techniques. As reported earlier, our experiments demonstrate that very high information transfer rates can be achieved using the readiness potential (RP) when predicting the laterality of upcoming left-versus right-hand movements in healthy subjects. A more recent study showed that the RP similarily accompanies phantom movements in arm amputees, but the signal strength decreases with longer loss of the limb. In a complementary approach, oscillatory features are used to discriminate imagined movements (left hand versus right hand versus foot). In a recent feedback study with six healthy subjects with no or very little experience with BCI control, three subjects achieved an information transfer rate above 35 bits per minute (bpm), and further two subjects above 24 and 15 bpm, while one subject could not achieve any BCI control. These results are encouraging for an EEG-based BCI system in untrained subjects that is independent of peripheral nervous system activity and does not rely on evoked potentials even when compared to results with very well-trained subjects operating other BCI systems
Interest in developing a new method of man-to-machine communication-a brain-computer interface (BCI)-has grown steadily over the past few decades. BCIs create a new communication channel between the brain and an output device by bypassing conventional motor output pathways of nerves and muscles. These systems use signals recorded from the scalp, the surface of the cortex, or from inside the brain to enable users to control a variety of applications including simple word-processing software and orthotics. BCI technology could therefore provide a new communication and control option for individuals who cannot otherwise express their wishes to the outside world. Signal processing and classification methods are essential tools in the development of improved BCI technology. We organized the BCI Competition 2003 to evaluate the current state of the art of these tools. Four laboratories well versed in EEG-based BCI research provided six data sets in a documented format. We made these data sets (i.e., labeled training sets and unlabeled test sets) and their descriptions available on the Internet. The goal in the competition was to maximize the performance measure for the test labels. Researchers worldwide tested their algorithms and competed for the best classification results. This paper describes the six data sets and the results and function of the most successful algorithms
A brain-computer interface (BCI) is a system that allows its users to control external devices with brain activity. Although the proof-of-concept was given decades ago, the reliable translation of user intent into device control commands is still a major challenge. Success requires the effective interaction of two adaptive controllers: the user's brain, which produces brain activity that encodes intent, and the BCI system, which translates that activity into device control commands. In order to facilitate this interaction, many laboratories are exploring a variety of signal analysis techniques to improve the adaptation of the BCI system to the user. In the literature, many machine learning and pattern classification algorithms have been reported to give impressive results when applied to BCI data in offline analyses. However, it is more difficult to evaluate their relative value for actual online use. BCI data competitions have been organized to provide objective formal evaluations of alternative methods. Prompted by the great interest in the first two BCI Competitions, we organized the third BCI Competition to address several of the most difficult and important analysis problems in BCI research. The paper describes the data sets that were provided to the competitors and gives an overview of the results.