Refine
Year of publication
Document Type
- Article (538)
- Doctoral Thesis (126)
- Monograph/Edited Volume (82)
- Other (26)
- Postprint (10)
- Conference Proceeding (5)
- Part of a Book (4)
- Preprint (4)
- Bachelor Thesis (1)
- Habilitation Thesis (1)
Language
- English (798) (remove)
Is part of the Bibliography
- yes (798) (remove)
Keywords
- answer set programming (12)
- Answer Set Programming (10)
- Answer set programming (10)
- Machine Learning (7)
- Maschinelles Lernen (7)
- Antwortmengenprogrammierung (5)
- Internet of Things (4)
- MQTT (4)
- machine learning (4)
- security (4)
Institute
- Institut für Informatik und Computational Science (798) (remove)
The protein classification workflow described in this report enables users to get information about a novel protein sequence automatically. The information is derived by different bioinformatic analysis tools which calculate or predict features of a protein sequence. Also, databases are used to compare the novel sequence with known proteins.
This paper describes the implementation of a workflow model for service-oriented computing of potential areas for wind turbines in jABC. By implementing a re-executable model the manual effort of a multi-criteria site analysis can be reduced. The aim is to determine the shift of typical geoprocessing tools of geographic information systems (GIS) from the desktop to the web. The analysis is based on a vector data set and mainly uses web services of the “Center for Spatial Information Science and Systems” (CSISS). This paper discusses effort, benefits and problems associated with the use of the web services.
Efficient deterministic approaches solving the general layout problem in graphs : Forschungsbericht
(1996)
If sites, cities, and landscapes are captured at different points in time using technology such as LiDAR, large collections of 3D point clouds result. Their efficient storage, processing, analysis, and presentation constitute a challenging task because of limited computation, memory, and time resources. In this work, we present an approach to detect changes in massive 3D point clouds based on an out-of-core spatial data structure that is designed to store data acquired at different points in time and to efficiently attribute 3D points with distance information. Based on this data structure, we present and evaluate different processing schemes optimized for performing the calculation on the CPU and GPU. In addition, we present a point-based rendering technique adapted for attributed 3D point clouds, to enable effective out-of-core real-time visualization of the computation results. Our approach enables conclusions to be drawn about temporal changes in large highly accurate 3D geodata sets of a captured area at reasonable preprocessing and rendering times. We evaluate our approach with two data sets from different points in time for the urban area of a city, describe its characteristics, and report on applications.
The radiation-sensitive field-effect transistors (RADFETs) with an oxide thickness of 400 nm are irradiated with gate voltages of 2, 4 and 6 V, and without gate voltage.
A detailed analysis of the mechanisms responsible for the creation of traps during irradiation is performed.
The creation of the traps in the oxide, near and at the silicon/silicon-dioxide (Si/SiO2) interface during irradiation is modelled very well. This modelling can also be used for other MOS transistors containing SiO2.
The behaviour of radiation traps during postirradiation annealing is analysed, and the corresponding functions for their modelling are obtained. The switching traps (STs) do not have significant influence on threshold voltage shift, and two radiation-induced trap types fit the fixed traps (FTs) very well. The fading does not depend on the positive gate voltage applied during irradiation, but it is twice lower in case there is no gate voltage.
A new dosimetric parameter, called the Golden Ratio (GR), is proposed, which represents the ratio between the threshold voltage shift after irradiation and fading after spontaneous annealing. This parameter can be useful for comparing MOS dosimeters.
Most information systems log events (e.g., transaction logs, audit traits) to audit and monitor the processes they support. At the same time, many of these processes have been explicitly modeled. For example, SAP R/3 logs events in transaction logs and there are EPCs (Event-driven Process Chains) describing the so-called reference models. These reference models describe how the system should be used. The coexistence of event logs and process models raises an interesting question: "Does the event log conform to the process model and vice versa?". This paper demonstrates that there is not a simple answer to this question. To tackle the problem, we distinguish two dimensions of conformance: fitness (the event log may be the result of the process modeled) and appropriateness (the model is a likely candidate from a structural and behavioral point of view). Different metrics have been defined and a Conformance Checker has been implemented within the ProM Framework
In this work we consider statistical learning problems. A learning machine aims to extract information from a set of training examples such that it is able to predict the associated label on unseen examples. We consider the case where the resulting classification or regression rule is a combination of simple rules - also called base hypotheses. The so-called boosting algorithms iteratively find a weighted linear combination of base hypotheses that predict well on unseen data. We address the following issues: o The statistical learning theory framework for analyzing boosting methods. We study learning theoretic guarantees on the prediction performance on unseen examples. Recently, large margin classification techniques emerged as a practical result of the theory of generalization, in particular Boosting and Support Vector Machines. A large margin implies a good generalization performance. Hence, we analyze how large the margins in boosting are and find an improved algorithm that is able to generate the maximum margin solution. o How can boosting methods be related to mathematical optimization techniques? To analyze the properties of the resulting classification or regression rule, it is of high importance to understand whether and under which conditions boosting converges. We show that boosting can be used to solve large scale constrained optimization problems, whose solutions are well characterizable. To show this, we relate boosting methods to methods known from mathematical optimization, and derive convergence guarantees for a quite general family of boosting algorithms. o How to make Boosting noise robust? One of the problems of current boosting techniques is that they are sensitive to noise in the training sample. In order to make boosting robust, we transfer the soft margin idea from support vector learning to boosting. We develop theoretically motivated regularized algorithms that exhibit a high noise robustness. o How to adapt boosting to regression problems? Boosting methods are originally designed for classification problems. To extend the boosting idea to regression problems, we use the previous convergence results and relations to semi-infinite programming to design boosting-like algorithms for regression problems. We show that these leveraging algorithms have desirable theoretical and practical properties. o Can boosting techniques be useful in practice? The presented theoretical results are guided by simulation results either to illustrate properties of the proposed algorithms or to show that they work well in practice. We report on successful applications in a non-intrusive power monitoring system, chaotic time series analysis and a drug discovery process. --- Anmerkung: Der Autor ist Träger des von der Mathematisch-Naturwissenschaftlichen Fakultät der Universität Potsdam vergebenen Michelson-Preises für die beste Promotion des Jahres 2001/2002.
SVM and boosting : one class
(2000)
Robust ensemble learning
(2000)
The Internet of Things (IoT) is a system of physical objects that can be discovered, monitored, controlled, or interacted with by electronic devices that communicate over various networking interfaces and eventually can be connected to the wider Internet. [Guinard and Trifa, 2016]. IoT devices are equipped with sensors and/or actuators and may be constrained in terms of memory, computational power, network bandwidth, and energy. Interoperability can help to manage such heterogeneous devices. Interoperability is the ability of different types of systems to work together smoothly. There are four levels of interoperability: physical, network and transport, integration, and data. The data interoperability is subdivided into syntactic and semantic data. Semantic data describes the meaning of data and the common understanding of vocabulary e.g. with the help of dictionaries, taxonomies, ontologies. To achieve interoperability, semantic interoperability is necessary.
Many organizations and companies are working on standards and solutions for interoperability in the IoT. However, the commercial solutions produce a vendor lock-in. They focus on centralized approaches such as cloud-based solutions. This thesis proposes a decentralized approach namely Edge Computing. Edge Computing is based on the concepts of mesh networking and distributed processing. This approach has an advantage that information collection and processing are placed closer to the sources of this information. The goals are to reduce traffic, latency, and to be robust against a lossy or failed Internet connection.
We see management of IoT devices from the network configuration management perspective. This thesis proposes a framework for network configuration management of heterogeneous, constrained IoT devices by using semantic descriptions for interoperability. The MYNO framework is an acronym for MQTT, YANG, NETCONF and Ontology. The NETCONF protocol is the IETF standard for network configuration management. The MQTT protocol is the de-facto standard in the IoT. We picked up the idea of the NETCONF-MQTT bridge, originally proposed by Scheffler and Bonneß[2017], and extended it with semantic device descriptions. These device descriptions provide a description of the device capabilities. They are based on the oneM2M Base ontology and formalized by the Semantic Web Standards.
The novel approach is using a ontology-based device description directly on a constrained device in combination with the MQTT protocol. The bridge was extended in order to query such descriptions. Using a semantic annotation, we achieved that the device capabilities are self-descriptive, machine readable and re-usable.
The concept of a Virtual Device was introduced and implemented, based on semantic device descriptions. A Virtual Device aggregates the capabilities of all devices at the edge network and contributes therefore to the scalability. Thus, it is possible to control all devices via a single RPC call.
The model-driven NETCONF Web-Client is generated automatically from this YANG model which is generated by the bridge based on the semantic device description. The Web-Client provides a user-friendly interface, offers RPC calls and displays sensor values. We demonstrate the feasibility of this approach in different use cases: sensor and actuator scenarios, as well as event configuration and triggering.
The semantic approach results in increased memory overhead. Therefore, we evaluated CBOR and RDF HDT for optimization of ontology-based device descriptions for use on constrained devices. The evaluation shows that CBOR is not suitable for long strings and RDF HDT is a promising candidate but is still a W3C Member Submission. Finally, we used an optimized JSON-LD format for the syntax of the device descriptions.
One of the security tasks of network management is the distribution of firmware updates. The MYNO Update Protocol (MUP) was developed and evaluated on constrained devices CC2538dk and 6LoWPAN. The MYNO update process is focused on freshness and authenticity of the firmware. The evaluation shows that it is challenging but feasible to bring the firmware updates to constrained devices using MQTT. As a new requirement for the next MQTT version, we propose to add a slicing feature for the better support of constrained devices. The MQTT broker should slice data to the maximum packet size specified by the device and transfer it slice-by-slice.
For the performance and scalability evaluation of MYNO framework, we setup the High Precision Agriculture demonstrator with 10 ESP-32 NodeMCU boards at the edge of the network. The ESP-32 NodeMCU boards, connected by WLAN, were equipped with six sensors and two actuators. The performance evaluation shows that the processing of ontology-based descriptions on a Raspberry Pi 3B with the RDFLib is a challenging task regarding computational power. Nevertheless, it is feasible because it must be done only once per device during the discovery process.
The MYNO framework was tested with heterogeneous devices such as CC2538dk from Texas Instruments, Arduino Yún Rev 3, and ESP-32 NodeMCU, and IP-based networks such as 6LoWPAN and WLAN.
Summarizing, with the MYNO framework we could show that the semantic approach on constrained devices is feasible in the IoT.
MUP
(2020)
Message Queuing Telemetry Transport (MQTT) is one of the dominating protocols for edge- and cloud-based Internet of Things (IoT) solutions. When a security vulnerability of an IoT device is known, it has to be fixed as soon as possible. This requires a firmware update procedure. In this paper, we propose a secure update protocol for MQTT-connected devices which ensures the freshness of the firmware, authenticates the new firmware and considers constrained devices. We show that the update protocol is easy to integrate in an MQTT-based IoT network using a semantic approach. The feasibility of our approach is demonstrated by a detailed performance analysis of our prototype implementation on a IoT device with 32 kB RAM. Thereby, we identify design issues in MQTT 5 which can help to improve the support of constrained devices.
MUP
(2020)
Message Queuing Telemetry Transport (MQTT) is one of the dominating protocols for edge- and cloud-based Internet of Things (IoT) solutions. When a security vulnerability of an IoT device is known, it has to be fixed as soon as possible. This requires a firmware update procedure. In this paper, we propose a secure update protocol for MQTT-connected devices which ensures the freshness of the firmware, authenticates the new firmware and considers constrained devices. We show that the update protocol is easy to integrate in an MQTT-based IoT network using a semantic approach. The feasibility of our approach is demonstrated by a detailed performance analysis of our prototype implementation on a IoT device with 32 kB RAM. Thereby, we identify design issues in MQTT 5 which can help to improve the support of constrained devices.
An IoT network may consist of hundreds heterogeneous devices. Some of them may be constrained in terms of memory, power, processing and network capacity. Manual network and service management of IoT devices are challenging. We propose a usage of an ontology for the IoT device descriptions enabling automatic network management as well as service discovery and aggregation. Our IoT architecture approach ensures interoperability using existing standards, i.e. MQTT protocol and SemanticWeb technologies. We herein introduce virtual IoT devices and their semantic framework deployed at the edge of network. As a result, virtual devices are enabled to aggregate capabilities of IoT devices, derive new services by inference, delegate requests/responses and generate events. Furthermore, they can collect and pre-process sensor data. These tasks on the edge computing overcome the shortcomings of the cloud usage regarding siloization, network bandwidth, latency and speed. We validate our proposition by implementing a virtual device on a Raspberry Pi.
Software-as-a-Service (SaaS) offers several advantages to both service providers and users. Service providers can benefit from the reduction of Total Cost of Ownership (TCO), better scalability, and better resource utilization. On the other hand, users can use the service anywhere and anytime, and minimize upfront investment by following the pay-as-you-go model. Despite the benefits of SaaS, users still have concerns about the security and privacy of their data. Due to the nature of SaaS and the Cloud in general, the data and the computation are beyond the users' control, and hence data security becomes a vital factor in this new paradigm. Furthermore, in multi-tenant SaaS applications, the tenants become more concerned about the confidentiality of their data since several tenants are co-located onto a shared infrastructure.
To address those concerns, we start protecting the data from the provisioning process by controlling how tenants are being placed in the infrastructure. We present a resource allocation algorithm designed to minimize the risk of co-resident tenants called SecPlace. It enables the SaaS provider to control the resource (i.e., database instance) allocation process while taking into account the security of tenants as a requirement.
Due to the design principles of the multi-tenancy model, tenants follow some degree of sharing on both application and infrastructure levels. Thus, strong security-isolation should be present. Therefore, we develop SignedQuery, a technique that prevents one tenant from accessing others' data. We use the Signing Concept to create a signature that is used to sign the tenant's request, then the server can verifies the signature and recognizes the requesting tenant, and hence ensures that the data to be accessed is belonging to the legitimate tenant.
Finally, Data confidentiality remains a critical concern due to the fact that data in the Cloud is out of users' premises, and hence beyond their control. Cryptography is increasingly proposed as a potential approach to address such a challenge. Therefore, we present SecureDB, a system designed to run SQL-based applications over an encrypted database. SecureDB captures the schema design and analyzes it to understand the internal structure of the data (i.e., relationships between the tables and their attributes). Moreover, we determine the appropriate partialhomomorphic encryption scheme for each attribute where computation is possible even when the data is encrypted.
To evaluate our work, we conduct extensive experiments with di↵erent settings. The main use case in our work is a popular open source HRM application, called OrangeHRM. The results show that our multi-layered approach is practical, provides enhanced security and isolation among tenants, and have a moderate complexity in terms of processing encrypted data.
A method of construction of combinational self-checking units with detection of all single faults
(1999)
The field of machine learning studies algorithms that infer predictive models from data. Predictive models are applicable for many practical tasks such as spam filtering, face and handwritten digit recognition, and personalized product recommendation. In general, they are used to predict a target label for a given data instance. In order to make an informed decision about the deployment of a predictive model, it is crucial to know the model’s approximate performance. To evaluate performance, a set of labeled test instances is required that is drawn from the distribution the model will be exposed to at application time. In many practical scenarios, unlabeled test instances are readily available, but the process of labeling them can be a time- and cost-intensive task and may involve a human expert. This thesis addresses the problem of evaluating a given predictive model accurately with minimal labeling effort. We study an active model evaluation process that selects certain instances of the data according to an instrumental sampling distribution and queries their labels. We derive sampling distributions that minimize estimation error with respect to different performance measures such as error rate, mean squared error, and F-measures. An analysis of the distribution that governs the estimator leads to confidence intervals, which indicate how precise the error estimation is. Labeling costs may vary across different instances depending on certain characteristics of the data. For instance, documents differ in their length, comprehensibility, and technical requirements; these attributes affect the time a human labeler needs to judge relevance or to assign topics. To address this, the sampling distribution is extended to incorporate instance-specific costs. We empirically study conditions under which the active evaluation processes are more accurate than a standard estimate that draws equally many instances from the test distribution. We also address the problem of comparing the risks of two predictive models. The standard approach would be to draw instances according to the test distribution, label the selected instances, and apply statistical tests to identify significant differences. Drawing instances according to an instrumental distribution affects the power of a statistical test. We derive a sampling procedure that maximizes test power when used to select instances, and thereby minimizes the likelihood of choosing the inferior model. Furthermore, we investigate the task of comparing several alternative models; the objective of an evaluation could be to rank the models according to the risk that they incur or to identify the model with lowest risk. An experimental study shows that the active procedure leads to higher test power than the standard test in many application domains. Finally, we study the problem of evaluating the performance of ranking functions, which are used for example for web search. In practice, ranking performance is estimated by applying a given ranking model to a representative set of test queries and manually assessing the relevance of all retrieved items for each query. We apply the concepts of active evaluation and active comparison to ranking functions and derive optimal sampling distributions for the commonly used performance measures Discounted Cumulative Gain and Expected Reciprocal Rank. Experiments on web search engine data illustrate significant reductions in labeling costs.
Evaluating the quality of ranking functions is a core task in web search and other information retrieval domains. Because query distributions and item relevance change over time, ranking models often cannot be evaluated accurately on held-out training data. Instead, considerable effort is spent on manually labeling the relevance of query results for test queries in order to track ranking performance. We address the problem of estimating ranking performance as accurately as possible on a fixed labeling budget. Estimates are based on a set of most informative test queries selected by an active sampling distribution. Query labeling costs depend on the number of result items as well as item-specific attributes such as document length. We derive cost-optimal sampling distributions for the commonly used performance measures Discounted Cumulative Gain and Expected Reciprocal Rank. Experiments on web search engine data illustrate significant reductions in labeling costs.
Multi tenancy for cloud-based in-memory column databases : workload management and data placement
(2013)
The family of default logics
(1998)
Preferred well-founded semantics for logic programming by alternating fixpoints : preliminary report
(2002)
Answer Set Programming faces an increasing popularity for problem solving in various domains. While its modeling language allows us to express many complex problems in an easy way, its solving technology enables their effective resolution. In what follows, we detail some of the key factors of its success. Answer Set Programming [ASP; Brewka et al. Commun ACM 54(12):92–103, (2011)] is seeing a rapid proliferation in academia and industry due to its easy and flexible way to model and solve knowledge-intense combinatorial (optimization) problems. To this end, ASP offers a high-level modeling language paired with high-performance solving technology. As a result, ASP systems provide out-off-the-box, general-purpose search engines that allow for enumerating (optimal) solutions. They are represented as answer sets, each being a set of atoms representing a solution. The declarative approach of ASP allows a user to concentrate on a problem’s specification rather than the computational means to solve it. This makes ASP a prime candidate for rapid prototyping and an attractive tool for teaching key AI techniques since complex problems can be expressed in a succinct and elaboration tolerant way. This is eased by the tuning of ASP’s modeling language to knowledge representation and reasoning (KRR). The resulting impact is nicely reflected by a growing range of successful applications of ASP [Erdem et al. AI Mag 37(3):53–68, 2016; Falkner et al. Industrial applications of answer set programming. K++nstliche Intelligenz (2018)]
Location analyses are among the most common tasks while working with spatial data and geographic information systems. Automating the most frequently used procedures is therefore an important aspect of improving their usability. In this context, this project aims to design and implement a workflow, providing some basic tools for a location analysis. For the implementation with jABC, the workflow was applied to the problem of finding a suitable location for placing an artificial reef. For this analysis three parameters (bathymetry, slope and grain size of the ground material) were taken into account, processed, and visualized with the The Generic Mapping Tools (GMT), which were integrated into the workflow as jETI-SIBs. The implemented workflow thereby showed that the approach to combine jABC with GMT resulted in an user-centric yet user-friendly tool with high-quality cartographic outputs.
This paper analyses data privacy issues as they arise from different deployment scenarios for networks that use embedded sensor devices. Maintaining data privacy in pervasive environments requires the management and implementation of privacy protection measures close to the data source. We propose a set of atomic privacy parameters that is generic enough to form specific privacy classes and might be applied directly at the embedded sensor device.
The UDKM1DSIM toolbox is a collection of MATLAB (MathWorks Inc.) classes and routines to simulate the structural dynamics and the according X-ray diffraction response in one-dimensional crystalline sample structures upon an arbitrary time-dependent external stimulus, e.g. an ultrashort laser pulse. The toolbox provides the capabilities to define arbitrary layered structures on the atomic level including a rich database of corresponding element-specific physical properties. The excitation of ultrafast dynamics is represented by an N-temperature model which is commonly applied for ultrafast optical excitations. Structural dynamics due to thermal stress are calculated by a linear-chain model of masses and springs. The resulting X-ray diffraction response is computed by dynamical X-ray theory. The UDKM1DSIM toolbox is highly modular and allows for introducing user-defined results at any step in the simulation procedure.
Program summary
Program title: udkm1Dsim
Catalogue identifier: AERH_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AERH_v1_0.html
Licensing provisions: BSD
No. of lines in distributed program, including test data, etc.: 130221
No. of bytes in distributed program, including test data, etc.: 2746036
Distribution format: tar.gz
Programming language: Matlab (MathWorks Inc.).
Computer: PC/Workstation.
Operating system: Running Matlab installation required (tested on MS Win XP -7, Ubuntu Linux 11.04-13.04).
Has the code been vectorized or parallelized?: Parallelization for dynamical XRD computations. Number of processors used: 1-12 for Matlab Parallel Computing Toolbox; 1 - infinity for Matlab Distributed Computing Toolbox
External routines:
Optional: Matlab Parallel Computing Toolbox, Matlab Distributed Computing Toolbox Required (included in the package): mtimesx Fast Matrix Multiply for Matlab by James Tursa, xml io tools by Jaroslaw Tuszynski, textprogressbar by Paul Proteus
Nature of problem:
Simulate the lattice dynamics of 1D crystalline sample structures due to an ultrafast excitation including thermal transport and compute the corresponding transient X-ray diffraction pattern.
Solution method:
Restrictions:
The program is restricted to 1D sample structures and is further limited to longitudinal acoustic phonon modes and symmetrical X-ray diffraction geometries.
Unusual features: The program is highly modular and allows the inclusion of user-defined inputs at any time of the simulation procedure.
Running time: The running time is highly dependent on the number of unit cells in the sample structure and other simulation parameters such as time span or angular grid for X-ray diffraction computations. However, the example files are computed in approx. 1-5 min each on a 8 Core Processor with 16 GB RAM available.
Stripe rust (Pst) is a major disease of wheat crops leading untreated to severe yield losses. The use of fungicides is often essential to control Pst when sudden outbreaks are imminent. Sensors capable of detecting Pst in wheat crops could optimize the use of fungicides and improve disease monitoring in high-throughput field phenotyping. Now, deep learning provides new tools for image recognition and may pave the way for new camera based sensors that can identify symptoms in early stages of a disease outbreak within the field. The aim of this study was to teach an image classifier to detect Pst symptoms in winter wheat canopies based on a deep residual neural network (ResNet). For this purpose, a large annotation database was created from images taken by a standard RGB camera that was mounted on a platform at a height of 2 m. Images were acquired while the platform was moved over a randomized field experiment with Pst-inoculated and Pst-free plots of winter wheat. The image classifier was trained with 224 x 224 px patches tiled from the original, unprocessed camera images. The image classifier was tested on different stages of the disease outbreak. At patch level the image classifier reached a total accuracy of 90%. To test the image classifier on image level, the image classifier was evaluated with a sliding window using a large striding length of 224 px allowing for fast test performance. At image level, the image classifier reached a total accuracy of 77%. Even in a stage with very low disease spreading (0.5%) at the very beginning of the Pst outbreak, a detection accuracy of 57% was obtained. Still in the initial phase of the Pst outbreak with 2 to 4% of Pst disease spreading, detection accuracy with 76% could be attained. With further optimizations, the image classifier could be implemented in embedded systems and deployed on drones, vehicles or scanning systems for fast mapping of Pst outbreaks.
Design Issues in the Implementation of MPI2 One Sided Communication in Ethernet based Networks
(2007)
In current research, one sided communication of the MPI2 standard is pushed as a promising technique [6, 7, 10, 18]. But measurements of applications and MPI2 primitives show a different picture [17]. In this paper we analyze de sign issues of MPI2 one sided communication and its im plementations. We focus on asynchronous communication for parallel applications in Ethernet cluster environments. Further, one sided communication is compared to two sided communication. This paper will prove that the key problem to performance is not only the implementation of MPI2 one sided communication - it is the design.
Emotions are a central element of human experience. They occur with high frequency in everyday life and play an important role in decision making. However, currently there is no consensus among researchers on what constitutes an emotion and on how emotions should be investigated. This dissertation identifies three problems of current emotion research: the problem of ground truth, the problem of incomplete constructs and the problem of optimal representation. I argue for a focus on the detailed measurement of emotion manifestations with computer-aided methods to solve these problems. This approach is demonstrated in three research projects, which describe the development of methods specific to these problems as well as their application to concrete research questions.
The problem of ground truth describes the practice to presuppose a certain structure of emotions as the a priori ground truth. This determines the range of emotion descriptions and sets a standard for the correct assignment of these descriptions. The first project illustrates how this problem can be circumvented with a multidimensional emotion perception paradigm which stands in contrast to the emotion recognition paradigm typically employed in emotion research. This paradigm allows to calculate an objective difficulty measure and to collect subjective difficulty ratings for the perception of emotional stimuli. Moreover, it enables the use of an arbitrary number of emotion stimuli categories as compared to the commonly used six basic emotion categories. Accordingly, we collected data from 441 participants using dynamic facial expression stimuli from 40 emotion categories. Our findings suggest an increase in emotion perception difficulty with increasing actor age and provide evidence to suggest that young adults, the elderly and men underestimate their emotion perception difficulty. While these effects were predicted from the literature, we also found unexpected and novel results. In particular, the increased difficulty on the objective difficulty measure for female actors and observers stood in contrast to reported findings. Exploratory analyses revealed low relevance of person-specific variables for the prediction of emotion perception difficulty, but highlighted the importance of a general pleasure dimension for the ease of emotion perception.
The second project targets the problem of incomplete constructs which relates to vaguely defined psychological constructs on emotion with insufficient ties to tangible manifestations. The project exemplifies how a modern data collection method such as face tracking data can be used to sharpen these constructs on the example of arousal, a long-standing but fuzzy construct in emotion research. It describes how measures of distance, speed and magnitude of acceleration can be computed from face tracking data and investigates their intercorrelations. We find moderate to strong correlations among all measures of static information on one hand and all measures of dynamic information on the other. The project then investigates how self-rated arousal is tied to these measures in 401 neurotypical individuals and 19 individuals with autism. Distance to the neutral face was predictive of arousal ratings in both groups. Lower mean arousal ratings were found for the autistic group, but no difference in correlation of the measures and arousal ratings could be found between groups. Results were replicated in a high autistic traits group consisting of 41 participants. The findings suggest a qualitatively similar perception of arousal for individuals with and without autism. No correlations between valence ratings and any of the measures could be found which emphasizes the specificity of our tested measures for the construct of arousal.
The problem of optimal representation refers to the search for the best representation of emotions and the assumption that there is a one-fits-all solution. In the third project we introduce partial least squares analysis as a general method to find an optimal representation to relate two high-dimensional data sets to each other. The project demonstrates its applicability to emotion research on the question of emotion perception differences between men and women. The method was used with emotion rating data from 441 participants and face tracking data computed on 306 videos. We found quantitative as well as qualitative differences in the perception of emotional facial expressions between these groups. We showed that women’s emotional perception systematically captured more of the variance in facial expressions. Additionally, we could show that significant differences exist in the way that women and men perceive some facial expressions which could be visualized as concrete facial expression sequences. These expressions suggest differing perceptions of masked and ambiguous facial expressions between the sexes. In order to facilitate use of the developed method by the research community, a package for the statistical environment R was written. Furthermore, to call attention to the method and its usefulness for emotion research, a website was designed that allows users to explore a model of emotion ratings and facial expression data in an interactive fashion.
Arousal is one of the dimensions of core affect and frequently used to describe experienced or observed emotional states. While arousal ratings of facial expressions are collected in many studies it is not well understood how arousal is displayed in or interpreted from facial expressions. In the context of socioemotional disorders such as Autism Spectrum Disorder, this poses the question of a differential use of facial information for arousal perception. In this study, we demonstrate how automated face-tracking tools can be used to extract predictors of arousal judgments. We find moderate to strong correlations among all measures of static information on one hand and all measures of dynamic information on the other. Based on these results, we tested two measures, average distance to the neutral face and average facial movement speed, within and between neurotypical individuals (N = 401) and individuals with autism (N = 19). Distance to the neutral face was predictive of arousal in both groups. Lower mean arousal ratings were found for the autistic group, but no difference in correlation of the measures and arousal ratings could be found between groups. Results were replicated in an high autistic traits group. The findings suggest a qualitatively similar perception of arousal for individuals with and without autism. No correlations between valence ratings and any of the measures could be found, emphasizing the specificity of our tested measures. Distance and speed predictors share variability and thus speed should not be discarded as a predictor of arousal ratings.
Advances in biotechnologies rapidly increase the number of molecules of a cell which can be observed simultaneously. This includes expression levels of thousands or ten-thousands of genes as well as concentration levels of metabolites or proteins. Such Profile data, observed at different times or at different experimental conditions (e.g., heat or dry stress), show how the biological experiment is reflected on the molecular level. This information is helpful to understand the molecular behaviour and to identify molecules or combination of molecules that characterise specific biological condition (e.g., disease). This work shows the potentials of component extraction algorithms to identify the major factors which influenced the observed data. This can be the expected experimental factors such as the time or temperature as well as unexpected factors such as technical artefacts or even unknown biological behaviour. Extracting components means to reduce the very high-dimensional data to a small set of new variables termed components. Each component is a combination of all original variables. The classical approach for that purpose is the principal component analysis (PCA). It is shown that, in contrast to PCA which maximises the variance only, modern approaches such as independent component analysis (ICA) are more suitable for analysing molecular data. The condition of independence between components of ICA fits more naturally our assumption of individual (independent) factors which influence the data. This higher potential of ICA is demonstrated by a crossing experiment of the model plant Arabidopsis thaliana (Thale Cress). The experimental factors could be well identified and, in addition, ICA could even detect a technical artefact. However, in continuously observations such as in time experiments, the data show, in general, a nonlinear distribution. To analyse such nonlinear data, a nonlinear extension of PCA is used. This nonlinear PCA (NLPCA) is based on a neural network algorithm. The algorithm is adapted to be applicable to incomplete molecular data sets. Thus, it provides also the ability to estimate the missing data. The potential of nonlinear PCA to identify nonlinear factors is demonstrated by a cold stress experiment of Arabidopsis thaliana. The results of component analysis can be used to build a molecular network model. Since it includes functional dependencies it is termed functional network. Applied to the cold stress data, it is shown that functional networks are appropriate to visualise biological processes and thereby reveals molecular dynamics.
Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values. Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured data. Application: We applied this technique to a time course of metabolite data from a cold stress experiment on the model plant Arabidopsis thaliana, and could approximate the mapping function from any time point to the metabolite responses. Thus, the inverse NLPCA provides greatly improved information for better understanding the complex response to cold stress
Reliable and robust data processing is one of the hardest requirements for systems in fields such as medicine, security, automotive, aviation, and space, to prevent critical system failures caused by changes in operating or environmental conditions. In particular, Signal Integrity (SI) effects such as crosstalk may distort the signal information in sensitive mixed-signal designs. A challenge for hardware systems used in the space are radiation effects. Namely, Single Event Effects (SEEs) induced by high-energy particle hits may lead to faulty computation, corrupted configuration settings, undesired system behavior, or even total malfunction.
Since these applications require an extra effort in design and implementation, it is beneficial to master the standard cell design process and corresponding design flow methodologies optimized for such challenges. Especially for reliable, low-noise differential signaling logic such as Current Mode Logic (CML), a digital design flow is an orthogonal approach compared to traditional manual design. As a consequence, mandatory preliminary considerations need to be addressed in more detail. First of all, standard cell library concepts with suitable cell extensions for reliable systems and robust space applications have to be elaborated. Resulting design concepts at the cell level should enable the logical synthesis for differential logic design or improve the radiation-hardness. In parallel, the main objectives of the proposed cell architectures are to reduce the occupied area, power, and delay overhead. Second, a special setup for standard cell characterization is additionally required for a proper and accurate logic gate modeling. Last but not least, design methodologies for mandatory design flow stages such as logic synthesis and place and route need to be developed for the respective hardware systems to keep the reliability or the radiation-hardness at an acceptable level.
This Thesis proposes and investigates standard cell-based design methodologies and techniques for reliable and robust hardware systems implemented in a conventional semi-conductor technology. The focus of this work is on reliable differential logic design and robust radiation-hardening-by-design circuits. The synergistic connections of the digital design flow stages are systematically addressed for these two types of hardware systems. In more detail, a library for differential logic is extended with single-ended pseudo-gates for intermediate design steps to support the logic synthesis and layout generation with commercial Computer-Aided Design (CAD) tools. Special cell layouts are proposed to relax signal routing. A library set for space applications is similarly extended by novel Radiation-Hardening-by-Design (RHBD) Triple Modular Redundancy (TMR) cells, enabling a one fault correction. Therein, additional optimized architectures for glitch filter cells, robust scannable and self-correcting flip-flops, and clock-gates are proposed. The circuit concepts and the physical layout representation views of the differential logic gates and the RHBD cells are discussed. However, the quality of results of designs depends implicitly on the accuracy of the standard cell characterization which is examined for both types therefore. The entire design flow is elaborated from the hardware design description to the layout representations. A 2-Phase routing approach together with an intermediate design conversion step is proposed after the initial place and route stage for reliable, pure differential designs, whereas a special constraining for RHBD applications in a standard technology is presented.
The digital design flow for differential logic design is successfully demonstrated on a reliable differential bipolar CML application. A balanced routing result of its differential signal pairs is obtained by the proposed 2-Phase-routing approach. Moreover, the elaborated standard cell concepts and design methodology for RHBD circuits are applied to the digital part of a 7.5-15.5 MSPS 14-bit Analog-to-Digital Converter (ADC) and a complex microcontroller architecture. The ADC is implemented in an unhardened standard semiconductor technology and successfully verified by electrical measurements. The overhead of the proposed hardening approach is additionally evaluated by design exploration of the microcontroller application. Furthermore, the first obtained related measurement results of novel RHBD-∆TMR flip-flops show a radiation-tolerance up to a threshold Linear Energy Transfer (LET) of 46.1, 52.0, and 62.5 MeV cm2 mg-1 and savings in silicon area of 25-50 % for selected TMR standard cell candidates.
As a conclusion, the presented design concepts at the cell and library levels, as well as the design flow modifications are adaptable and transferable to other technology nodes. In particular, the design of hybrid solutions with integrated reliable differential logic modules together with robust radiation-tolerant circuit parts is enabled by the standard cell concepts and design methods proposed in this work.
Use of a standard non-rad-hard digital cell library in the rad-hard design can be a cost-effective solution for space applications. In this paper we demonstrate how a standard non-rad-hard flip-flop, as one of the most vulnerable digital cells, can be converted into a rad-hard flip-flop without modifying its internal structure. We present five variants of a Triple Modular Redundancy (TMR) flip-flop: baseline TMR flip-flop, latch-based TMR flip-flop, True-Single Phase Clock (TSPC) TMR flip-flop, scannable TMR flip-flop and self-correcting TMR flipflop. For all variants, the multi-bit upsets have been addressed by applying special placement constraints, while the Single Event Transient (SET) mitigation was achieved through the usage of customized SET filters and selection of optimal inverter sizes for the clock and reset trees. The proposed flip-flop variants feature differing performance, thus enabling to choose the optimal solution for every sensitive node in the circuit, according to the predefined design constraints. Several flip-flop designs have been validated on IHP's 130nm BiCMOS process, by irradiation of custom-designed shift registers. It has been shown that the proposed TMR flip-flops are robust to soft errors with a threshold Linear Energy Transfer (LET) from (32.4 MeV.cm(2)/mg) to (62.5 MeV.cm(2)/mg), depending on the variant.
Analyses of metagenomes in life sciences present new opportunities as well as challenges to the scientific community and call for advanced computational methods and workflows. The large amount of data collected from samples via next-generation sequencing (NGS) technologies render manual approaches to sequence comparison and annotation unsuitable. Rather, fast and efficient computational pipelines are needed to provide comprehensive statistics and summaries and enable the researcher to choose appropriate tools for more specific analyses. The workflow presented here builds upon previous pipelines designed for automated clustering and annotation of raw sequence reads obtained from next-generation sequencing technologies such as 454 and Illumina. Employing specialized algorithms, the sequence reads are processed at three different levels. First, raw reads are clustered at high similarity cutoff to yield clusters which can be exported as multifasta files for further analyses. Independently, open reading frames (ORFs) are predicted from raw reads and clustered at two strictness levels to yield sets of non-redundant sequences and ORF families. Furthermore, single ORFs are annotated by performing searches against the Pfam database
Manufacturing industries are undergoing a major paradigm shift towards more autonomy. Automated planning and scheduling then becomes a necessity. The Planning and Execution Competition for Logistics Robots in Simulation held at ICAPS is based on this scenario and provides an interesting testbed. However, the posed problem is challenging as also demonstrated by the somewhat weak results in 2017. The domain requires temporal reasoning and dealing with uncertainty. We propose a novel planning system based on Answer Set Programming and the Clingo solver to tackle these problems and incentivize robot cooperation. Our results show a significant performance improvement, both, in terms of lowering computational requirements and better game metrics.
With the jABC it is possible to realize workflows for numerous questions in different fields. The goal of this project was to create a workflow for the identification of differentially expressed genes. This is of special interest in biology, for it gives the opportunity to get a better insight in cellular changes due to exogenous stress, diseases and so on. With the knowledge that can be derived from the differentially expressed genes in diseased tissues, it becomes possible to find new targets for treatment.
Nowadays, model-driven engineering (MDE) promises to ease software development by decreasing the inherent complexity of classical software development. In order to deliver on this promise, MDE increases the level of abstraction and automation, through a consideration of domain-specific models (DSMs) and model operations (e.g. model transformations or code generations). DSMs conform to domain-specific modeling languages (DSMLs), which increase the level of abstraction, and model operations are first-class entities of software development because they increase the level of automation. Nevertheless, MDE has to deal with at least two new dimensions of complexity, which are basically caused by the increased linguistic and technological heterogeneity. The first dimension of complexity is setting up an MDE environment, an activity comprised of the implementation or selection of DSMLs and model operations. Setting up an MDE environment is both time-consuming and error-prone because of the implementation or adaptation of model operations. The second dimension of complexity is concerned with applying MDE for actual software development. Applying MDE is challenging because a collection of DSMs, which conform to potentially heterogeneous DSMLs, are required to completely specify a complex software system. A single DSML can only be used to describe a specific aspect of a software system at a certain level of abstraction and from a certain perspective. Additionally, DSMs are usually not independent but instead have inherent interdependencies, reflecting (partial) similar aspects of a software system at different levels of abstraction or from different perspectives. A subset of these dependencies are applications of various model operations, which are necessary to keep the degree of automation high. This becomes even worse when addressing the first dimension of complexity. Due to continuous changes, all kinds of dependencies, including the applications of model operations, must also be managed continuously. This comprises maintaining the existence of these dependencies and the appropriate (re-)application of model operations. The contribution of this thesis is an approach that combines traceability and model management to address the aforementioned challenges of configuring and applying MDE for software development. The approach is considered as a traceability approach because it supports capturing and automatically maintaining dependencies between DSMs. The approach is considered as a model management approach because it supports managing the automated (re-)application of heterogeneous model operations. In addition, the approach is considered as a comprehensive model management. Since the decomposition of model operations is encouraged to alleviate the first dimension of complexity, the subsequent composition of model operations is required to counteract their fragmentation. A significant portion of this thesis concerns itself with providing a method for the specification of decoupled yet still highly cohesive complex compositions of heterogeneous model operations. The approach supports two different kinds of compositions - data-flow compositions and context compositions. Data-flow composition is used to define a network of heterogeneous model operations coupled by sharing input and output DSMs alone. Context composition is related to a concept used in declarative model transformation approaches to compose individual model transformation rules (units) at any level of detail. In this thesis, context composition provides the ability to use a collection of dependencies as context for the composition of other dependencies, including model operations. In addition, the actual implementation of model operations, which are going to be composed, do not need to implement any composition concerns. The approach is realized by means of a formalism called an executable and dynamic hierarchical megamodel, based on the original idea of megamodels. This formalism supports specifying compositions of dependencies (traceability and model operations). On top of this formalism, traceability is realized by means of a localization concept, and model management by means of an execution concept.
Geospatial data has become a natural part of a growing number of information systems and services in the economy, society, and people's personal lives. In particular, virtual 3D city and landscape models constitute valuable information sources within a wide variety of applications such as urban planning, navigation, tourist information, and disaster management. Today, these models are often visualized in detail to provide realistic imagery. However, a photorealistic rendering does not automatically lead to high image quality, with respect to an effective information transfer, which requires important or prioritized information to be interactively highlighted in a context-dependent manner.
Approaches in non-photorealistic renderings particularly consider a user's task and camera perspective when attempting optimal expression, recognition, and communication of important or prioritized information. However, the design and implementation of non-photorealistic rendering techniques for 3D geospatial data pose a number of challenges, especially when inherently complex geometry, appearance, and thematic data must be processed interactively. Hence, a promising technical foundation is established by the programmable and parallel computing architecture of graphics processing units.
This thesis proposes non-photorealistic rendering techniques that enable both the computation and selection of the abstraction level of 3D geospatial model contents according to user interaction and dynamically changing thematic information. To achieve this goal, the techniques integrate with hardware-accelerated rendering pipelines using shader technologies of graphics processing units for real-time image synthesis. The techniques employ principles of artistic rendering, cartographic generalization, and 3D semiotics—unlike photorealistic rendering—to synthesize illustrative renditions of geospatial feature type entities such as water surfaces, buildings, and infrastructure networks. In addition, this thesis contributes a generic system that enables to integrate different graphic styles—photorealistic and non-photorealistic—and provide their seamless transition according to user tasks, camera view, and image resolution.
Evaluations of the proposed techniques have demonstrated their significance to the field of geospatial information visualization including topics such as spatial perception, cognition, and mapping. In addition, the applications in illustrative and focus+context visualization have reflected their potential impact on optimizing the information transfer regarding factors such as cognitive load, integration of non-realistic information, visualization of uncertainty, and visualization on small displays.
Geometric generalization is a fundamental concept in the digital mapping process. An increasing amount of spatial data is provided on the web as well as a range of tools to process it. This jABC workflow is used for the automatic testing of web-based generalization services like mapshaper.org by executing its functionality, overlaying both datasets before and after the transformation and displaying them visually in a .tif file. Mostly Web Services and command line tools are used to build an environment where ESRI shapefiles can be uploaded, processed through a chosen generalization service and finally visualized in Irfanview.
The objective of this thesis is to provide new space compaction techniques for testing or concurrent checking of digital circuits. In particular, the work focuses on the design of space compactors that achieve high compaction ratio and minimal loss of testability of the circuits. In the first part, the compactors are designed for combinational circuits based on the knowledge of the circuit structure. Several algorithms for analyzing circuit structures are introduced and discussed for the first time. The complexity of each design procedure is linear with respect to the number of gates of the circuit. Thus, the procedures are applicable to large circuits. In the second part, the first structural approach for output compaction for sequential circuits is introduced. Essentially, it enhances the first part. For the approach introduced in the third part it is assumed that the structure of the circuit and the underlying fault model are unknown. The space compaction approach requires only the knowledge of the fault-free test responses for a precomputed test set. The proposed compactor design guarantees zero-aliasing with respect to the precomputed test set.
Non-stationarities are ubiquitous in EEG signals. They are especially apparent in the use of EEG-based brain- computer interfaces (BCIs): (a) in the differences between the initial calibration measurement and the online operation of a BCI, or (b) caused by changes in the subject's brain processes during an experiment (e.g. due to fatigue, change of task involvement, etc). In this paper, we quantify for the first time such systematic evidence of statistical differences in data recorded during offline and online sessions. Furthermore, we propose novel techniques of investigating and visualizing data distributions, which are particularly useful for the analysis of (non-) stationarities. Our study shows that the brain signals used for control can change substantially from the offline calibration sessions to online control, and also within a single session. In addition to this general characterization of the signals, we propose several adaptive classification schemes and study their performance on data recorded during online experiments. An encouraging result of our study is that surprisingly simple adaptive methods in combination with an offline feature selection scheme can significantly increase BCI performance
Testability evaluation of sequential designs incorporating the multi-mode scannable memory element
(1999)
Business process models are used within a range of organizational initiatives, where every stakeholder has a unique perspective on a process and demands the respective model. As a consequence, multiple process models capturing the very same business process coexist. Keeping such models in sync is a challenge within an ever changing business environment: once a process is changed, all its models have to be updated. Due to a large number of models and their complex relations, model maintenance becomes error-prone and expensive. Against this background, business process model abstraction emerged as an operation reducing the number of stored process models and facilitating model management. Business process model abstraction is an operation preserving essential process properties and leaving out insignificant details in order to retain information relevant for a particular purpose. Process model abstraction has been addressed by several researchers. The focus of their studies has been on particular use cases and model transformations supporting these use cases. This thesis systematically approaches the problem of business process model abstraction shaping the outcome into a framework. We investigate the current industry demand in abstraction summarizing it in a catalog of business process model abstraction use cases. The thesis focuses on one prominent use case where the user demands a model with coarse-grained activities and overall process ordering constraints. We develop model transformations that support this use case starting with the transformations based on process model structure analysis. Further, abstraction methods considering the semantics of process model elements are investigated. First, we suggest how semantically related activities can be discovered in process models-a barely researched challenge. The thesis validates the designed abstraction methods against sets of industrial process models and discusses the method implementation aspects. Second, we develop a novel model transformation, which combined with the related activity discovery allows flexible non-hierarchical abstraction. In this way this thesis advocates novel model transformations that facilitate business process model management and provides the foundations for innovative tool support.