Refine
Year of publication
Document Type
- Article (573)
- Doctoral Thesis (129)
- Monograph/Edited Volume (82)
- Other (26)
- Postprint (10)
- Conference Proceeding (8)
- Preprint (5)
- Part of a Book (4)
- Master's Thesis (3)
- Bachelor Thesis (1)
Language
- English (842) (remove)
Keywords
- answer set programming (12)
- Answer Set Programming (10)
- Answer set programming (10)
- Machine Learning (7)
- Maschinelles Lernen (7)
- Antwortmengenprogrammierung (5)
- Computer Science Education (5)
- Internet of Things (4)
- MQTT (4)
- Optimization (4)
Institute
- Institut für Informatik und Computational Science (842) (remove)
Design Issues in the Implementation of MPI2 One Sided Communication in Ethernet based Networks
(2007)
In current research, one sided communication of the MPI2 standard is pushed as a promising technique [6, 7, 10, 18]. But measurements of applications and MPI2 primitives show a different picture [17]. In this paper we analyze de sign issues of MPI2 one sided communication and its im plementations. We focus on asynchronous communication for parallel applications in Ethernet cluster environments. Further, one sided communication is compared to two sided communication. This paper will prove that the key problem to performance is not only the implementation of MPI2 one sided communication - it is the design.
Emotions are a central element of human experience. They occur with high frequency in everyday life and play an important role in decision making. However, currently there is no consensus among researchers on what constitutes an emotion and on how emotions should be investigated. This dissertation identifies three problems of current emotion research: the problem of ground truth, the problem of incomplete constructs and the problem of optimal representation. I argue for a focus on the detailed measurement of emotion manifestations with computer-aided methods to solve these problems. This approach is demonstrated in three research projects, which describe the development of methods specific to these problems as well as their application to concrete research questions.
The problem of ground truth describes the practice to presuppose a certain structure of emotions as the a priori ground truth. This determines the range of emotion descriptions and sets a standard for the correct assignment of these descriptions. The first project illustrates how this problem can be circumvented with a multidimensional emotion perception paradigm which stands in contrast to the emotion recognition paradigm typically employed in emotion research. This paradigm allows to calculate an objective difficulty measure and to collect subjective difficulty ratings for the perception of emotional stimuli. Moreover, it enables the use of an arbitrary number of emotion stimuli categories as compared to the commonly used six basic emotion categories. Accordingly, we collected data from 441 participants using dynamic facial expression stimuli from 40 emotion categories. Our findings suggest an increase in emotion perception difficulty with increasing actor age and provide evidence to suggest that young adults, the elderly and men underestimate their emotion perception difficulty. While these effects were predicted from the literature, we also found unexpected and novel results. In particular, the increased difficulty on the objective difficulty measure for female actors and observers stood in contrast to reported findings. Exploratory analyses revealed low relevance of person-specific variables for the prediction of emotion perception difficulty, but highlighted the importance of a general pleasure dimension for the ease of emotion perception.
The second project targets the problem of incomplete constructs which relates to vaguely defined psychological constructs on emotion with insufficient ties to tangible manifestations. The project exemplifies how a modern data collection method such as face tracking data can be used to sharpen these constructs on the example of arousal, a long-standing but fuzzy construct in emotion research. It describes how measures of distance, speed and magnitude of acceleration can be computed from face tracking data and investigates their intercorrelations. We find moderate to strong correlations among all measures of static information on one hand and all measures of dynamic information on the other. The project then investigates how self-rated arousal is tied to these measures in 401 neurotypical individuals and 19 individuals with autism. Distance to the neutral face was predictive of arousal ratings in both groups. Lower mean arousal ratings were found for the autistic group, but no difference in correlation of the measures and arousal ratings could be found between groups. Results were replicated in a high autistic traits group consisting of 41 participants. The findings suggest a qualitatively similar perception of arousal for individuals with and without autism. No correlations between valence ratings and any of the measures could be found which emphasizes the specificity of our tested measures for the construct of arousal.
The problem of optimal representation refers to the search for the best representation of emotions and the assumption that there is a one-fits-all solution. In the third project we introduce partial least squares analysis as a general method to find an optimal representation to relate two high-dimensional data sets to each other. The project demonstrates its applicability to emotion research on the question of emotion perception differences between men and women. The method was used with emotion rating data from 441 participants and face tracking data computed on 306 videos. We found quantitative as well as qualitative differences in the perception of emotional facial expressions between these groups. We showed that women’s emotional perception systematically captured more of the variance in facial expressions. Additionally, we could show that significant differences exist in the way that women and men perceive some facial expressions which could be visualized as concrete facial expression sequences. These expressions suggest differing perceptions of masked and ambiguous facial expressions between the sexes. In order to facilitate use of the developed method by the research community, a package for the statistical environment R was written. Furthermore, to call attention to the method and its usefulness for emotion research, a website was designed that allows users to explore a model of emotion ratings and facial expression data in an interactive fashion.
Arousal is one of the dimensions of core affect and frequently used to describe experienced or observed emotional states. While arousal ratings of facial expressions are collected in many studies it is not well understood how arousal is displayed in or interpreted from facial expressions. In the context of socioemotional disorders such as Autism Spectrum Disorder, this poses the question of a differential use of facial information for arousal perception. In this study, we demonstrate how automated face-tracking tools can be used to extract predictors of arousal judgments. We find moderate to strong correlations among all measures of static information on one hand and all measures of dynamic information on the other. Based on these results, we tested two measures, average distance to the neutral face and average facial movement speed, within and between neurotypical individuals (N = 401) and individuals with autism (N = 19). Distance to the neutral face was predictive of arousal in both groups. Lower mean arousal ratings were found for the autistic group, but no difference in correlation of the measures and arousal ratings could be found between groups. Results were replicated in an high autistic traits group. The findings suggest a qualitatively similar perception of arousal for individuals with and without autism. No correlations between valence ratings and any of the measures could be found, emphasizing the specificity of our tested measures. Distance and speed predictors share variability and thus speed should not be discarded as a predictor of arousal ratings.
Advances in biotechnologies rapidly increase the number of molecules of a cell which can be observed simultaneously. This includes expression levels of thousands or ten-thousands of genes as well as concentration levels of metabolites or proteins. Such Profile data, observed at different times or at different experimental conditions (e.g., heat or dry stress), show how the biological experiment is reflected on the molecular level. This information is helpful to understand the molecular behaviour and to identify molecules or combination of molecules that characterise specific biological condition (e.g., disease). This work shows the potentials of component extraction algorithms to identify the major factors which influenced the observed data. This can be the expected experimental factors such as the time or temperature as well as unexpected factors such as technical artefacts or even unknown biological behaviour. Extracting components means to reduce the very high-dimensional data to a small set of new variables termed components. Each component is a combination of all original variables. The classical approach for that purpose is the principal component analysis (PCA). It is shown that, in contrast to PCA which maximises the variance only, modern approaches such as independent component analysis (ICA) are more suitable for analysing molecular data. The condition of independence between components of ICA fits more naturally our assumption of individual (independent) factors which influence the data. This higher potential of ICA is demonstrated by a crossing experiment of the model plant Arabidopsis thaliana (Thale Cress). The experimental factors could be well identified and, in addition, ICA could even detect a technical artefact. However, in continuously observations such as in time experiments, the data show, in general, a nonlinear distribution. To analyse such nonlinear data, a nonlinear extension of PCA is used. This nonlinear PCA (NLPCA) is based on a neural network algorithm. The algorithm is adapted to be applicable to incomplete molecular data sets. Thus, it provides also the ability to estimate the missing data. The potential of nonlinear PCA to identify nonlinear factors is demonstrated by a cold stress experiment of Arabidopsis thaliana. The results of component analysis can be used to build a molecular network model. Since it includes functional dependencies it is termed functional network. Applied to the cold stress data, it is shown that functional networks are appropriate to visualise biological processes and thereby reveals molecular dynamics.
Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values. Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured data. Application: We applied this technique to a time course of metabolite data from a cold stress experiment on the model plant Arabidopsis thaliana, and could approximate the mapping function from any time point to the metabolite responses. Thus, the inverse NLPCA provides greatly improved information for better understanding the complex response to cold stress
Reliable and robust data processing is one of the hardest requirements for systems in fields such as medicine, security, automotive, aviation, and space, to prevent critical system failures caused by changes in operating or environmental conditions. In particular, Signal Integrity (SI) effects such as crosstalk may distort the signal information in sensitive mixed-signal designs. A challenge for hardware systems used in the space are radiation effects. Namely, Single Event Effects (SEEs) induced by high-energy particle hits may lead to faulty computation, corrupted configuration settings, undesired system behavior, or even total malfunction.
Since these applications require an extra effort in design and implementation, it is beneficial to master the standard cell design process and corresponding design flow methodologies optimized for such challenges. Especially for reliable, low-noise differential signaling logic such as Current Mode Logic (CML), a digital design flow is an orthogonal approach compared to traditional manual design. As a consequence, mandatory preliminary considerations need to be addressed in more detail. First of all, standard cell library concepts with suitable cell extensions for reliable systems and robust space applications have to be elaborated. Resulting design concepts at the cell level should enable the logical synthesis for differential logic design or improve the radiation-hardness. In parallel, the main objectives of the proposed cell architectures are to reduce the occupied area, power, and delay overhead. Second, a special setup for standard cell characterization is additionally required for a proper and accurate logic gate modeling. Last but not least, design methodologies for mandatory design flow stages such as logic synthesis and place and route need to be developed for the respective hardware systems to keep the reliability or the radiation-hardness at an acceptable level.
This Thesis proposes and investigates standard cell-based design methodologies and techniques for reliable and robust hardware systems implemented in a conventional semi-conductor technology. The focus of this work is on reliable differential logic design and robust radiation-hardening-by-design circuits. The synergistic connections of the digital design flow stages are systematically addressed for these two types of hardware systems. In more detail, a library for differential logic is extended with single-ended pseudo-gates for intermediate design steps to support the logic synthesis and layout generation with commercial Computer-Aided Design (CAD) tools. Special cell layouts are proposed to relax signal routing. A library set for space applications is similarly extended by novel Radiation-Hardening-by-Design (RHBD) Triple Modular Redundancy (TMR) cells, enabling a one fault correction. Therein, additional optimized architectures for glitch filter cells, robust scannable and self-correcting flip-flops, and clock-gates are proposed. The circuit concepts and the physical layout representation views of the differential logic gates and the RHBD cells are discussed. However, the quality of results of designs depends implicitly on the accuracy of the standard cell characterization which is examined for both types therefore. The entire design flow is elaborated from the hardware design description to the layout representations. A 2-Phase routing approach together with an intermediate design conversion step is proposed after the initial place and route stage for reliable, pure differential designs, whereas a special constraining for RHBD applications in a standard technology is presented.
The digital design flow for differential logic design is successfully demonstrated on a reliable differential bipolar CML application. A balanced routing result of its differential signal pairs is obtained by the proposed 2-Phase-routing approach. Moreover, the elaborated standard cell concepts and design methodology for RHBD circuits are applied to the digital part of a 7.5-15.5 MSPS 14-bit Analog-to-Digital Converter (ADC) and a complex microcontroller architecture. The ADC is implemented in an unhardened standard semiconductor technology and successfully verified by electrical measurements. The overhead of the proposed hardening approach is additionally evaluated by design exploration of the microcontroller application. Furthermore, the first obtained related measurement results of novel RHBD-∆TMR flip-flops show a radiation-tolerance up to a threshold Linear Energy Transfer (LET) of 46.1, 52.0, and 62.5 MeV cm2 mg-1 and savings in silicon area of 25-50 % for selected TMR standard cell candidates.
As a conclusion, the presented design concepts at the cell and library levels, as well as the design flow modifications are adaptable and transferable to other technology nodes. In particular, the design of hybrid solutions with integrated reliable differential logic modules together with robust radiation-tolerant circuit parts is enabled by the standard cell concepts and design methods proposed in this work.
Use of a standard non-rad-hard digital cell library in the rad-hard design can be a cost-effective solution for space applications. In this paper we demonstrate how a standard non-rad-hard flip-flop, as one of the most vulnerable digital cells, can be converted into a rad-hard flip-flop without modifying its internal structure. We present five variants of a Triple Modular Redundancy (TMR) flip-flop: baseline TMR flip-flop, latch-based TMR flip-flop, True-Single Phase Clock (TSPC) TMR flip-flop, scannable TMR flip-flop and self-correcting TMR flipflop. For all variants, the multi-bit upsets have been addressed by applying special placement constraints, while the Single Event Transient (SET) mitigation was achieved through the usage of customized SET filters and selection of optimal inverter sizes for the clock and reset trees. The proposed flip-flop variants feature differing performance, thus enabling to choose the optimal solution for every sensitive node in the circuit, according to the predefined design constraints. Several flip-flop designs have been validated on IHP's 130nm BiCMOS process, by irradiation of custom-designed shift registers. It has been shown that the proposed TMR flip-flops are robust to soft errors with a threshold Linear Energy Transfer (LET) from (32.4 MeV.cm(2)/mg) to (62.5 MeV.cm(2)/mg), depending on the variant.
Analyses of metagenomes in life sciences present new opportunities as well as challenges to the scientific community and call for advanced computational methods and workflows. The large amount of data collected from samples via next-generation sequencing (NGS) technologies render manual approaches to sequence comparison and annotation unsuitable. Rather, fast and efficient computational pipelines are needed to provide comprehensive statistics and summaries and enable the researcher to choose appropriate tools for more specific analyses. The workflow presented here builds upon previous pipelines designed for automated clustering and annotation of raw sequence reads obtained from next-generation sequencing technologies such as 454 and Illumina. Employing specialized algorithms, the sequence reads are processed at three different levels. First, raw reads are clustered at high similarity cutoff to yield clusters which can be exported as multifasta files for further analyses. Independently, open reading frames (ORFs) are predicted from raw reads and clustered at two strictness levels to yield sets of non-redundant sequences and ORF families. Furthermore, single ORFs are annotated by performing searches against the Pfam database
Manufacturing industries are undergoing a major paradigm shift towards more autonomy. Automated planning and scheduling then becomes a necessity. The Planning and Execution Competition for Logistics Robots in Simulation held at ICAPS is based on this scenario and provides an interesting testbed. However, the posed problem is challenging as also demonstrated by the somewhat weak results in 2017. The domain requires temporal reasoning and dealing with uncertainty. We propose a novel planning system based on Answer Set Programming and the Clingo solver to tackle these problems and incentivize robot cooperation. Our results show a significant performance improvement, both, in terms of lowering computational requirements and better game metrics.
With the jABC it is possible to realize workflows for numerous questions in different fields. The goal of this project was to create a workflow for the identification of differentially expressed genes. This is of special interest in biology, for it gives the opportunity to get a better insight in cellular changes due to exogenous stress, diseases and so on. With the knowledge that can be derived from the differentially expressed genes in diseased tissues, it becomes possible to find new targets for treatment.
Nowadays, model-driven engineering (MDE) promises to ease software development by decreasing the inherent complexity of classical software development. In order to deliver on this promise, MDE increases the level of abstraction and automation, through a consideration of domain-specific models (DSMs) and model operations (e.g. model transformations or code generations). DSMs conform to domain-specific modeling languages (DSMLs), which increase the level of abstraction, and model operations are first-class entities of software development because they increase the level of automation. Nevertheless, MDE has to deal with at least two new dimensions of complexity, which are basically caused by the increased linguistic and technological heterogeneity. The first dimension of complexity is setting up an MDE environment, an activity comprised of the implementation or selection of DSMLs and model operations. Setting up an MDE environment is both time-consuming and error-prone because of the implementation or adaptation of model operations. The second dimension of complexity is concerned with applying MDE for actual software development. Applying MDE is challenging because a collection of DSMs, which conform to potentially heterogeneous DSMLs, are required to completely specify a complex software system. A single DSML can only be used to describe a specific aspect of a software system at a certain level of abstraction and from a certain perspective. Additionally, DSMs are usually not independent but instead have inherent interdependencies, reflecting (partial) similar aspects of a software system at different levels of abstraction or from different perspectives. A subset of these dependencies are applications of various model operations, which are necessary to keep the degree of automation high. This becomes even worse when addressing the first dimension of complexity. Due to continuous changes, all kinds of dependencies, including the applications of model operations, must also be managed continuously. This comprises maintaining the existence of these dependencies and the appropriate (re-)application of model operations. The contribution of this thesis is an approach that combines traceability and model management to address the aforementioned challenges of configuring and applying MDE for software development. The approach is considered as a traceability approach because it supports capturing and automatically maintaining dependencies between DSMs. The approach is considered as a model management approach because it supports managing the automated (re-)application of heterogeneous model operations. In addition, the approach is considered as a comprehensive model management. Since the decomposition of model operations is encouraged to alleviate the first dimension of complexity, the subsequent composition of model operations is required to counteract their fragmentation. A significant portion of this thesis concerns itself with providing a method for the specification of decoupled yet still highly cohesive complex compositions of heterogeneous model operations. The approach supports two different kinds of compositions - data-flow compositions and context compositions. Data-flow composition is used to define a network of heterogeneous model operations coupled by sharing input and output DSMs alone. Context composition is related to a concept used in declarative model transformation approaches to compose individual model transformation rules (units) at any level of detail. In this thesis, context composition provides the ability to use a collection of dependencies as context for the composition of other dependencies, including model operations. In addition, the actual implementation of model operations, which are going to be composed, do not need to implement any composition concerns. The approach is realized by means of a formalism called an executable and dynamic hierarchical megamodel, based on the original idea of megamodels. This formalism supports specifying compositions of dependencies (traceability and model operations). On top of this formalism, traceability is realized by means of a localization concept, and model management by means of an execution concept.
Geospatial data has become a natural part of a growing number of information systems and services in the economy, society, and people's personal lives. In particular, virtual 3D city and landscape models constitute valuable information sources within a wide variety of applications such as urban planning, navigation, tourist information, and disaster management. Today, these models are often visualized in detail to provide realistic imagery. However, a photorealistic rendering does not automatically lead to high image quality, with respect to an effective information transfer, which requires important or prioritized information to be interactively highlighted in a context-dependent manner.
Approaches in non-photorealistic renderings particularly consider a user's task and camera perspective when attempting optimal expression, recognition, and communication of important or prioritized information. However, the design and implementation of non-photorealistic rendering techniques for 3D geospatial data pose a number of challenges, especially when inherently complex geometry, appearance, and thematic data must be processed interactively. Hence, a promising technical foundation is established by the programmable and parallel computing architecture of graphics processing units.
This thesis proposes non-photorealistic rendering techniques that enable both the computation and selection of the abstraction level of 3D geospatial model contents according to user interaction and dynamically changing thematic information. To achieve this goal, the techniques integrate with hardware-accelerated rendering pipelines using shader technologies of graphics processing units for real-time image synthesis. The techniques employ principles of artistic rendering, cartographic generalization, and 3D semiotics—unlike photorealistic rendering—to synthesize illustrative renditions of geospatial feature type entities such as water surfaces, buildings, and infrastructure networks. In addition, this thesis contributes a generic system that enables to integrate different graphic styles—photorealistic and non-photorealistic—and provide their seamless transition according to user tasks, camera view, and image resolution.
Evaluations of the proposed techniques have demonstrated their significance to the field of geospatial information visualization including topics such as spatial perception, cognition, and mapping. In addition, the applications in illustrative and focus+context visualization have reflected their potential impact on optimizing the information transfer regarding factors such as cognitive load, integration of non-realistic information, visualization of uncertainty, and visualization on small displays.
Geometric generalization is a fundamental concept in the digital mapping process. An increasing amount of spatial data is provided on the web as well as a range of tools to process it. This jABC workflow is used for the automatic testing of web-based generalization services like mapshaper.org by executing its functionality, overlaying both datasets before and after the transformation and displaying them visually in a .tif file. Mostly Web Services and command line tools are used to build an environment where ESRI shapefiles can be uploaded, processed through a chosen generalization service and finally visualized in Irfanview.
The objective of this thesis is to provide new space compaction techniques for testing or concurrent checking of digital circuits. In particular, the work focuses on the design of space compactors that achieve high compaction ratio and minimal loss of testability of the circuits. In the first part, the compactors are designed for combinational circuits based on the knowledge of the circuit structure. Several algorithms for analyzing circuit structures are introduced and discussed for the first time. The complexity of each design procedure is linear with respect to the number of gates of the circuit. Thus, the procedures are applicable to large circuits. In the second part, the first structural approach for output compaction for sequential circuits is introduced. Essentially, it enhances the first part. For the approach introduced in the third part it is assumed that the structure of the circuit and the underlying fault model are unknown. The space compaction approach requires only the knowledge of the fault-free test responses for a precomputed test set. The proposed compactor design guarantees zero-aliasing with respect to the precomputed test set.
Non-stationarities are ubiquitous in EEG signals. They are especially apparent in the use of EEG-based brain- computer interfaces (BCIs): (a) in the differences between the initial calibration measurement and the online operation of a BCI, or (b) caused by changes in the subject's brain processes during an experiment (e.g. due to fatigue, change of task involvement, etc). In this paper, we quantify for the first time such systematic evidence of statistical differences in data recorded during offline and online sessions. Furthermore, we propose novel techniques of investigating and visualizing data distributions, which are particularly useful for the analysis of (non-) stationarities. Our study shows that the brain signals used for control can change substantially from the offline calibration sessions to online control, and also within a single session. In addition to this general characterization of the signals, we propose several adaptive classification schemes and study their performance on data recorded during online experiments. An encouraging result of our study is that surprisingly simple adaptive methods in combination with an offline feature selection scheme can significantly increase BCI performance
Testability evaluation of sequential designs incorporating the multi-mode scannable memory element
(1999)
Business process models are used within a range of organizational initiatives, where every stakeholder has a unique perspective on a process and demands the respective model. As a consequence, multiple process models capturing the very same business process coexist. Keeping such models in sync is a challenge within an ever changing business environment: once a process is changed, all its models have to be updated. Due to a large number of models and their complex relations, model maintenance becomes error-prone and expensive. Against this background, business process model abstraction emerged as an operation reducing the number of stored process models and facilitating model management. Business process model abstraction is an operation preserving essential process properties and leaving out insignificant details in order to retain information relevant for a particular purpose. Process model abstraction has been addressed by several researchers. The focus of their studies has been on particular use cases and model transformations supporting these use cases. This thesis systematically approaches the problem of business process model abstraction shaping the outcome into a framework. We investigate the current industry demand in abstraction summarizing it in a catalog of business process model abstraction use cases. The thesis focuses on one prominent use case where the user demands a model with coarse-grained activities and overall process ordering constraints. We develop model transformations that support this use case starting with the transformations based on process model structure analysis. Further, abstraction methods considering the semantics of process model elements are investigated. First, we suggest how semantically related activities can be discovered in process models-a barely researched challenge. The thesis validates the designed abstraction methods against sets of industrial process models and discusses the method implementation aspects. Second, we develop a novel model transformation, which combined with the related activity discovery allows flexible non-hierarchical abstraction. In this way this thesis advocates novel model transformations that facilitate business process model management and provides the foundations for innovative tool support.
We construct a new RC phase shift network based Chua's circuit, which exhibits a period-doubling bifurcation route to chaos. Using coupled versions of such a phase-shift network based Chua's oscillators, we describe a new method for achieving complete synchronization (CS), approximate lag synchronization (LS), and approximate anticipating synchronization (AS) without delay or parameter mismatch. Employing the Pecora and Carroll approach, chaos synchronization is achieved in coupled chaotic oscillators, where the drive system variables control the response system. As a result, AS or LS or CS is demonstrated without using a variable delay line both experimentally and numerically.
The notion of coherence relations is quite widely accepted in general, but concrete proposals differ considerably on the questions of how they should be motivated, which relations are to be assumed, and how they should be defined. This paper takes a "bottom-up" perspective by assessing the contribution made by linguistic signals (connectives), using insights from the relevant literature as well as verification by practical text annotation. We work primarily with the German language here and focus on the realm of contrast. Thus, we suggest a new inventory of contrastive connective functions and discuss their relationship to contrastive coherence relations that have been proposed in earlier work.
Programmers make many changes to the program to eventually find a good solution for a given task. In this course of change, every intermediate development state can of value, when, for example, a promising ideas suddenly turn out inappropriate or the interplay of objects turns out more complex than initially expected before making changes. Programmers would benefit from tool support that provides immediate access to source code and run-time of previous development states of interest. We present IDE extensions, implemented for Squeak/Smalltalk, to preserve, retrieve, and work with this information. With such tool support, programmers can work without worries because they can rely on tools that help them with whatever their explorations will reveal. They no longer have to follow certain best practices only to avoid undesired consequences of changing code.
The use of neural networks is considered as the state of the art in the field of image classification. A large number of different networks are available for this purpose, which, appropriately trained, permit a high level of classification accuracy. Typically, these networks are applied to uncompressed image data, since a corresponding training was also carried out using image data of similar high quality. However, if image data contains image errors, the classification accuracy deteriorates drastically. This applies in particular to coding artifacts which occur due to image and video compression. Typical application scenarios for video compression are narrowband transmission channels for which video coding is required but a subsequent classification is to be carried out on the receiver side. In this paper we present a special H.264/Advanced Video Codec (AVC) based video codec that allows certain regions of a picture to be coded with near constant picture quality in order to allow a reliable classification using neural networks, whereas the remaining image will be coded using constant bit rate. We have combined this feature with the ability to run with lowest latency properties, which is usually also required in remote control applications scenarios. The codec has been implemented as a fully hardwired High Definition video capable hardware architecture which is suitable for Field Programmable Gate Arrays.
Background: The biological interpretation of large-scale gene expression data is one of the paramount challenges in current bioinformatics. In particular, placing the results in the context of other available functional genomics data, such as existing bio-ontologies, has already provided substantial improvement for detecting and categorizing genes of interest. One common approach is to look for functional annotations that are significantly enriched within a group or cluster of genes, as compared to a reference group. Results: In this work, we suggest the information-theoretic concept of mutual information to investigate the relationship between groups of genes, as given by data-driven clustering, and their respective functional categories. Drawing upon related approaches (Gibbons and Roth, Genome Research 12: 1574-1581, 2002), we seek to quantify to what extent individual attributes are sufficient to characterize a given group or cluster of genes. Conclusion: We show that the mutual information provides a systematic framework to assess the relationship between groups or clusters of genes and their functional annotations in a quantitative way. Within this framework, the mutual information allows us to address and incorporate several important issues, such as the interdependence of functional annotations and combinatorial combinations of attributes. It thus supplements and extends the conventional search for overrepresented attributes within a group or cluster of genes. In particular taking combinations of attributes into account, the mutual information opens the way to uncover specific functional descriptions of a group of genes or clustering result. All datasets and functional annotations used in this study are publicly available. All scripts used in the analysis are provided as additional files.
And/Or reasoning graphs for determining prime implicants in multi-level combinational networks
(1997)
PLATON
(2019)
Lesson planning is both an important and demanding task—especially as part of teacher training. This paper presents the requirements for a lesson planning system and evaluates existing systems regarding these requirements. One major drawback of existing software tools is that most are limited to a text- or form-based representation of the lesson designs. In this article, a new approach with a graphical, time-based representation with (automatic) analyses methods is proposed and the system architecture and domain model are described in detail. The approach is implemented in an interactive, web-based prototype called PLATON, which additionally supports the management of lessons in units as well as the modelling of teacher and student-generated resources. The prototype was evaluated in a study with 61 prospective teachers (bachelor’s and master’s preservice teachers as well as teacher trainees in post-university teacher training) in Berlin, Germany, with a focus on usability. The results show that this approach proofed usable for lesson planning and offers positive effects for the perception of time and self-reflection.
PLATON
(2019)
Lesson planning is both an important and demanding task—especially as part of teacher training. This paper presents the requirements for a lesson planning system and evaluates existing systems regarding these requirements. One major drawback of existing software tools is that most are limited to a text- or form-based representation of the lesson designs. In this article, a new approach with a graphical, time-based representation with (automatic) analyses methods is proposed and the system architecture and domain model are described in detail. The approach is implemented in an interactive, web-based prototype called PLATON, which additionally supports the management of lessons in units as well as the modelling of teacher and student-generated resources. The prototype was evaluated in a study with 61 prospective teachers (bachelor’s and master’s preservice teachers as well as teacher trainees in post-university teacher training) in Berlin, Germany, with a focus on usability. The results show that this approach proofed usable for lesson planning and offers positive effects for the perception of time and self-reflection.
A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we will stabilize the unbiased generalization error estimates by regularization and finally obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to the kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilization effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error, between SIC and the expected generalization error and propose determining the degree of regularization of SIC such that the estimator of the expected squared error is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method works effectively for improving the precision of SIC, especially in the high-noise-level cases. We furthermore compare the proposed method to the original SIC, the cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results
Think logarithmically!
(2015)
We discuss here a number of algorithmic topics which we
use in our teaching and in learning of mathematics and informatics to
illustrate and document the power of logarithm in designing very efficient
algorithms and computations – logarithmic thinking is one of the
most important key competencies for solving real world practical problems.
We demonstrate also how to introduce logarithm independently
of mathematical formalism using a conceptual model for reducing a
problem size by at least half. It is quite surprising that the idea, which
leads to logarithm, is present in Euclid’s algorithm described almost
2000 years before John Napier invented logarithm.
In recent years, many efforts have been made to apply image processing techniques for plant leaf identification. However, categorizing leaf images at the cultivar/variety level, because of the very low inter-class variability, is still a challenging task. In this research, we propose an automatic discriminative method based on convolutional neural networks (CNNs) for classifying 12 different cultivars of common beans that belong to three various species. We show that employing advanced loss functions, such as Additive Angular Margin Loss and Large Margin Cosine Loss, instead of the standard softmax loss function for the classification can yield better discrimination between classes and thereby mitigate the problem of low inter-class variability. The method was evaluated by classifying species (level I), cultivars from the same species (level II), and cultivars from different species (level III), based on images from the leaf foreside and backside. The results indicate that the performance of the classification algorithm on the leaf backside image dataset is superior. The maximum mean classification accuracies of 95.86, 91.37 and 86.87% were obtained at the levels I, II and III, respectively. The proposed method outperforms the previous relevant works and provides a reliable approach for plant cultivars identification.
Geocoder accuracy ranking
(2014)
Finding an address on a map is sometimes tricky: the chosen map application may be unfamiliar with the enclosed region. There are several geocoders on the market, they have different databases and algorithms to compute the query. Consequently, the geocoding results differ in their quality. Fortunately the geocoders provide a rich set of metadata. The workflow described in this paper compares this metadata with the aim to find out which geocoder is offering the best-fitting coordinate for a given address.
Biology has made great progress in identifying and measuring the building blocks of life. The availability of high-throughput methods in molecular biology has dramatically accelerated the growth of biological knowledge for various organisms. The advancements in genomic, proteomic and metabolomic technologies allow for constructing complex models of biological systems. An increasing number of biological repositories is available on the web, incorporating thousands of biochemical reactions and genetic regulations. Systems Biology is a recent research trend in life science, which fosters a systemic view on biology. In Systems Biology one is interested in integrating the knowledge from all these different sources into models that capture the interaction of these entities. By studying these models one wants to understand the emerging properties of the whole system, such as robustness. However, both measurements as well as biological networks are prone to considerable incompleteness, heterogeneity and mutual inconsistency, which makes it highly non-trivial to draw biologically meaningful conclusions in an automated way. Therefore, we want to promote Answer Set Programming (ASP) as a tool for discrete modeling in Systems Biology. ASP is a declarative problem solving paradigm, in which a problem is encoded as a logic program such that its answer sets represent solutions to the problem. ASP has intrinsic features to cope with incompleteness, offers a rich modeling language and highly efficient solving technology. We present ASP solutions, for the analysis of genetic regulatory networks, determining consistency with observed measurements and identifying minimal causes for inconsistency. We extend this approach for computing minimal repairs on model and data that restore consistency. This method allows for predicting unobserved data even in case of inconsistency. Further, we present an ASP approach to metabolic network expansion. This approach exploits the easy characterization of reachability in ASP and its various reasoning methods, to explore the biosynthetic capabilities of metabolic reaction networks and generate hypotheses for extending the network. Finally, we present the BioASP library, a Python library which encapsulates our ASP solutions into the imperative programming paradigm. The library allows for an easy integration of ASP solution into system rich environments, as they exist in Systems Biology.
One of the goals of artificial intelligence is to develop agents that learn and act in complex environments. Realistic environments typically feature a variable number of objects, relations amongst them, and non-deterministic transition behavior. While standard probabilistic sequence models provide efficient inference and learning techniques for sequential data, they typically cannot fully capture the relational complexity. On the other hand, statistical relational learning techniques are often too inefficient to cope with complex sequential data. In this paper, we introduce a simple model that occupies an intermediate position in this expressiveness/efficiency trade-off. It is based on CP-logic (Causal Probabilistic Logic), an expressive probabilistic logic for modeling causality. However, by specializing CP-logic to represent a probability distribution over sequences of relational state descriptions and employing a Markov assumption, inference and learning become more tractable and effective. Specifically, we show how to solve part of the inference and learning problems directly at the first-order level, while transforming the remaining part into the problem of computing all satisfying assignments for a Boolean formula in a binary decision diagram. We experimentally validate that the resulting technique is able to handle probabilistic relational domains with a substantial number of objects and relations.
The usage of mobile devices is rapidly growing with Android being the most prevalent mobile operating system. Thanks to the vast variety of mobile applications, users are preferring smartphones over desktops for day to day tasks like Internet surfing. Consequently, smartphones store a plenitude of sensitive data. This data together with the high values of smartphones make them an attractive target for device/data theft (thieves/malicious applications).
Unfortunately, state-of-the-art anti-theft solutions do not work if they do not have an active network connection, e.g., if the SIM card was removed from the device. In the majority of these cases, device owners permanently lose their smartphone together with their personal data, which is even worse.
Apart from that malevolent applications perform malicious activities to steal sensitive information from smartphones. Recent research considered static program analysis to detect dangerous data leaks. These analyses work well for data leaks due to inter-component communication, but suffer from shortcomings for inter-app communication with respect to precision, soundness, and scalability.
This thesis focuses on enhancing users' privacy on Android against physical device loss/theft and (un)intentional data leaks. It presents three novel frameworks: (1) ThiefTrap, an anti-theft framework for Android, (2) IIFA, a modular inter-app intent information flow analysis of Android applications, and (3) PIAnalyzer, a precise approach for PendingIntent vulnerability analysis.
ThiefTrap is based on a novel concept of an anti-theft honeypot account that protects the owner's data while preventing a thief from resetting the device.
We implemented the proposed scheme and evaluated it through an empirical user study with 35 participants. In this study, the owner's data could be protected, recovered, and anti-theft functionality could be performed unnoticed from the thief in all cases.
IIFA proposes a novel approach for Android's inter-component/inter-app communication (ICC/IAC) analysis. Our main contribution is the first fully automatic, sound, and precise ICC/IAC information flow analysis that is scalable for realistic apps due to modularity, avoiding combinatorial explosion: Our approach determines communicating apps using short summaries rather than inlining intent calls between components and apps, which requires simultaneously analyzing all apps installed on a device.
We evaluate IIFA in terms of precision, recall, and demonstrate its scalability to a large corpus of real-world apps. IIFA reports 62 problematic ICC-/IAC-related information flows via two or more apps/components.
PIAnalyzer proposes a novel approach to analyze PendingIntent related vulnerabilities. PendingIntents are a powerful and universal feature of Android for inter-component communication. We empirically evaluate PIAnalyzer on a set of 1000 randomly selected applications and find 1358 insecure usages of PendingIntents, including 70 severe vulnerabilities.
Many Android applications embed webpages via WebView components and execute JavaScript code within Android. Hybrid applications leverage dedicated APIs to load a resource and render it in a WebView. Furthermore, Android objects can be shared with the JavaScript world. However, bridging the interfaces of the Android and JavaScript world might also incur severe security threats: Potentially untrusted webpages and their JavaScript might interfere with the Android environment and its access to native features.
No general analysis is currently available to assess the implications of such hybrid apps bridging the two worlds. To understand the semantics and effects of hybrid apps, we perform a large-scale study on the usage of the hybridization APIs in the wild. We analyze and categorize the parameters to hybridization APIs for 7,500 randomly selected and the 196 most popular applications from the Google Playstore as well as 1000 malware samples. Our results advance the general understanding of hybrid applications, as well as implications for potential program analyses, and the current security situation: We discovered thousands of flows of sensitive data from Android to JavaScript, the vast majority of which could flow to potentially untrustworthy code. Our analysis identified numerous web pages embedding vulnerabilities, which we exemplarily exploited. Additionally, we discovered a multitude of applications in which potentially untrusted JavaScript code may interfere with (trusted) Android objects, both in benign and malign applications.
Answer set planning
(2022)
Answer Set Planning refers to the use of Answer Set Programming (ASP) to compute plans, that is, solutions to planning problems, that transform a given state of the world to another state. The development of efficient and scalable answer set solvers has provided a significant boost to the development of ASP-based planning systems. This paper surveys the progress made during the last two and a half decades in the area of answer set planning, from its foundations to its use in challenging planning domains. The survey explores the advantages and disadvantages of answer set planning. It also discusses typical applications of answer set planning and presents a set of challenges for future research.
This thesis addresses real-time rendering techniques for 3D information lenses based on the focus & context metaphor. It analyzes, conceives, implements, and reviews its applicability to objects and structures of virtual 3D city models. In contrast to digital terrain models, the application of focus & context visualization to virtual 3D city models is barely researched. However, the purposeful visualization of contextual data of is extreme importance for the interactive exploration and analysis of this field. Programmable hardware enables the implementation of new lens techniques, that allow the augmentation of the perceptive and cognitive quality of the visualization compared to classical perspective projections. A set of 3D information lenses is integrated into a 3D scene-graph system: • Occlusion lenses modify the appearance of virtual 3D city model objects to resolve their occlusion and consequently facilitate the navigation. • Best-view lenses display city model objects in a priority-based manner and mediate their meta information. Thus, they support exploration and navigation of virtual 3D city models. • Color and deformation lenses modify the appearance and geometry of 3D city models to facilitate their perception. The presented techniques for 3D information lenses and their application to virtual 3D city models clarify their potential for interactive visualization and form a base for further development.
Interactive rendering techniques for focus+context visualization of 3D geovirtual environments
(2013)
This thesis introduces a collection of new real-time rendering techniques and applications for focus+context visualization of interactive 3D geovirtual environments such as virtual 3D city and landscape models. These environments are generally characterized by a large number of objects and are of high complexity with respect to geometry and textures. For these reasons, their interactive 3D rendering represents a major challenge. Their 3D depiction implies a number of weaknesses such as occlusions, cluttered image contents, and partial screen-space usage. To overcome these limitations and, thus, to facilitate the effective communication of geo-information, principles of focus+context visualization can be used for the design of real-time 3D rendering techniques for 3D geovirtual environments (see Figure). In general, detailed views of a 3D geovirtual environment are combined seamlessly with abstracted views of the context within a single image. To perform the real-time image synthesis required for interactive visualization, dedicated parallel processors (GPUs) for rasterization of computer graphics primitives are used. For this purpose, the design and implementation of appropriate data structures and rendering pipelines are necessary. The contribution of this work comprises the following five real-time rendering methods: • The rendering technique for 3D generalization lenses enables the combination of different 3D city geometries (e.g., generalized versions of a 3D city model) in a single image in real time. The method is based on a generalized and fragment-precise clipping approach, which uses a compressible, raster-based data structure. It enables the combination of detailed views in the focus area with the representation of abstracted variants in the context area. • The rendering technique for the interactive visualization of dynamic raster data in 3D geovirtual environments facilitates the rendering of 2D surface lenses. It enables a flexible combination of different raster layers (e.g., aerial images or videos) using projective texturing for decoupling image and geometry data. Thus, various overlapping and nested 2D surface lenses of different contents can be visualized interactively. • The interactive rendering technique for image-based deformation of 3D geovirtual environments enables the real-time image synthesis of non-planar projections, such as cylindrical and spherical projections, as well as multi-focal 3D fisheye-lenses and the combination of planar and non-planar projections. • The rendering technique for view-dependent multi-perspective views of 3D geovirtual environments, based on the application of global deformations to the 3D scene geometry, can be used for synthesizing interactive panorama maps to combine detailed views close to the camera (focus) with abstract views in the background (context). This approach reduces occlusions, increases the usage the available screen space, and reduces the overload of image contents. • The object-based and image-based rendering techniques for highlighting objects and focus areas inside and outside the view frustum facilitate preattentive perception. The concepts and implementations of interactive image synthesis for focus+context visualization and their selected applications enable a more effective communication of spatial information, and provide building blocks for design and development of new applications and systems in the field of 3D geovirtual environments.
The submission and management of computational jobs is a traditional part of utility computing environments. End users and developers of domain-specific software abstractions often have to deal with the heterogeneity of such batch processing systems. This lead to a number of application programming interface and job description standards in the past, which are implemented and established for cluster and Grid systems. With the recent rise of cloud computing as new utility computing paradigm, the standardized access to batch processing facilities operated on cloud resources becomes an important issue. Furthermore, the design of such a standard has to consider a tradeoff between feature completeness and the achievable level of interoperability. The article discusses this general challenge, and presents some existing standards with traditional cluster and Grid computing background that may be applicable to cloud environments. We present OCCI-DRMAA as one approach for standardized access to batch processing facilities hosted in a cloud.
Monitoring virtual team collaboration : methods, applications and experiences in engineering design
(2010)
Computational analysis of virtual team collaboration in teh early stages of engineering design
(2010)
The concept of Linked Data has made its entrance in the cultural heritage sector due to its potential use for the integration of heterogeneous collections and deriving additional value out of existing metadata. However, practitioners and researchers alike need a better understanding of what outcome they can reasonably expect of the reconciliation process between their local metadata and established controlled vocabularies which are already a part of the Linked Data cloud. This paper offers an in-depth analysis of how a locally developed vocabulary can be successfully reconciled with the Library of Congress Subject Headings (LCSH) and the Arts and Architecture Thesaurus (AAT) through the help of a general-purpose tool for interactive data transformation (OpenRefine). Issues negatively affecting the reconciliation process are identified and solutions are proposed in order to derive maximum value from existing metadata and controlled vocabularies in an automated manner.
Deciphering the functioning of biological networks is one of the central tasks in systems biology. In particular, signal transduction networks are crucial for the understanding of the cellular response to external and internal perturbations. Importantly, in order to cope with the complexity of these networks, mathematical and computational modeling is required. We propose a computational modeling framework in order to achieve more robust discoveries in the context of logical signaling networks. More precisely, we focus on modeling the response of logical signaling networks by means of automated reasoning using Answer Set Programming (ASP). ASP provides a declarative language for modeling various knowledge representation and reasoning problems. Moreover, available ASP solvers provide several reasoning modes for assessing the multitude of answer sets. Therefore, leveraging its rich modeling language and its highly efficient solving capacities, we use ASP to address three challenging problems in the context of logical signaling networks: learning of (Boolean) logical networks, experimental design, and identification of intervention strategies. Overall, the contribution of this thesis is three-fold. Firstly, we introduce a mathematical framework for characterizing and reasoning on the response of logical signaling networks. Secondly, we contribute to a growing list of successful applications of ASP in systems biology. Thirdly, we present a software providing a complete pipeline for automated reasoning on the response of logical signaling networks.
Boolean networks provide a simple yet powerful qualitative modeling approach in systems biology. However, manual identification of logic rules underlying the system being studied is in most cases out of reach. Therefore, automated inference of Boolean logical networks from experimental data is a fundamental question in this field. This paper addresses the problem consisting of learning from a prior knowledge network describing causal interactions and phosphorylation activities at a pseudo-steady state, Boolean logic models of immediate-early response in signaling transduction networks. The underlying optimization problem has been so far addressed through mathematical programming approaches and the use of dedicated genetic algorithms. In a recent work we have shown severe limitations of stochastic approaches in this domain and proposed to use Answer Set Programming (ASP), considering a simpler problem setting. Herein, we extend our previous work in order to consider more realistic biological conditions including numerical datasets, the presence of feedback-loops in the prior knowledge network and the necessity of multi-objective optimization. In order to cope with such extensions, we propose several discretization schemes and elaborate upon our previous ASP encoding. Towards real-world biological data, we evaluate the performance of our approach over in silico numerical datasets based on a real and large-scale prior knowledge network. The correctness of our encoding and discretization schemes are dealt with in Appendices A-B. (C) 2014 Elsevier B.V. All rights reserved.
Exploratory Data Analysis
(2014)
In bioinformatics the term exploratory data analysis refers to different methods to get an overview of large biological data sets. Hence, it helps to create a framework for further analysis and hypothesis testing. The workflow facilitates this first important step of the data analysis created by high-throughput technologies. The results are different plots showing the structure of the measurements. The goal of the workflow is the automatization of the exploratory data analysis, but also the flexibility should be guaranteed. The basic tool is the free software R.
In recent years, named entity linking (NEL) tools were primarily developed in terms of a general approach, whereas today numerous tools are focusing on specific domains such as e.g. the mapping of persons and organizations only, or the annotation of locations or events in microposts. However, the available benchmark datasets necessary for the evaluation of NEL tools do not reflect this focalizing trend. We have analyzed the evaluation process applied in the NEL benchmarking framework GERBIL [in: Proceedings of the 24th International Conference on World Wide Web (WWW’15), International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 2015, pp. 1133–1143, Semantic Web 9(5) (2018), 605–625] and all its benchmark datasets. Based on these insights we have extended the GERBIL framework to enable a more fine grained evaluation and in depth analysis of the available benchmark datasets with respect to different emphases. This paper presents the implementation of an adaptive filter for arbitrary entities and customized benchmark creation as well as the automated determination of typical NEL benchmark dataset properties, such as the extent of content-related ambiguity and diversity. These properties are integrated on different levels, which also enables to tailor customized new datasets out of the existing ones by remixing documents based on desired emphases. Besides a new system library to enrich provided NIF [in: International Semantic Web Conference (ISWC’13), Lecture Notes in Computer Science, Vol. 8219, Springer, Berlin, Heidelberg, 2013, pp. 98–113] datasets with statistical information, best practices for dataset remixing are presented, and an in depth analysis of the performance of entity linking systems on special focus datasets is presented.
Formalizing informal logic
(2015)
In this paper we investigate the extent to which formal argumentation models can handle ten basic characteristics of informal logic identified in the informal logic literature. By showing how almost all of these characteristics can be successfully modelled formally, we claim that good progress can be made toward the project of formalizing informal logic. Of the formal argumentation models available, we chose the Carneades Argumentation System (CAS), a formal, computational model of argument that uses argument graphs as its basis, structures of a kind very familiar to practitioners of informal logic through their use of argument diagrams.