Institut für Informatik und Computational Science
Refine
Year of publication
- 2019 (21) (remove)
Document Type
- Article (10)
- Doctoral Thesis (4)
- Other (4)
- Postprint (3)
Language
- English (21)
Is part of the Bibliography
- yes (21)
Keywords
- Equilibrium logic (3)
- answer set programming (3)
- Answer Set Programming (2)
- Answer set programming (2)
- Non-monotonic reasoning (2)
- automatic feedback (2)
- lesson planning (2)
- lesson preparation (2)
- support system (2)
- Aggregates (1)
- Android Security (1)
- Argumentation (1)
- Augenbewegungen (1)
- Beweis (1)
- Beweisassistent (1)
- Beweisumgebung (1)
- Bot Detection (1)
- Computer security (1)
- Coq (1)
- Curry (1)
- Customer ownership (1)
- Denotational semantics (1)
- Digitalization (1)
- Emotionen (1)
- Emotionsforschung (1)
- Entity Linking (1)
- Explicit negation (1)
- Forgetting (1)
- Formalismus (1)
- Formalitätsgrad (1)
- GERBIL (1)
- Gesichtsausdruck (1)
- Graph Convolutional Neural Networks (1)
- Graph Embedding (1)
- Https traffic (1)
- Insurance industry (1)
- Logik (1)
- Machine learning (1)
- Mathematikdidaktik (1)
- Mathematikphilosophie (1)
- Multi-sided platforms (1)
- Neural networks (1)
- OSSE (1)
- Objektive Schwierigkeit (1)
- Privacy Protection (1)
- SWOT (1)
- Social Media Analysis (1)
- Static Analysis (1)
- Strong equivalence (1)
- Taktik (1)
- Traffic data (1)
- Value network (1)
- Wahrnehmung (1)
- Wahrnehmung von Arousal (1)
- Wahrnehmungsunterschiede (1)
- action and change (1)
- argumentation (1)
- arousal perception (1)
- automated planning (1)
- benchmark (1)
- bioinformatics (1)
- biometrics (1)
- biometrische Identifikation (1)
- computational methods (1)
- computergestützte Methoden (1)
- correlated errors (1)
- degree of formality (1)
- detrending (1)
- emotion (1)
- emotion representation (1)
- emotion research (1)
- ensemble kalman filter (1)
- epistemic logic programs (1)
- epistemic specifications (1)
- evaluation (1)
- explicit negation (1)
- eye movements (1)
- face tracking (1)
- facial expression (1)
- formalism (1)
- gap-filling (1)
- hybrid solving (1)
- knowledge representation and nonmonotonic reasoning (1)
- linear programming (1)
- logic (1)
- mathematics education (1)
- metabolic network (1)
- non-monotonic reasoning (1)
- objective difficulty (1)
- perception (1)
- perception differences (1)
- philosophy of mathematics (1)
- probabilistic deep learning (1)
- probabilistic deep metric learning (1)
- probabilistische tiefe neuronale Netze (1)
- probabilistisches tiefes metrisches Lernen (1)
- projection (1)
- proof (1)
- proof assistant (1)
- proof environment (1)
- tactic (1)
- technical notes and rapid communications (1)
Institute
A central insight from psychological studies on human eye movements is that eye movement patterns are highly individually characteristic. They can, therefore, be used as a biometric feature, that is, subjects can be identified based on their eye movements. This thesis introduces new machine learning methods to identify subjects based on their eye movements while viewing arbitrary content. The thesis focuses on probabilistic modeling of the problem, which has yielded the best results in the most recent literature. The thesis studies the problem in three phases by proposing a purely probabilistic, probabilistic deep learning, and probabilistic deep metric learning approach. In the first phase, the thesis studies models that rely on psychological concepts about eye movements. Recent literature illustrates that individual-specific distributions of gaze patterns can be used to accurately identify individuals. In these studies, models were based on a simple parametric family of distributions. Such simple parametric models can be robustly estimated from sparse data, but have limited flexibility to capture the differences between individuals. Therefore, this thesis proposes a semiparametric model of gaze patterns that is flexible yet robust for individual identification. These patterns can be understood as domain knowledge derived from psychological literature. Fixations and saccades are examples of simple gaze patterns. The proposed semiparametric densities are drawn under a Gaussian process prior centered at a simple parametric distribution. Thus, the model will stay close to the parametric class of densities if little data is available, but it can also deviate from this class if enough data is available, increasing the flexibility of the model. The proposed method is evaluated on a large-scale dataset, showing significant improvements over the state-of-the-art. Later, the thesis replaces the model based on gaze patterns derived from psychological concepts with a deep neural network that can learn more informative and complex patterns from raw eye movement data. As previous work has shown that the distribution of these patterns across a sequence is informative, a novel statistical aggregation layer called the quantile layer is introduced. It explicitly fits the distribution of deep patterns learned directly from the raw eye movement data. The proposed deep learning approach is end-to-end learnable, such that the deep model learns to extract informative, short local patterns while the quantile layer learns to approximate the distributions of these patterns. Quantile layers are a generic approach that can converge to standard pooling layers or have a more detailed description of the features being pooled, depending on the problem. The proposed model is evaluated in a large-scale study using the eye movements of subjects viewing arbitrary visual input. The model improves upon the standard pooling layers and other statistical aggregation layers proposed in the literature. It also improves upon the state-of-the-art eye movement biometrics by a wide margin. Finally, for the model to identify any subject — not just the set of subjects it is trained on — a metric learning approach is developed. Metric learning learns a distance function over instances. The metric learning model maps the instances into a metric space, where sequences of the same individual are close, and sequences of different individuals are further apart. This thesis introduces a deep metric learning approach with distributional embeddings. The approach represents sequences as a set of continuous distributions in a metric space; to achieve this, a new loss function based on Wasserstein distances is introduced. The proposed method is evaluated on multiple domains besides eye movement biometrics. This approach outperforms the state of the art in deep metric learning in several domains while also outperforming the state of the art in eye movement biometrics.
A common feature in Answer Set Programming is the use of a second negation, stronger than default negation and sometimes called explicit, strong or classical negation. This explicit negation is normally used in front of atoms, rather than allowing its use as a regular operator. In this paper we consider the arbitrary combination of explicit negation with nested expressions, as those defined by Lifschitz, Tang and Turner. We extend the concept of reduct for this new syntax and then prove that it can be captured by an extension of Equilibrium Logic with this second negation. We study some properties of this variant and compare to the already known combination of Equilibrium Logic with Nelson's strong negation.
In this work we tackle the problem of checking strong equivalence of logic programs that may contain local auxiliary atoms, to be removed from their stable models and to be forbidden in any external context. We call this property projective strong equivalence (PSE). It has been recently proved that not any logic program containing auxiliary atoms can be reformulated, under PSE, as another logic program or formula without them – this is known as strongly persistent forgetting. In this paper, we introduce a conservative extension of Equilibrium Logic and its monotonic basis, the logic of Here-and-There, in which we deal with a new connective ‘|’ we call fork. We provide a semantic characterisation of PSE for forks and use it to show that, in this extension, it is always possible to forget auxiliary atoms under strong persistence. We further define when the obtained fork is representable as a regular formula.
Detect me if you can
(2019)
Spam Bots have become a threat to online social networks with their malicious behavior, posting misinformation messages and influencing online platforms to fulfill their motives. As spam bots have become more advanced over time, creating algorithms to identify bots remains an open challenge. Learning low-dimensional embeddings for nodes in graph structured data has proven to be useful in various domains. In this paper, we propose a model based on graph convolutional neural networks (GCNN) for spam bot detection. Our hypothesis is that to better detect spam bots, in addition to defining a features set, the social graph must also be taken into consideration. GCNNs are able to leverage both the features of a node and aggregate the features of a node’s neighborhood. We compare our approach, with two methods that work solely on a features set and on the structure of the graph. To our knowledge, this work is the first attempt of using graph convolutional neural networks in spam bot detection.
In this thesis we introduce the concept of the degree of formality. It is directed against a dualistic point of view, which only distinguishes between formal and informal proofs. This dualistic attitude does not respect the differences between the argumentations classified as informal and it is unproductive because the individual potential of the respective argumentation styles cannot be appreciated and remains untapped.
This thesis has two parts. In the first of them we analyse the concept of the degree of formality (including a discussion about the respective benefits for each degree) while in the second we demonstrate its usefulness in three case studies. In the first case study we will repair Haskell B. Curry's view of mathematics, which incidentally is of great importance in the first part of this thesis, in light of the different degrees of formality. In the second case study we delineate how awareness of the different degrees of formality can be used to help students to learn how to prove. Third, we will show how the advantages of proofs of different degrees of formality can be combined by the development of so called tactics having a medium degree of formality. Together the three case studies show that the degrees of formality provide a convincing solution to the problem of untapped potential.
A distinguishing feature of Answer Set Programming is that all atoms belonging to a stable model must be founded. That is, an atom must not only be true but provably true. This can be made precise by means of the constructive logic of Here-and-There, whose equilibrium models correspond to stable models. One way of looking at foundedness is to regard Boolean truth values as ordered by letting true be greater than false. Then, each Boolean variable takes the smallest truth value that can be proven for it. This idea was generalized by Aziz to ordered domains and applied to constraint satisfaction problems. As before, the idea is that a, say integer, variable gets only assigned to the smallest integer that can be justified. In this paper, we present a logical reconstruction of Aziz’ idea in the setting of the logic of Here-and-There. More precisely, we start by defining the logic of Here-and-There with lower bound founded variables along with its equilibrium models and elaborate upon its formal properties. Finally, we compare our approach with related ones and sketch future work.
Answer Set Programming (ASP) has become a popular and widespread paradigm for practical Knowledge Representation thanks to its expressiveness and the available enhancements of its input language. One of such enhancements is the use of aggregates, for which different semantic proposals have been made. In this paper, we show that any ASP aggregate interpreted under Gelfond and Zhang's (GZ) semantics can be replaced (under strong equivalence) by a propositional formula. Restricted to the original GZ syntax, the resulting formula is reducible to a disjunction of conjunctions of literals but the formulation is still applicable even when the syntax is extended to allow for arbitrary formulas (including nested aggregates) in the condition. Once GZ-aggregates are represented as formulas, we establish a formal comparison (in terms of the logic of Here-and-There) to Ferraris' (F) aggregates, which are defined by a different formula translation involving nested implications. In particular, we prove that if we replace an F-aggregate by a GZ-aggregate in a rule head, we do not lose answer sets (although more can be gained). This extends the previously known result that the opposite happens in rule bodies, i.e., replacing a GZ-aggregate by an F-aggregate in the body may yield more answer sets. Finally, we characterize a class of aggregates for which GZ- and F-semantics coincide.
plasp 3
(2019)
We describe the new version of the Planning Domain Definition Language (PDDL)-to-Answer Set Programming (ASP) translator plasp. First, it widens the range of accepted PDDL features. Second, it contains novel planning encodings, some inspired by Satisfiability Testing (SAT) planning and others exploiting ASP features such as well-foundedness. All of them are designed for handling multivalued fluents in order to capture both PDDL as well as SAS planning formats. Third, enabled by multishot ASP solving, it offers advanced planning algorithms also borrowed from SAT planning. As a result, plasp provides us with an ASP-based framework for studying a variety of planning techniques in a uniform setting. Finally, we demonstrate in an empirical analysis that these techniques have a significant impact on the performance of ASP planning.
In a recent line of research, two familiar concepts from logic programming semantics (unfounded sets and splitting) were extrapolated to the case of epistemic logic programs. The property of epistemic splitting provides a natural and modular way to understand programs without epistemic cycles but, surprisingly, was only fulfilled by Gelfond's original semantics (G91), among the many proposals in the literature. On the other hand, G91 may suffer from a kind of self-supported, unfounded derivations when epistemic cycles come into play. Recently, the absence of these derivations was also formalised as a property of epistemic semantics called foundedness. Moreover, a first semantics proved to satisfy foundedness was also proposed, the so-called Founded Autoepistemic Equilibrium Logic (FAEEL). In this paper, we prove that FAEEL also satisfies the epistemic splitting property something that, together with foundedness, was not fulfilled by any other approach up to date. To prove this result, we provide an alternative characterisation of FAEEL as a combination of G91 with a simpler logic we called Founded Epistemic Equilibrium Logic (FEEL), which is somehow an extrapolation of the stable model semantics to the modal logic S5.
In this paper, we consider counting and projected model counting of extensions in abstract argumentation for various semantics. When asking for projected counts we are interested in counting the number of extensions of a given argumentation framework while multiple extensions that are identical when restricted to the projected arguments count as only one projected extension. We establish classical complexity results and parameterized complexity results when the problems are parameterized by treewidth of the undirected argumentation graph. To obtain upper bounds for counting projected extensions, we introduce novel algorithms that exploit small treewidth of the undirected argumentation graph of the input instance by dynamic programming (DP). Our algorithms run in time double or triple exponential in the treewidth depending on the considered semantics. Finally, we take the exponential time hypothesis (ETH) into account and establish lower bounds of bounded treewidth algorithms for counting extensions and projected extension.
Metabolic networks play a crucial role in biology since they capture all chemical reactions in an organism. While there are networks of high quality for many model organisms, networks for less studied organisms are often of poor quality and suffer from incompleteness. To this end, we introduced in previous work an answer set programming (ASP)-based approach to metabolic network completion. Although this qualitative approach allows for restoring moderately degraded networks, it fails to restore highly degraded ones. This is because it ignores quantitative constraints capturing reaction rates. To address this problem, we propose a hybrid approach to metabolic network completion that integrates our qualitative ASP approach with quantitative means for capturing reaction rates. We begin by formally reconciling existing stoichiometric and topological approaches to network completion in a unified formalism. With it, we develop a hybrid ASP encoding and rely upon the theory reasoning capacities of the ASP system dingo for solving the resulting logic program with linear constraints over reals. We empirically evaluate our approach by means of the metabolic network of Escherichia coli. Our analysis shows that our novel approach yields greatly superior results than obtainable from purely qualitative or quantitative approaches.
The Surface Water and Ocean Topography (SWOT) mission is a next generation satellite mission expected to provide a 2 km-resolution observation of the sea surface height (SSH) on a two-dimensional swath. Processing SWOT data will be challenging because of the large amount of data, the mismatch between a high spatial resolution and a low temporal resolution, and the observation errors. The present paper focuses on the reduction of the spatially structured errors of SWOT SSH data. It investigates a new error reduction method and assesses its performance in an observing system simulation experiment. The proposed error-reduction method first projects the SWOT SSH onto a subspace spanned by the SWOT spatially structured errors. This projection is removed from the SWOT SSH to obtain a detrended SSH. The detrended SSH is then processed within an ensemble data assimilation analysis to retrieve a full SSH field. In the latter step, the detrending is applied to both the SWOT data and an ensemble of model-simulated SSH fields. Numerical experiments are performed with synthetic SWOT observations and an ensemble from a North Atlantic, 1/60 degrees simulation of the ocean circulation (NATL60). The data assimilation analysis is carried out with an ensemble Kalman filter. The results are assessed with root mean square errors, power spectrum density, and spatial coherence. They show that a significant part of the large scale SWOT errors is reduced. The filter analysis also reduces the small scale errors and allows for an accurate recovery of the energy of the signal down to 25 km scales. In addition, using the SWOT nadir data to adjust the SSH detrending further reduces the errors.
Multi-sided platforms (MSP) strongly affect markets and play a crucial part within the digital and networked economy. Although empirical evidence indicates their occurrence in many industries, research has not investigated the game-changing impact of MSP on traditional markets to a sufficient extent. More specifically, we have little knowledge of how MSP affect value creation and customer interaction in entire markets, exploiting the potential of digital technologies to offer new value propositions. Our paper addresses this research gap and provides an initial systematic approach to analyze the impact of MSP on the insurance industry. For this purpose, we analyze the state of the art in research and practice in order to develop a reference model of the value network for the insurance industry. On this basis, we conduct a case-study analysis to discover and analyze roles which are occupied or even newly created by MSP. As a final step, we categorize MSP with regard to their relation to traditional insurance companies, resulting in a classification scheme with four MSP standard types: Competition, Coordination, Cooperation, Collaboration.
Detection of malware-infected computers and detection of malicious web domains based on their encrypted HTTPS traffic are challenging problems, because only addresses, timestamps, and data volumes are observable. The detection problems are coupled, because infected clients tend to interact with malicious domains. Traffic data can be collected at a large scale, and antivirus tools can be used to identify infected clients in retrospect. Domains, by contrast, have to be labeled individually after forensic analysis. We explore transfer learning based on sluice networks; this allows the detection models to bootstrap each other. In a large-scale experimental study, we find that the model outperforms known reference models and detects previously unknown malware, previously unknown malware families, and previously unknown malicious domains.
The target article discusses the question of how educational makerspaces can become places supportive of knowledge construction. This question is too often neglected by people who run makerspaces, as they mostly explain how to use different tools and focus on the creation of a product. In makerspaces, often pupils also engage in physical computing activities and thus in the creation of interactive artifacts containing embedded systems, such as smart shoes or wristbands, plant monitoring systems or drink mixing machines. This offers the opportunity to reflect on teaching physical computing in computer science education, where similarly often the creation of the product is so strongly focused upon that the reflection of the learning process is pushed into the background.
Emotions are a central element of human experience. They occur with high frequency in everyday life and play an important role in decision making. However, currently there is no consensus among researchers on what constitutes an emotion and on how emotions should be investigated. This dissertation identifies three problems of current emotion research: the problem of ground truth, the problem of incomplete constructs and the problem of optimal representation. I argue for a focus on the detailed measurement of emotion manifestations with computer-aided methods to solve these problems. This approach is demonstrated in three research projects, which describe the development of methods specific to these problems as well as their application to concrete research questions.
The problem of ground truth describes the practice to presuppose a certain structure of emotions as the a priori ground truth. This determines the range of emotion descriptions and sets a standard for the correct assignment of these descriptions. The first project illustrates how this problem can be circumvented with a multidimensional emotion perception paradigm which stands in contrast to the emotion recognition paradigm typically employed in emotion research. This paradigm allows to calculate an objective difficulty measure and to collect subjective difficulty ratings for the perception of emotional stimuli. Moreover, it enables the use of an arbitrary number of emotion stimuli categories as compared to the commonly used six basic emotion categories. Accordingly, we collected data from 441 participants using dynamic facial expression stimuli from 40 emotion categories. Our findings suggest an increase in emotion perception difficulty with increasing actor age and provide evidence to suggest that young adults, the elderly and men underestimate their emotion perception difficulty. While these effects were predicted from the literature, we also found unexpected and novel results. In particular, the increased difficulty on the objective difficulty measure for female actors and observers stood in contrast to reported findings. Exploratory analyses revealed low relevance of person-specific variables for the prediction of emotion perception difficulty, but highlighted the importance of a general pleasure dimension for the ease of emotion perception.
The second project targets the problem of incomplete constructs which relates to vaguely defined psychological constructs on emotion with insufficient ties to tangible manifestations. The project exemplifies how a modern data collection method such as face tracking data can be used to sharpen these constructs on the example of arousal, a long-standing but fuzzy construct in emotion research. It describes how measures of distance, speed and magnitude of acceleration can be computed from face tracking data and investigates their intercorrelations. We find moderate to strong correlations among all measures of static information on one hand and all measures of dynamic information on the other. The project then investigates how self-rated arousal is tied to these measures in 401 neurotypical individuals and 19 individuals with autism. Distance to the neutral face was predictive of arousal ratings in both groups. Lower mean arousal ratings were found for the autistic group, but no difference in correlation of the measures and arousal ratings could be found between groups. Results were replicated in a high autistic traits group consisting of 41 participants. The findings suggest a qualitatively similar perception of arousal for individuals with and without autism. No correlations between valence ratings and any of the measures could be found which emphasizes the specificity of our tested measures for the construct of arousal.
The problem of optimal representation refers to the search for the best representation of emotions and the assumption that there is a one-fits-all solution. In the third project we introduce partial least squares analysis as a general method to find an optimal representation to relate two high-dimensional data sets to each other. The project demonstrates its applicability to emotion research on the question of emotion perception differences between men and women. The method was used with emotion rating data from 441 participants and face tracking data computed on 306 videos. We found quantitative as well as qualitative differences in the perception of emotional facial expressions between these groups. We showed that women’s emotional perception systematically captured more of the variance in facial expressions. Additionally, we could show that significant differences exist in the way that women and men perceive some facial expressions which could be visualized as concrete facial expression sequences. These expressions suggest differing perceptions of masked and ambiguous facial expressions between the sexes. In order to facilitate use of the developed method by the research community, a package for the statistical environment R was written. Furthermore, to call attention to the method and its usefulness for emotion research, a website was designed that allows users to explore a model of emotion ratings and facial expression data in an interactive fashion.
PLATON
(2019)
Lesson planning is both an important and demanding task—especially as part of teacher training. This paper presents the requirements for a lesson planning system and evaluates existing systems regarding these requirements. One major drawback of existing software tools is that most are limited to a text- or form-based representation of the lesson designs. In this article, a new approach with a graphical, time-based representation with (automatic) analyses methods is proposed and the system architecture and domain model are described in detail. The approach is implemented in an interactive, web-based prototype called PLATON, which additionally supports the management of lessons in units as well as the modelling of teacher and student-generated resources. The prototype was evaluated in a study with 61 prospective teachers (bachelor’s and master’s preservice teachers as well as teacher trainees in post-university teacher training) in Berlin, Germany, with a focus on usability. The results show that this approach proofed usable for lesson planning and offers positive effects for the perception of time and self-reflection.
PLATON
(2019)
Lesson planning is both an important and demanding task—especially as part of teacher training. This paper presents the requirements for a lesson planning system and evaluates existing systems regarding these requirements. One major drawback of existing software tools is that most are limited to a text- or form-based representation of the lesson designs. In this article, a new approach with a graphical, time-based representation with (automatic) analyses methods is proposed and the system architecture and domain model are described in detail. The approach is implemented in an interactive, web-based prototype called PLATON, which additionally supports the management of lessons in units as well as the modelling of teacher and student-generated resources. The prototype was evaluated in a study with 61 prospective teachers (bachelor’s and master’s preservice teachers as well as teacher trainees in post-university teacher training) in Berlin, Germany, with a focus on usability. The results show that this approach proofed usable for lesson planning and offers positive effects for the perception of time and self-reflection.
The usage of mobile devices is rapidly growing with Android being the most prevalent mobile operating system. Thanks to the vast variety of mobile applications, users are preferring smartphones over desktops for day to day tasks like Internet surfing. Consequently, smartphones store a plenitude of sensitive data. This data together with the high values of smartphones make them an attractive target for device/data theft (thieves/malicious applications).
Unfortunately, state-of-the-art anti-theft solutions do not work if they do not have an active network connection, e.g., if the SIM card was removed from the device. In the majority of these cases, device owners permanently lose their smartphone together with their personal data, which is even worse.
Apart from that malevolent applications perform malicious activities to steal sensitive information from smartphones. Recent research considered static program analysis to detect dangerous data leaks. These analyses work well for data leaks due to inter-component communication, but suffer from shortcomings for inter-app communication with respect to precision, soundness, and scalability.
This thesis focuses on enhancing users' privacy on Android against physical device loss/theft and (un)intentional data leaks. It presents three novel frameworks: (1) ThiefTrap, an anti-theft framework for Android, (2) IIFA, a modular inter-app intent information flow analysis of Android applications, and (3) PIAnalyzer, a precise approach for PendingIntent vulnerability analysis.
ThiefTrap is based on a novel concept of an anti-theft honeypot account that protects the owner's data while preventing a thief from resetting the device.
We implemented the proposed scheme and evaluated it through an empirical user study with 35 participants. In this study, the owner's data could be protected, recovered, and anti-theft functionality could be performed unnoticed from the thief in all cases.
IIFA proposes a novel approach for Android's inter-component/inter-app communication (ICC/IAC) analysis. Our main contribution is the first fully automatic, sound, and precise ICC/IAC information flow analysis that is scalable for realistic apps due to modularity, avoiding combinatorial explosion: Our approach determines communicating apps using short summaries rather than inlining intent calls between components and apps, which requires simultaneously analyzing all apps installed on a device.
We evaluate IIFA in terms of precision, recall, and demonstrate its scalability to a large corpus of real-world apps. IIFA reports 62 problematic ICC-/IAC-related information flows via two or more apps/components.
PIAnalyzer proposes a novel approach to analyze PendingIntent related vulnerabilities. PendingIntents are a powerful and universal feature of Android for inter-component communication. We empirically evaluate PIAnalyzer on a set of 1000 randomly selected applications and find 1358 insecure usages of PendingIntents, including 70 severe vulnerabilities.