004 Data processing; Computer science
PC2P
(2021)
Motivation:
Prediction of protein complexes from protein-protein interaction (PPI) networks is an important problem in systems biology, as they control different cellular functions. The existing solutions employ algorithms for network community detection that identify dense subgraphs in PPI networks. However, gold standards in yeast and human indicate that protein complexes can also induce sparse subgraphs, introducing further challenges in protein complex prediction.
Results:
To address this issue, we formalize protein complexes as biclique spanned subgraphs, which include both sparse and dense subgraphs. We then cast the problem of protein complex prediction as a network partitioning into biclique spanned subgraphs with removal of a minimum number of edges, called a coherent partition. Since finding a coherent partition is a computationally intractable problem, we devise a parameter-free greedy approximation algorithm, termed Protein Complexes from Coherent Partition (PC2P), based on key properties of biclique spanned subgraphs. Through comparison with nine contenders, we demonstrate that PC2P: (i) successfully identifies modular structure in networks, as a prerequisite for protein complex prediction, (ii) outperforms the existing solutions with respect to a composite score of five performance measures on 75% and 100% of the analyzed PPI networks and gold standards in yeast and human, respectively, and (iii, iv) does not compromise GO semantic similarity and enrichment score of the predicted protein complexes. Therefore, our study demonstrates that clustering of networks in terms of biclique spanned subgraphs is a promising framework for detection of complexes in PPI networks.
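The notion of a biclique spanned subgraph can be illustrated with a small check. This is a minimal sketch and not the PC2P algorithm: it assumes that "biclique spanned" means the graph contains a spanning complete bipartite subgraph, which holds exactly when the complement graph is disconnected. The function name `is_biclique_spanned` is hypothetical.

```python
from collections import deque


def is_biclique_spanned(vertices, edges):
    """Check for a spanning complete bipartite subgraph (assumed definition).

    A split of the vertices into two non-empty sides with every cross pair
    adjacent exists exactly when the complement graph is disconnected, so we
    run a breadth-first search over non-edges.
    """
    vertices = list(vertices)
    if len(vertices) < 2:
        return False
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)

    start = vertices[0]
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in vertices:
            # w is a neighbour of u in the complement if they are NOT adjacent.
            if w not in seen and w not in adjacent[u]:
                seen.add(w)
                queue.append(w)
    # Complement disconnected <=> some vertex was not reached.
    return len(seen) < len(vertices)


# A star is sparse but biclique spanned; a path on four vertices is not.
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
```

The star example shows why this formalization covers sparse subgraphs as well as dense ones: the star has only three edges, yet the centre on one side and the leaves on the other form a spanning biclique.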
In recent years, many efforts have been made to apply image processing techniques for plant leaf identification. However, categorizing leaf images at the cultivar/variety level is still a challenging task because of the very low inter-class variability. In this research, we propose an automatic discriminative method based on convolutional neural networks (CNNs) for classifying 12 different cultivars of common beans that belong to three different species. We show that employing advanced loss functions, such as Additive Angular Margin Loss and Large Margin Cosine Loss, instead of the standard softmax loss function for the classification can yield better discrimination between classes and thereby mitigate the problem of low inter-class variability. The method was evaluated by classifying species (level I), cultivars from the same species (level II), and cultivars from different species (level III), based on images from the leaf foreside and backside. The results indicate that the performance of the classification algorithm on the leaf backside image dataset is superior. The maximum mean classification accuracies of 95.86, 91.37 and 86.87% were obtained at the levels I, II and III, respectively. The proposed method outperforms the previous relevant works and provides a reliable approach for plant cultivar identification.
We study the classical, two-sided stable marriage problem under pairwise preferences. In the most general setting, agents are allowed to express their preferences as comparisons of any two of their edges, and they also have the right to declare a draw or even withdraw from such a comparison. This freedom is then gradually restricted as we specify six stages of orderedness in the preferences, ending with the classical case of strictly ordered lists. We study all cases occurring when combining the three known notions of stability (weak, strong, and super-stability) under the assumption that each side of the bipartite market obtains one of the six degrees of orderedness. By designing three polynomial algorithms and two NP-completeness proofs, we determine the complexity of all cases not yet known and thus give an exact boundary in terms of preference structure between tractable and intractable cases.
Our input is a complete graph G on n vertices where each vertex has a strict ranking of all other vertices in G. The goal is to construct a matching in G that is popular. A matching M is popular if M does not lose a head-to-head election against any matching M': here each vertex casts a vote for the matching in {M, M'} in which it gets a better assignment. Popular matchings need not exist in the given instance G and the popular matching problem is to decide whether one exists or not. The popular matching problem in G is easy to solve for odd n. Surprisingly, the problem becomes NP-complete for even n, as we show here. This is one of the few graph theoretic problems efficiently solvable when n has one parity and NP-complete when n has the other parity.
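The head-to-head election in the definition above can be made concrete with a brute-force sketch for very small instances. It is not the algorithm from the paper. It assumes that an unmatched vertex ranks being unmatched below every partner, and all names (`votes`, `all_matchings`, `is_popular`, `triangle`) are hypothetical.

```python
def rank(prefs, v, partner):
    """Position of partner in v's list; being unmatched is worst (assumption)."""
    return len(prefs[v]) if partner is None else prefs[v].index(partner)


def votes(prefs, m1, m2):
    """Return (votes for m1, votes for m2); indifferent vertices abstain."""
    for_m1 = sum(rank(prefs, v, m1[v]) < rank(prefs, v, m2[v]) for v in prefs)
    for_m2 = sum(rank(prefs, v, m2[v]) < rank(prefs, v, m1[v]) for v in prefs)
    return for_m1, for_m2


def all_matchings(vertices):
    """Yield every matching (perfect or not) as a vertex -> partner dict."""
    if not vertices:
        yield {}
        return
    v, rest = vertices[0], vertices[1:]
    for m in all_matchings(rest):          # v stays unmatched
        yield {**m, v: None}
    for i, u in enumerate(rest):           # v is matched to u
        for m in all_matchings(rest[:i] + rest[i + 1:]):
            yield {**m, v: u, u: v}


def is_popular(prefs, matching):
    """True if the matching loses no head-to-head election (exponential time)."""
    for other in all_matchings(list(prefs)):
        mine, theirs = votes(prefs, matching, other)
        if theirs > mine:
            return False
    return True


# Three vertices with cyclic preferences: every matching loses some election.
triangle = {"a": ["b", "c"], "b": ["c", "a"], "c": ["a", "b"]}
has_popular = any(is_popular(triangle, m) for m in all_matchings(list(triangle)))
```

The cyclic triangle is a standard small instance in which no popular matching exists, which matches the remark that popular matchings need not exist. The enumeration grows exponentially and is only meant for checking the definition by hand.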
VLDB 2021
(2021)
The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned.
The devil in disguise
(2021)
Envy constitutes a serious issue on Social Networking Sites (SNSs), as this painful emotion can severely diminish individuals' well-being. With prior research mainly focusing on the affective consequences of envy in the SNS context, its behavioral consequences remain puzzling. While negative interactions among SNS users are an alarming issue, it remains unclear to which extent the harmful emotion of malicious envy contributes to these toxic dynamics. This study constitutes a first step in understanding malicious envy’s causal impact on negative interactions within the SNS sphere. Within an online experiment, we experimentally induce malicious envy and measure its immediate impact on users’ negative behavior towards other users. Our findings show that malicious envy seems to be an essential factor fueling negativity among SNS users and further illustrate that this effect is especially pronounced when users are provided an objective factor to mask their envy and justify their norm-violating negative behavior.
Viper
(2021)
Key-value stores (KVSs) have found wide application in modern software systems. For persistence, their data resides in slow secondary storage, which requires KVSs to employ various techniques to increase their read and write performance from and to the underlying medium. Emerging persistent memory (PMem) technologies offer data persistence at close-to-DRAM speed, making them a promising alternative to classical disk-based storage. However, simply drop-in replacing existing storage with PMem does not yield good results, as block-based access behaves differently in PMem than on disk and ignores PMem's byte addressability, layout, and unique performance characteristics. In this paper, we propose three PMem-specific access patterns and implement them in a hybrid PMem-DRAM KVS called Viper. We employ a DRAM-based hash index and a PMem-aware storage layout to utilize the random-write speed of DRAM and the efficient sequential-write performance of PMem. Our evaluation shows that Viper significantly outperforms existing KVSs for core KVS operations while providing full data persistence. Moreover, Viper outperforms existing PMem-only, hybrid, and disk-based KVSs by 4-18x for write workloads, while matching or surpassing their get performance.
Self-directed learning with online courses is gaining increasing acceptance in our society. With online courses, learners can decide for themselves what they learn and when, and courses can be adapted and individualized to users' learning progress in a variety of ways. On the one hand, there is a large target group for these learning offerings. On the other hand, creating, providing, maintaining, and supporting online courses is costly, so high-quality offerings often have to be fee-based for providers to at least break even. In this article, we examine and discuss an open, sustainable, data-driven two-sided business model for exploiting vetted online courses and providing them free of charge to every learner. The core of the business model is the use of the behavioral data generated in the process, the derivation of personality traits and interests that this makes possible, and their use in a commercial context. This method is already widely accepted in web search and is here transferred to the learning context. To show which opportunities, challenges, and barriers must be addressed for the business model to work sustainably and in an ethically acceptable way, two independent but synergistically connected business models are presented and discussed. In addition, the target group's acceptance and expectations of the presented business model were examined in order to derive the core resources needed in practice. The results show that users fundamentally accept the business model. 10% of respondents would prefer to learn with virtual assistants rather than with tutors. Moreover, the majority of users are not aware that personality traits can be derived from user behavior.
In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are routinely solved using Bellman's backward induction. Textbook authors (e.g. Bertsekas or Puterman) typically give more or less formal proofs to show that the backward induction algorithm is correct as a solution method for deterministic and stochastic SDPs. Botta, Jansson and Ionescu propose a generic framework for finite-horizon, monadic SDPs together with a monadic version of backward induction for solving such SDPs. In monadic SDPs, the monad captures a generic notion of uncertainty, while a generic measure function aggregates rewards. In the present paper, we define a notion of correctness for monadic SDPs and identify three conditions that allow us to prove a correctness result for monadic backward induction that is comparable to textbook correctness proofs for ordinary backward induction. The conditions that we impose are fairly general and can be cast in category-theoretical terms using the notion of Eilenberg-Moore algebra. They hold in familiar settings like those of deterministic or stochastic SDPs, but we also give examples in which they fail. Our results show that backward induction can safely be employed for a broader class of SDPs than usually treated in textbooks. However, they also rule out certain instances that were considered admissible in the context of Botta et al.'s generic framework. Our development is formalised in Idris as an extension of the Botta et al. framework and the sources are available as supplementary material.
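For orientation, ordinary backward induction for a finite-horizon stochastic SDP can be sketched as follows. This is the textbook special case, not the monadic Idris formalisation: uncertainty is a plain probability distribution, the measure is the expected value, and the state space is assumed to be the same at every step. The function name, its parameters, and the toy problem are all hypothetical.

```python
def backward_induction(states, actions, transition, reward, horizon):
    """Return (policy, value) for a finite-horizon SDP.

    policy[t][s] is the decision at step t in state s; value[s] is the optimal
    expected total reward over `horizon` steps starting in s.
    transition(s, a) yields (next_state, probability) pairs.
    """
    value = {s: 0.0 for s in states}      # zero steps left: nothing to gain
    policy = []
    for _ in range(horizon):
        new_value, rule = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for a in actions(s):
                # Expected immediate reward plus value of the remaining steps.
                q = sum(p * (reward(s, a, s2) + value[s2])
                        for s2, p in transition(s, a))
                if q > best_q:
                    best_action, best_q = a, q
            new_value[s], rule[s] = best_q, best_action
        value = new_value
        policy.insert(0, rule)            # rules are computed last step first
    return policy, value


# Toy deterministic problem: walk along 0-1-2 and collect the state reached.
states = [0, 1, 2]
actions = lambda s: ["stay", "right"]
transition = lambda s, a: [(min(s + 1, 2) if a == "right" else s, 1.0)]
reward = lambda s, a, s2: float(s2)
policy, value = backward_induction(states, actions, transition, reward, horizon=2)
```

In the toy problem, two steps starting from state 0 are best spent moving right twice, collecting rewards 1 and 2.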
The increasing demand for software engineers cannot completely be fulfilled by university education and conventional training approaches due to limited capacities. Accordingly, an alternative approach is necessary where potential software engineers are being educated in software engineering skills using new methods. We suggest micro tasks combined with theoretical lessons to overcome existing skill deficits and acquire fast trainable capabilities. This paper addresses the gap between demand and supply of software engineers by introducing an action-oriented and scenario-based didactical approach, which enables non-computer scientists to code. Therein, the learning content is provided in small tasks and embedded in learning factory scenarios. Therefore, different requirements for software engineers from the market side and from an academic viewpoint are analyzed and synthesized into an integrated, yet condensed skills catalogue. This enables the development of training and education units that focus on the most important skills demanded on the market. To achieve this objective, individual learning scenarios are developed. Of course, proper basic skills in coding cannot be learned overnight, but software programming is also no sorcery.
To compete in the digitalized economy, the company, the market, and especially customers must be understood in detail. Next to the "big players" from Silicon Valley, German small and medium-sized enterprises (the Mittelstand), which to a large extent still operate on IT infrastructures and processes that have grown over time, often look outdated. To avoid being left behind entirely in the coming years, a fundamental change is necessary. Both the service creation processes and the range of services must be made transparent and data-driven. Only then can business transactions, market events, and the actions of the actors be assessed in an integrated way and well-founded decisions be made. This article presents the concept of the Data-Driven Organization and shows how companies can determine their own analytics maturity level and raise it in an iterative transformation process.
Machine learning for improvement of thermal conditions inside a hybrid ventilated animal building
(2021)
In buildings with hybrid ventilation, natural ventilation opening positions (windows), mechanical ventilation rates, heating, and cooling are manipulated to maintain desired thermal conditions. The indoor temperature is regulated solely by ventilation (natural and mechanical) when the external conditions are favorable to save external heating and cooling energy. The ventilation parameters are determined by a rule-based control scheme, which is not optimal. This study proposes a methodology to enable real-time optimum control of ventilation parameters. We developed offline prediction models to estimate future thermal conditions from data collected from the building in operation. The developed offline model is then used to find the optimal controllable ventilation parameters in real-time to minimize the setpoint deviation in the building. With the proposed methodology, the experimental building's setpoint deviation improved 87% of the time, by 0.53 °C on average, compared to the current deviations.
Graphs play an important role in many areas of Computer Science. In particular, our work is motivated by model-driven software development and by graph databases. For this reason, it is very important to have the means to express and to reason about the properties that a given graph may satisfy. With this aim, in this paper we present a visual logic that allows us to describe graph properties, including navigational properties, i.e., properties about the paths in a graph. The logic is equipped with a deductive tableau method that we have proved to be sound and complete.
Despite the phenomenal growth of Big Data Analytics in the last few years, little research has been done to explicate the relationship between Big Data Analytics Capability (BDAC) and indirect strategic value derived from such digital capabilities. We attempt to address this gap by proposing a conceptual model of the BDAC-innovation relationship using dynamic capability theory. The work expands on BDAC business value research and extends the nominal research done on BDAC and innovation. We focus on BDAC's relationship with different innovation objects, namely product, business process, and business model innovation, impacting all value chain activities. The insights gained will stimulate academic and practitioner interest in explicating strategic value generated from BDAC and serve as a framework for future research on the subject.
Coherent network partitions
(2021)
We continue to study coherent partitions of graphs whereby the vertex set is partitioned into subsets that induce biclique spanned subgraphs. The problem of identifying the minimum number of edges whose removal yields biclique spanned connected components (CNP), called the coherence number, is NP-hard even on bipartite graphs. Here, we propose a graph transformation geared towards obtaining an O(log n)-approximation algorithm for the CNP on a bipartite graph with n vertices. The transformation is inspired by a new characterization of biclique spanned subgraphs. In addition, we study coherent partitions on prime graphs, and show that finding coherent partitions reduces to the problem of finding coherent partitions in a prime graph. Therefore, these results provide future directions for approximation algorithms for the coherence number of a given graph.
In the field of Business Process Management (BPM), modeling business processes and related data is a critical issue since process activities need to manage data stored in databases. The connection between processes and data is usually handled at the implementation level, even if modeling both processes and data at the conceptual level should help designers in improving business process models and identifying requirements for implementation. Especially in data- and decision-intensive contexts, business process activities need to access data stored both in databases and data warehouses. In this paper, we complete our approach for defining a novel conceptual view that bridges process activities and data. The proposed approach allows the designer to model the connection between business processes and database models and define the operations to perform, providing interesting insights on the overall connected perspective and hints for identifying activities that are crucial for decision support.
Despite advances in machine learning-based clinical prediction models, only a few such models are actually deployed in clinical contexts. Among other reasons, this is due to a lack of validation studies. In this paper, we present and discuss the validation results of a machine learning model for the prediction of acute kidney injury in cardiac surgery patients, initially developed on the MIMIC-III dataset, when applied to an external cohort of an American research hospital. To help account for the performance differences observed, we utilized interpretability methods based on feature importance, which allowed experts to scrutinize model behavior both at the global and local level, making it possible to gain further insights into why it did not behave as expected on the validation cohort. The knowledge gleaned upon derivation can be potentially useful to assist model update during validation for more generalizable and simpler models. We argue that interpretability methods should be considered by practitioners as a further tool to help explain performance differences and inform model update in validation studies.
Data encoding has been applied to database systems for decades as it mitigates bandwidth bottlenecks and reduces storage requirements. But even in the presence of these advantages, most in-memory database systems use data encoding only conservatively as the negative impact on runtime performance can be severe. Real-world systems with large parts being infrequently accessed and cost efficiency constraints in cloud environments require solutions that automatically and efficiently select encoding techniques, including heavy-weight compression. In this paper, we introduce workload-driven approaches to automatically determine memory budget-constrained encoding configurations using greedy heuristics and linear programming. We show for TPC-H, TPC-DS, and the Join Order Benchmark that optimized encoding configurations can reduce the main memory footprint significantly without a loss in runtime performance over state-of-the-art dictionary encoding. To yield robust selections, we extend the linear programming-based approach to incorporate query runtime constraints and mitigate unexpected performance regressions.
Helping overcome distance, the use of videoconferencing tools has surged during the pandemic. To shed light on the consequences of videoconferencing at work, this study takes a granular look at the implications of the self-view feature for meeting outcomes. Building on self-awareness research and self-regulation theory, we argue that by heightening the state of self-awareness, self-view engagement depletes participants’ mental resources and thereby can undermine online meeting outcomes. Evaluation of our theoretical model on a sample of 179 employees reveals a nuanced picture. Self-view engagement while speaking and while listening is positively associated with self-awareness, which, in turn, is negatively associated with satisfaction with meeting process, perceived productivity, and meeting enjoyment. The criticality of the communication role is put forward: looking at self while listening to other attendees has a negative direct and indirect effect on meeting outcomes; however, looking at self while speaking produces equivocal effects.
Correction to: Knowledge bases and software support for variant interpretation in precision oncology
(2021)
Precision oncology is a rapidly evolving interdisciplinary medical specialty. Comprehensive cancer panels are becoming increasingly available at pathology departments worldwide, creating the urgent need for scalable cancer variant annotation and molecularly informed treatment recommendations. A wealth of mainly academia-driven knowledge bases calls for software tools supporting the multi-step diagnostic process. We derive a comprehensive list of knowledge bases relevant for variant interpretation by a review of existing literature followed by a survey among medical experts from university hospitals in Germany. In addition, we review cancer variant interpretation tools, which integrate multiple knowledge bases. We categorize the knowledge bases along the diagnostic process in precision oncology and analyze programmatic access options as well as the integration of knowledge bases into software tools. The most commonly used knowledge bases provide good programmatic access options and have been integrated into a range of software tools. For the wider set of knowledge bases, access options vary across different parts of the diagnostic process. Programmatic access is limited for information regarding clinical classifications of variants and for therapy recommendations. The main issue for databases used for biological classification of pathogenic variants and pathway context information is the lack of standardized interfaces. There is no single cancer variant interpretation tool that integrates all identified knowledge bases. Specialized tools are available and need to be further developed for different steps in the diagnostic process.
Phe2vec
(2021)
Robust phenotyping of patients from electronic health records (EHRs) at scale is a challenge in clinical informatics. Here, we introduce Phe2vec, an automated framework for disease phenotyping from EHRs based on unsupervised learning and assess its effectiveness against standard rule-based algorithms from Phenotype KnowledgeBase (PheKB). Phe2vec is based on pre-computing embeddings of medical concepts and patients' clinical history. Disease phenotypes are then derived from a seed concept and its neighbors in the embedding space. Patients are linked to a disease if their embedded representation is close to the disease phenotype. Comparing Phe2vec and PheKB cohorts head-to-head using chart review, Phe2vec performed on par or better in nine out of ten diseases. Differently from other approaches, it can scale to any condition and was validated against widely adopted expert-based standards. Phe2vec aims to optimize clinical informatics research by augmenting current frameworks to characterize patients by condition and derive reliable disease cohorts.
Student teachers often struggle to keep track of everything that is happening in the classroom, and particularly to notice and respond when students cause disruptions. The complexity of the classroom environment is a potential contributing factor that has not been empirically tested. In this experimental study, we utilized a virtual reality (VR) classroom to examine whether classroom complexity affects the likelihood of student teachers noticing disruptions and how they react after noticing. Classroom complexity was operationalized as the number of disruptions and the existence of overlapping disruptions (multidimensionality) as well as the existence of parallel teaching tasks (simultaneity). Results showed that student teachers (n = 50) were less likely to notice the scripted disruptions, and also less likely to respond to the disruptions in a comprehensive and effortful manner when facing greater complexity. These results may have implications for both teacher training and the design of VR for training or research purposes. This study contributes to the field in two respects: 1) it revealed how features of the classroom environment can affect student teachers' noticing of and reaction to disruptions; and 2) it extends the functionality of the VR environment from a teacher training tool to a testbed of fundamental classroom processes that are difficult to manipulate in real life.
We present a general approach to planning with incomplete information in Answer Set Programming (ASP). More precisely, we consider the problems of conformant and conditional planning with sensing actions and assumptions. We represent planning problems using a simple formalism where logic programs describe the transition function between states, the initial states and the goal states. For solving planning problems, we use Quantified Answer Set Programming (QASP), an extension of ASP with existential and universal quantifiers over atoms that is analogous to Quantified Boolean Formulas (QBFs). We define the language of quantified logic programs and use it to represent the solutions to different variants of conformant and conditional planning. On the practical side, we present a translation-based QASP solver that converts quantified logic programs into QBFs and then executes a QBF solver, and we experimentally evaluate the approach on conformant and conditional planning benchmarks.
Business processes are often specified in descriptive or normative models. Both types of models should adhere to internal and external regulations, such as company guidelines or laws. Employing compliance checking techniques, it is possible to verify process models against rules. While traditionally compliance checking focuses on well-structured processes, we address case management scenarios. In case management, knowledge workers drive multi-variant and adaptive processes. Our contribution is based on the fragment-based case management approach, which splits a process into a set of fragments. The fragments are synchronized through shared data but can, otherwise, be dynamically instantiated and executed. We formalize case models using Petri nets. We demonstrate the formalization for design-time and run-time compliance checking and present a proof-of-concept implementation. The application of the implemented compliance checking approach to a use case exemplifies its effectiveness while designing a case model. The empirical evaluation on a set of case models for measuring the performance of the approach shows that rules can often be checked in less than a second.
I can see it in your eyes
(2021)
Over the past years, extensive research has been dedicated to developing robust platforms and data-driven dialog models to support long-term human-robot interactions. However, little is known about how people's perception of robots and engagement with them develop over time and how these can be accurately assessed through implicit and continuous measurement techniques. In this paper, we explore this by involving participants in three interaction sessions with multiple days of zero exposure in between. Each session consists of a joint task with a robot as well as two short social chats with it before and after the task. We measure participants' gaze patterns with a wearable eye-tracker and gauge their perception of the robot and engagement with it and the joint task using questionnaires. Results disclose that aversion of gaze in a social chat is an indicator of a robot's uncanniness and that the more people gaze at the robot in a joint task, the worse they perform. In contrast with most HRI literature, our results show that gaze toward an object of shared attention, rather than gaze toward a robotic partner, is the most meaningful predictor of engagement in a joint task. Furthermore, the analyses of gaze patterns in repeated interactions disclose that people's mutual gaze in a social chat develops congruently with their perceptions of the robot over time. These are key findings for the HRI community as they entail that gaze behavior can be used as an implicit measure of people's perception of robots in a social chat and of their engagement and task performance in a joint task.
A simplified run time analysis of the univariate marginal distribution algorithm on LeadingOnes
(2021)
With elementary means, we prove a stronger run time guarantee for the univariate marginal distribution algorithm (UMDA) optimizing the LEADINGONES benchmark function in the desirable regime with low genetic drift. If the population size is at least quasilinear, then, with high probability, the UMDA samples the optimum in a number of iterations that is linear in the problem size divided by the logarithm of the UMDA's selection rate. This improves over the previous guarantee, obtained by Dang and Lehre (2015) via the deep level-based population method, both in terms of the run time and by demonstrating further run time gains from small selection rates. Under similar assumptions, we prove a lower bound that matches our upper bound up to constant factors.
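To make the analysed setting concrete, here is a minimal sketch of a standard UMDA on LeadingOnes. It assumes the common variant with frequency margins 1/n and 1 - 1/n and requires n >= 2. The parameter names `lam` (samples per iteration) and `mu` (selected samples) and the function names are hypothetical, and the sketch makes no claim to reproduce the run time bounds.

```python
import random


def leading_ones(bits):
    """Number of consecutive ones at the start of the bit string."""
    count = 0
    for b in bits:
        if b != 1:
            break
        count += 1
    return count


def umda_leading_ones(n, lam, mu, max_iters, seed=0):
    """Run UMDA until the all-ones string is sampled or max_iters is reached.

    Returns (best bit string seen, number of iterations used).
    """
    rng = random.Random(seed)
    low, high = 1.0 / n, 1.0 - 1.0 / n    # margins keep frequencies off 0 and 1
    freq = [0.5] * n                      # one marginal frequency per position
    best = [0] * n
    for iteration in range(1, max_iters + 1):
        # Sample lam individuals independently from the product distribution.
        population = [[1 if rng.random() < p else 0 for p in freq]
                      for _ in range(lam)]
        population.sort(key=leading_ones, reverse=True)
        if leading_ones(population[0]) > leading_ones(best):
            best = population[0]
        if leading_ones(best) == n:
            return best, iteration
        # Set each frequency to the share of ones among the mu best samples.
        selected = population[:mu]
        freq = [min(high, max(low, sum(x[i] for x in selected) / mu))
                for i in range(n)]
    return best, max_iters


best, iterations = umda_leading_ones(n=8, lam=200, mu=50, max_iters=200)
```

Positions after the current prefix of ones receive no selection signal, so their frequencies drift randomly. A sufficiently large population, as in the low-genetic-drift regime mentioned above, keeps this drift small.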
Effective query optimization is a core feature of any database management system. While most query optimization techniques make use of simple metadata, such as cardinalities and other basic statistics, other optimization techniques are based on more advanced metadata including data dependencies, such as functional, uniqueness, order, or inclusion dependencies. This survey provides an overview, intuitive descriptions, and classifications of query optimization and execution strategies that are enabled by data dependencies. We consider the most popular types of data dependencies and focus on optimization strategies that target the optimization of relational database queries. The survey supports database vendors to identify optimization opportunities as well as DBMS researchers to find related work and open research questions.
Recently, initial conflicts were introduced in the framework of M-adhesive categories as an important optimization of critical pairs. In particular, they represent a proper subset such that each conflict is represented in a minimal context by a unique initial one. The theory of critical pairs has been extended in the framework of M-adhesive categories to rules with nested application conditions (ACs), restricting the applicability of a rule and generalizing the well-known negative application conditions. A notion of initial conflicts for rules with ACs does not exist yet.
In this paper, on the one hand, we extend the theory of initial conflicts in the framework of M-adhesive categories to transformation rules with ACs. They again represent a proper subset of critical pairs for rules with ACs, and represent each conflict in a minimal context uniquely. They are moreover symbolic because we can show that in general no finite and complete set of conflicts for rules with ACs exists. On the other hand, we show that critical pairs are minimally M-complete, whereas initial conflicts are minimally complete. Finally, we introduce important special cases of rules with ACs for which we can obtain finite, minimally (M-)complete sets of conflicts.
Bitcoin is gaining traction as an alternative store of value. Its market capitalization exceeds that of all other cryptocurrencies in the market. But its high monetary value also makes it an attractive target for cyber criminal actors. Hacking campaigns usually target an ecosystem's weakest points. In Bitcoin, the exchange platforms are one of them. Each exchange breach is a threat not only to direct victims, but to the credibility of Bitcoin's entire ecosystem. Based on an extensive analysis of 36 breaches of Bitcoin exchanges, we show the attack patterns used to exploit Bitcoin exchange platforms using an industry standard for reporting intelligence on cyber security breaches. Based on this, we are able to provide an overview of the most common attack vectors, showing that all except three hacks were possible due to relatively lax security. We show that while the security regimen of Bitcoin exchanges is subpar compared to other financial service providers, the use of stolen credentials, which does not require any hacking, is decreasing. We also show that the amount of BTC taken during a breach is decreasing, as is the number of exchanges that terminate after being breached. Furthermore, we show that the overall security posture has improved but still has major flaws. To discover adversarial methods post-breach, we have analyzed two cases of BTC laundering. Through this analysis we provide insight into how exchange platforms with lax cyber security further increase the intermediary risk they introduce into the Bitcoin ecosystem.
According to the personalization principle, addressing learners by means of a personalized compared to a nonpersonalized message can foster learning. Interestingly, though, a recent study found that the personalization principle can invert for aversive contents. The present study investigated whether the negative effect of a personalized message for an aversive content can be compensated when learners are in a happy mood. It was hypothesized that the negative effect of a personalized compared to a nonpersonalized message would only be observable for participants in a sad mood, while for participants in a happy mood a personalized message should be beneficial. A 2 x 2 between-subject design with mood (happy vs. sad) and personalization (personalized vs. nonpersonalized message) was used (N = 125 university students). Mood was experimentally varied prior to learning. Learning outcomes were measured by a retention and a transfer test. Results were essentially in line with the assumption: For participants in the sad mood condition, a negative effect of a personalized message was observable for retention and transfer. For participants in the happy mood condition, a positive effect of a personalized message was observable for retention, but no effect for transfer. Note that the manipulation check measure for the mood induction procedure did not detect differences between conditions; this may be due to a shortcoming of the measure used (as indicated by an additional evaluation study). The study emphasizes the importance of considering the inherent emotional content of a topic, such as its aversive nature, since the emotional content of a topic can be a boundary condition for design principles in multimedia learning. The study also highlights the complex interplay of externally induced and inherently arising emotions.
Image feature detection is a key task in computer vision. Scale Invariant Feature Transform (SIFT) is a prevalent and well-known algorithm for robust feature detection. However, it is computationally demanding, and software implementations are not suitable for real-time performance. In this paper, a versatile and pipelined hardware implementation is proposed that is capable of computing keypoints and rotation invariant descriptors on-chip. All computations are performed in single precision floating-point format, which makes it possible to implement the original algorithm with little alteration. Various rotation resolutions and filter kernel sizes are supported for images of any resolution up to ultra-high definition. Full high definition images can be processed at 84 fps and ultra-high definition images at 21 fps.
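The two reported frame rates are consistent with a constant pixel throughput. The short check below assumes that full high definition means 1920 x 1080 pixels and ultra-high definition means 3840 x 2160 pixels; the abstract does not state the resolutions, so these values are assumptions.

```python
def pixel_throughput(width, height, fps):
    """Pixels processed per second at a given resolution and frame rate."""
    return width * height * fps


# Assumed resolutions; frame rates are the ones reported in the abstract.
full_hd = pixel_throughput(1920, 1080, 84)
ultra_hd = pixel_throughput(3840, 2160, 21)
```

Under these assumptions both values are equal, which fits a pipelined design whose rate is bounded by pixels per second rather than by frames.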
Perfectionism is a personality disposition characterized by setting extremely high performance standards coupled with critical self-evaluations. Often conceived as positive, perfectionism can yield not only beneficial but also deleterious outcomes ranging from anxiety to burnout. In this proposal, we set out to investigate the role of technology and, particularly, social media in individuals’ strivings for perfection. We lay down theoretical bases for the possibility that social media plays a role in the development of perfectionism. To empirically test the hypothesized relationship, we propose a comprehensive study design based on the experience sampling method. Lastly, we provide an overview of the planned analysis and future steps.
We introduce a new measure of descriptional complexity on finite automata, called the number of active states. Roughly speaking, the number of active states of an automaton A on input w counts the number of different states visited during the most economic computation of the automaton A for the word w. This concept generalizes to finite automata and regular languages in a straightforward way. We show that the number of active states of both finite automata and regular languages is computable, even with respect to nondeterministic finite automata. We further compare the number of active states to related measures for regular languages. In particular, we show incomparability to the radius of regular languages and that the difference between the number of active states and the total number of states needed in finite automata for a regular language can be of exponential order.
Motivation:
Constraint-based modeling approaches allow the estimation of maximal in vivo enzyme catalytic rates that can serve as proxies for enzyme turnover numbers. Yet, genome-scale flux profiling remains a challenge in deploying these approaches to catalogue proxies for enzyme catalytic rates across organisms.
Results:
Here, we formulate a constraint-based approach, termed NIDLE-flux, to estimate fluxes at a genome-scale level by using the principle of efficient usage of expressed enzymes. Using proteomics data from Escherichia coli, we show that the fluxes estimated by NIDLE-flux and the existing approaches are in excellent qualitative agreement (Pearson correlation > 0.9). We also find that the maximal in vivo catalytic rates estimated by NIDLE-flux exhibit a Pearson correlation of 0.74 with in vitro enzyme turnover numbers. However, NIDLE-flux results in a 1.4-fold increase in the size of the estimated maximal in vivo catalytic rates in comparison to the contenders. Integration of the maximum in vivo catalytic rates with publicly available proteomics and metabolomics data provides a better match to fluxes estimated by NIDLE-flux. Therefore, NIDLE-flux facilitates more effective usage of proteomics data to estimate proxies for kcatomes.
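The quantities compared above can be illustrated with a small sketch. It is not the NIDLE-flux formulation itself: it assumes the common convention that a maximal in vivo catalytic rate is the largest ratio of flux to enzyme abundance across conditions, and the function names are hypothetical.

```python
import math


def max_apparent_rate(fluxes, abundances):
    """Largest flux / enzyme-abundance ratio across conditions.

    Conditions in which the enzyme was not measured (abundance 0) are skipped.
    Returns None if no condition is usable.
    """
    rates = [v / e for v, e in zip(fluxes, abundances) if e > 0]
    return max(rates) if rates else None


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Applying `max_apparent_rate` per enzyme and then `pearson` against in vitro turnover numbers would reproduce the kind of comparison reported above, given suitable input data.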
A core operator of evolutionary algorithms (EAs) is mutation. Recently, much attention has been devoted to the study of mutation operators with dynamic and non-uniform mutation rates. Following up on this area of work, we propose a new mutation operator and analyze its performance on the (1 + 1) Evolutionary Algorithm (EA). Our analyses show that this mutation operator competes with pre-existing ones when used by the (1 + 1) EA on classes of problems for which results on the other mutation operators are available. We show that the (1 + 1) EA using our mutation operator finds a (1/3)-approximation on any non-negative submodular function in polynomial time. We also consider the problem of maximizing a symmetric submodular function under a single matroid constraint and show that the (1 + 1) EA using our operator finds a (1/3)-approximation within polynomial time. This performance matches that of combinatorial local search algorithms specifically designed to solve these problems and outperforms them with constant probability. Finally, we evaluate the performance of the (1 + 1) EA using our operator experimentally by considering two applications: (a) the maximum directed cut problem on real-world graphs of different origins, with up to 6.6 million vertices and 56 million edges, and (b) the symmetric mutual information problem using a four-month air pollution data set. In comparison with uniform mutation and a recently proposed dynamic scheme, our operator comes out on top on these instances.
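For orientation, the baseline setting can be sketched as a (1 + 1) EA with uniform mutation on the maximum directed cut problem. The abstract does not specify the proposed operator, so this sketch uses standard bit mutation with rate 1/n instead; the function names are assumptions.

```python
import random


def dicut_value(edges, in_set):
    """Number of directed edges (u, v) with u inside the set and v outside."""
    return sum(1 for u, v in edges if in_set[u] and not in_set[v])


def one_plus_one_ea(n, edges, iters, rng):
    """(1 + 1) EA with uniform mutation (each bit flips with probability 1/n).

    Returns the best vertex subset found (as a list of booleans) and its
    directed cut value.
    """
    x = [False] * n
    fx = dicut_value(edges, x)
    for _ in range(iters):
        # Create one offspring by flipping each bit independently.
        y = [(not b) if rng.random() < 1.0 / n else b for b in x]
        fy = dicut_value(edges, y)
        # Elitist acceptance: keep the offspring if it is at least as good.
        if fy >= fx:
            x, fx = y, fy
    return x, fx
```

Replacing the single line that creates the offspring `y` is all that is needed to try a different mutation operator within the same loop.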
Analysis of protrusion dynamics in amoeboid cell motility by means of regularized contour flows
(2021)
Amoeboid cell motility is essential for a wide range of biological processes including wound healing, embryonic morphogenesis, and cancer metastasis. It relies on complex dynamical patterns of cell shape changes that pose long-standing challenges to mathematical modeling and raise a need for automated and reproducible approaches to extract quantitative morphological features from image sequences. Here, we introduce a theoretical framework and a computational method for obtaining smooth representations of the spatiotemporal contour dynamics from stacks of segmented microscopy images. Based on a Gaussian process regression we propose a one-parameter family of regularized contour flows that allows us to continuously track reference points (virtual markers) between successive cell contours. We use this approach to define a coordinate system on the moving cell boundary and to represent different local geometric quantities in this frame of reference. In particular, we introduce the local marker dispersion as a measure to identify localized membrane expansions and provide a fully automated way to extract the properties of such expansions, including their area and growth time. The methods are available as an open-source software package called AmoePy, a Python-based toolbox for analyzing amoeboid cell motility (based on time-lapse microscopy data), including a graphical user interface and detailed documentation. Due to the mathematical rigor of our framework, we envision it to be of use for the development of novel cell motility models. We mainly use experimental data of the social amoeba Dictyostelium discoideum to illustrate and validate our approach.
Author summary:
Amoeboid motion is a crawling-like cell migration that plays a key role in multiple biological processes such as wound healing and cancer metastasis. This type of cell motility results from expanding and simultaneously contracting parts of the cell membrane.
From fluorescence images, we obtain a sequence of points, representing the cell membrane, for each time step. By using regression analysis on these sequences, we derive smooth representations, so-called contours, of the membrane. Since the number of measurements is discrete and often limited, the question is raised of how to link consecutive contours with each other. In this work, we present a novel mathematical framework in which these links are described by regularized flows allowing a certain degree of concentration or stretching of neighboring reference points on the same contour. This stretching rate, the so-called local dispersion, is used to identify expansions and contractions of the cell membrane providing a fully automated way of extracting properties of these cell shape changes. We applied our methods to time-lapse microscopy data of the social amoeba Dictyostelium discoideum.
That technologies such as machine learning applications or big data and smart data methods necessarily require data in sufficient quantity and quality now seems to be a truism. Against this background, the EU legislator in particular has recently discovered a new field of activity for itself, attempting in various ways to create incentives for data sharing in order to generate innovation. This includes a constellation labeled with the rather euphonious keyword "data altruism". The article presents the relevant regulatory considerations at the supranational level and offers an initial analysis.
The digitalization of our lives is increasingly dissolving the boundaries between private and professional life. A well-known example is working from home. But employers also encounter numerous other trends in this context. These include "workation", i.e. the combination of work and vacation, as well as "bleisure", i.e. the combination of business travel ("business") and vacation ("leisure"). The article examines the legal framework for these arrangements.
Proceedings of the HPI Research School on Service-oriented Systems Engineering 2020 Fall Retreat
(2021)
The design and implementation of service-oriented architectures raise a large number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for the design and realization of complex web-based systems. Both approaches allow for dynamic application adaptation as well as integration of enterprise applications.
Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns.
The annual Ph.D. Retreat of the Research School gives each member the opportunity to present the current state of their research and to outline a prospective Ph.D. thesis. Due to the interdisciplinary structure of the research school, this technical report covers a wide range of topics. These include but are not limited to: Human Computer Interaction and Computer Vision as Service; Service-oriented Geovisualization Systems; Algorithm Engineering for Service-oriented Systems; Modeling and Verification of Self-adaptive Service-oriented Systems; Tools and Methods for Software Engineering in Service-oriented Systems; Security Engineering of Service-based IT Systems; Service-oriented Information Systems; Evolutionary Transition of Enterprise Applications to Service Orientation; Operating System Abstractions for Service-oriented Computing; and Services Specification, Composition, and Enactment.
Intrinsic decomposition refers to the problem of estimating scene characteristics, such as albedo and shading, when one view or multiple views of a scene are provided. The inverse problem setting, where multiple unknowns are solved given a single known pixel value, is highly under-constrained. When correlated image and depth data are provided, intrinsic scene decomposition can be facilitated using depth-based priors; such depth data is nowadays easy to acquire with the depth sensors of high-end smartphones. In this work, we present a system for intrinsic decomposition of RGB-D images on smartphones and the algorithmic as well as design choices therein. Unlike state-of-the-art methods that assume only diffuse reflectance, we consider both diffuse and specular pixels. For this purpose, we present a novel specularity extraction algorithm based on a multi-scale intensity decomposition and chroma inpainting. The diffuse component is then further decomposed into albedo and shading components. We use an inertial proximal algorithm for non-convex optimization (iPiano) to ensure albedo sparsity. Our GPU-based visual processing is implemented on iOS via the Metal API and enables interactive performance on an iPhone 11 Pro. A qualitative evaluation shows that we are able to obtain high-quality outputs. Furthermore, our proposed approach for specularity removal outperforms state-of-the-art approaches for real-world images, while our albedo and shading layer decomposition is faster than the prior work at a comparable output quality. Manifold applications such as recoloring, retexturing, relighting, appearance editing, and stylization are shown, each using the intrinsic layers obtained with our method and/or the corresponding depth data.
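One common way to write the image formation model behind such a decomposition is shown below. The notation is an assumption introduced here for illustration and is not taken from the paper: I is the observed pixel value, A the albedo, S the shading, and R the specular component at pixel p.

```latex
% Assumed per-pixel image formation model (notation is illustrative):
% the diffuse part is the product of albedo and shading, and the
% specular part is added on top.
I(p) = \underbrace{A(p)\, S(p)}_{\text{diffuse}} + \underbrace{R(p)}_{\text{specular}}
```

With one known value I(p) and three unknowns per pixel, the problem is under-constrained, which is why additional priors such as depth are needed.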
Despite a successful vaccination campaign, a fourth wave of infections in the coronavirus pandemic threatens after the summer. Whether it materializes depends largely on how many people decide to get vaccinated against COVID-19. Vaccine is no longer in short supply; willingness to be vaccinated is. Many employers are therefore asking what they can do to increase the vaccination rate in their workplaces.
We study the concept of reversibility in connection with parallel communicating systems of finite automata (PCFA for short). We define the notion of reversibility in the case of PCFA (also covering the non-deterministic case) and discuss the relationship between the reversibility of a system and the reversibility of its components. We show that a system can be reversible with non-reversible components and, conversely, that the reversibility of the components does not necessarily imply the reversibility of the system as a whole. We also investigate the computational power of deterministic centralized reversible PCFA. We show that these very simple types of PCFA (returning or non-returning) can recognize regular languages which cannot be accepted by reversible (deterministic) finite automata, and that they can even accept languages that are not context-free. We also separate the deterministic and non-deterministic variants in the case of systems with non-returning communication. We show that there are languages accepted by non-deterministic centralized PCFA which cannot be recognized by any deterministic variant of the same type.
The reconstruction of cone-beam computed tomography data using filtered back-projection algorithms unavoidably results in severe artefacts. We describe how the Direct Iterative Reconstruction of Computed Tomography Trajectories (DIRECTT) algorithm can be combined with a model of the artefacts for the reconstruction of such data. The implementation of DIRECTT results in reconstructed volumes of superior quality compared to the conventional algorithms.
We introduce a logic-based incremental approach to graph repair, generating a sound and complete (upon termination) overview of least-changing graph repairs from which a user may select a graph repair based on non-formalized further requirements. This incremental approach features delta preservation, as it allows restricting the generation of graph repairs to delta-preserving graph repairs, which do not revert the additions and deletions of the most recent consistency-violating graph update. We specify consistency of graphs using the logic of nested graph conditions, which is equivalent to first-order logic on graphs. Technically, the incremental approach encodes if and how the graph under repair satisfies a graph condition using the novel data structure of satisfaction trees, which are adapted incrementally according to the graph updates applied. In addition to the incremental approach, we also present two state-based graph repair algorithms, which restore consistency of a graph independently of the most recent graph update and which generate additional graph repairs using a global perspective on the graph under repair. We evaluate the developed algorithms using our prototypical implementation in the tool AutoGraph and illustrate our incremental approach using a case study from the graph database domain.
Empirical investigations on the uncanny valley have almost solely focused on the analysis of people's noninteractive perception of a robot at first sight. Recent studies suggest, however, that these uncanny first impressions may be significantly altered over an interaction. What is yet to be discovered is whether certain interaction patterns can lead to a faster decline in uncanny feelings. In this paper, we present a study in which participants with limited expertise in Computer Science played a collaborative geography game with a Furhat robot. During the game, Furhat displayed one of two personalities, which corresponded to two different interaction strategies. The robot was either optimistic and encouraging, or impatient and provocative. We performed the study in a science museum and recruited participants among the visitors. Our findings suggest that a robot that is rated high on agreeableness, emotional stability, and conscientiousness can indeed weaken uncanny feelings. This study has important implications for human-robot interaction design as it further highlights that a first impression, merely based on a robot's appearance, is not indicative of the affinity people might develop towards it throughout an interaction. We thus argue that future work should emphasize investigations on exact interaction patterns that can help to overcome uncanny feelings.
Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients' anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.
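The idea of sharing models instead of raw data can be sketched with federated averaging on a toy problem. This is an illustrative example, not a method from the review: the one-parameter linear model, the function names, and the learning-rate and epoch values are all assumptions.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Gradient descent for the model y = w * x on one site's private data.

    data is a list of (x, y) pairs that never leaves the site; only the
    updated weight is returned.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(site_weights, site_sizes):
    """Average the site models, weighted by the number of local samples."""
    total = sum(site_sizes)
    return sum(w * n for w, n in zip(site_weights, site_sizes)) / total


def train(sites, rounds=50):
    """Alternate local training and server-side averaging for several rounds."""
    w = 0.0
    for _ in range(rounds):
        local = [local_update(w, data) for data in sites]
        w = federated_average(local, [len(data) for data in sites])
    return w
```

Only the scalar weights cross site boundaries in `train`; real deployments typically add protections such as secure aggregation, which this sketch omits.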
Cyber warfare is a timely and relevant issue and one of the most controversial in international humanitarian law (IHL). The aim of IHL is to set rules and limits in terms of means and methods of warfare. In this context, a key question arises: Does digital warfare have rules or limits, and if so, how are these applicable? Traditional principles, developed over a long period, are facing a new dimension of challenges due to the rise of cyber warfare. This paper argues that to overcome this new issue, it is critical that new humanity-oriented approaches are developed with regard to cyber warfare. The challenge is to establish a legal regime for cyber-attacks, successfully addressing human rights norms and standards. While clarifying this from a legal perspective, the authors can redesign the sensitive equilibrium between humanity and military necessity, weighing the humanitarian aims of IHL and the protection of civilians, in combination with international human rights law and other relevant legal regimes, in a different manner than before.
Argument mining on Twitter
(2021)
In the last decade, the field of argument mining has grown notably. However, only relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide what is, to our knowledge, the first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the structure of arguments in the context of tweet corpus annotation, and we review current progress in the task of detecting argument components and their relations in tweets. We also survey the intersection of argument mining and stance detection, before we conclude with an outlook.