004 Data processing; computer science
The digitalization of our lives is increasingly dissolving the boundaries between private and professional life. A well-known example is working from home. Employers, however, are also encountering numerous other trends in this context. These include "workation", i.e. the combination of work and vacation, as well as "bleisure", i.e. the combination of business travel ("business") and vacation ("leisure"). This article examines the legal framework for these arrangements.
Proceedings of the HPI Research School on Service-oriented Systems Engineering 2020 Fall Retreat
(2021)
The design and implementation of service-oriented architectures raise a large number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for the design and realization of complex web-based systems. Both approaches allow for dynamic application adaptation as well as integration of enterprise applications.
Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns.
The annual Ph.D. Retreat of the Research School provides each member the opportunity to present the current state of their research and to give an outline of a prospective Ph.D. thesis. Due to the interdisciplinary structure of the research school, this technical report covers a wide range of topics. These include but are not limited to: Human Computer Interaction and Computer Vision as Service; Service-oriented Geovisualization Systems; Algorithm Engineering for Service-oriented Systems; Modeling and Verification of Self-adaptive Service-oriented Systems; Tools and Methods for Software Engineering in Service-oriented Systems; Security Engineering of Service-based IT Systems; Service-oriented Information Systems; Evolutionary Transition of Enterprise Applications to Service Orientation; Operating System Abstractions for Service-oriented Computing; and Services Specification, Composition, and Enactment.
Intrinsic decomposition refers to the problem of estimating scene characteristics, such as albedo and shading, when one view or multiple views of a scene are provided. The inverse problem setting, where multiple unknowns are solved given a single known pixel value, is highly under-constrained. When provided with correlating image and depth data, which nowadays are easy to acquire with the depth sensors of high-end smartphones, intrinsic scene decomposition can be facilitated using depth-based priors. In this work, we present a system for intrinsic decomposition of RGB-D images on smartphones and the algorithmic as well as design choices therein. Unlike state-of-the-art methods that assume only diffuse reflectance, we consider both diffuse and specular pixels. For this purpose, we present a novel specularity extraction algorithm based on a multi-scale intensity decomposition and chroma inpainting. The diffuse component is then further decomposed into albedo and shading components. We use an inertial proximal algorithm for non-convex optimization (iPiano) to ensure albedo sparsity. Our GPU-based visual processing is implemented on iOS via the Metal API and enables interactive performance on an iPhone 11 Pro. A qualitative evaluation shows that we are able to obtain high-quality outputs. Furthermore, our proposed approach for specularity removal outperforms state-of-the-art approaches for real-world images, while our albedo and shading layer decomposition is faster than prior work at a comparable output quality. Manifold applications such as recoloring, retexturing, relighting, appearance editing, and stylization are shown, each using the intrinsic layers obtained with our method and/or the corresponding depth data.
Despite a successful vaccination campaign, a fourth wave of infections in the coronavirus pandemic looms after the summer. Whether it materializes depends largely on how many people decide to get vaccinated against COVID-19. Vaccine is no longer in short supply; what is lacking is the willingness to be vaccinated. Many employers are therefore asking what they can do to increase the vaccination rate in their workplaces.
We study the concept of reversibility in connection with parallel communicating systems of finite automata (PCFA for short). We define the notion of reversibility in the case of PCFA (also covering the non-deterministic case) and discuss the relationship between the reversibility of a system and the reversibility of its components. We show that a system can be reversible with non-reversible components and, conversely, that the reversibility of the components does not necessarily imply the reversibility of the system as a whole. We also investigate the computational power of deterministic centralized reversible PCFA. We show that these very simple types of PCFA (returning or non-returning) can recognize regular languages which cannot be accepted by reversible (deterministic) finite automata, and that they can even accept languages that are not context-free. We also separate the deterministic and non-deterministic variants in the case of systems with non-returning communication. We show that there are languages accepted by non-deterministic centralized PCFA which cannot be recognized by any deterministic variant of the same type.
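As a rough illustration of what reversibility means for a single deterministic finite automaton, the sketch below checks one common textbook condition: for every input symbol, no two distinct states lead to the same target state, so each step can be undone. This is an assumption made for illustration only; the abstract does not spell out the definition used for PCFA, and the names `is_reversible` and `delta` are hypothetical.

```python
from typing import Dict, Hashable, Tuple

State = Hashable
# Transition function of a (possibly partial) DFA: (state, symbol) -> next state.
Transitions = Dict[Tuple[State, str], State]


def is_reversible(delta: Transitions) -> bool:
    """Return True if every symbol induces an injective map on states.

    Injectivity per symbol means each transition has a unique predecessor,
    so a computation can be traced backwards deterministically.
    """
    predecessor: Dict[Tuple[State, str], State] = {}
    for (src, sym), dst in delta.items():
        key = (dst, sym)
        if key in predecessor and predecessor[key] != src:
            # Two different states reach dst on the same symbol.
            return False
        predecessor[key] = src
    return True


# A two-state cycle on 'a': each state has exactly one 'a'-predecessor.
cycle = {("q0", "a"): "q1", ("q1", "a"): "q0"}

# A DFA for a*b*: both q0 and q1 move to q1 on 'b', so 'b' cannot be undone.
a_star_b_star = {("q0", "a"): "q0", ("q0", "b"): "q1", ("q1", "b"): "q1"}

print(is_reversible(cycle))
print(is_reversible(a_star_b_star))
```

The check only inspects the transition table of one automaton; it says nothing about a system of communicating components, which is where the abstract reports that system-level and component-level reversibility can diverge.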
The reconstruction of cone-beam computed tomography data using filtered back-projection algorithms unavoidably results in severe artefacts. We describe how the Direct Iterative Reconstruction of Computed Tomography Trajectories (DIRECTT) algorithm can be combined with a model of the artefacts for the reconstruction of such data. The implementation of DIRECTT results in reconstructed volumes of superior quality compared to the conventional algorithms.
We introduce a logic-based incremental approach to graph repair, generating a sound and complete (upon termination) overview of least-changing graph repairs from which a user may select a graph repair based on non-formalized further requirements. This incremental approach features delta preservation, as it allows restricting the generation of graph repairs to delta-preserving graph repairs, which do not revert the additions and deletions of the most recent consistency-violating graph update. We specify consistency of graphs using the logic of nested graph conditions, which is equivalent to first-order logic on graphs. Technically, the incremental approach encodes if and how the graph under repair satisfies a graph condition using the novel data structure of satisfaction trees, which are adapted incrementally according to the graph updates applied. In addition to the incremental approach, we also present two state-based graph repair algorithms, which restore consistency of a graph independent of the most recent graph update and which generate additional graph repairs using a global perspective on the graph under repair. We evaluate the developed algorithms using our prototypical implementation in the tool AutoGraph and illustrate our incremental approach using a case study from the graph database domain.
Empirical investigations on the uncanny valley have almost solely focused on the analysis of people's noninteractive perception of a robot at first sight. Recent studies suggest, however, that these uncanny first impressions may be significantly altered over an interaction. What is yet to be discovered is whether certain interaction patterns can lead to a faster decline in uncanny feelings. In this paper, we present a study in which participants with limited expertise in Computer Science played a collaborative geography game with a Furhat robot. During the game, Furhat displayed one of two personalities, which corresponded to two different interaction strategies. The robot was either optimistic and encouraging, or impatient and provocative. We performed the study in a science museum and recruited participants among the visitors. Our findings suggest that a robot that is rated high on agreeableness, emotional stability, and conscientiousness can indeed weaken uncanny feelings. This study has important implications for human-robot interaction design as it further highlights that a first impression, merely based on a robot's appearance, is not indicative of the affinity people might develop towards it throughout an interaction. We thus argue that future work should emphasize investigations on exact interaction patterns that can help to overcome uncanny feelings.
Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients' anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.
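The idea of sharing models instead of raw data can be sketched with federated averaging, one widely used aggregation scheme. This is a minimal sketch under stated assumptions, not the method of any particular study in the review: the model is a toy linear fit `y ≈ w*x + b`, and the function names `local_update`, `federated_average` and `train`, as well as the learning rate and round counts, are hypothetical choices for illustration.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # one (x, y) observation held privately by a site


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 20) -> List[float]:
    """Train y ≈ w*x + b on one site's data with plain gradient descent.

    Only the resulting weights leave the site, never the samples.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sizes: Sequence[int]) -> List[float]:
    """Average the site models, weighting each site by its sample count."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def train(sites: Sequence[Sequence[Sample]], rounds: int = 50) -> List[float]:
    """Alternate local training and server-side averaging for several rounds."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


# Two hypothetical sites whose private samples follow y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0)]
site_b = [(2.0, 5.0), (3.0, 7.0), (-1.0, -1.0)]
w, b = train([site_a, site_b])
print(round(w, 3), round(b, 3))
```

In this sketch the server only ever sees the two numbers each site returns. Real deployments typically add safeguards such as secure aggregation or differential privacy, because model updates themselves can leak information.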
Cyber warfare is a timely and relevant issue and one of the most controversial in international humanitarian law (IHL). The aim of IHL is to set rules and limits in terms of means and methods of warfare. In this context, a key question arises: Does digital warfare have rules or limits, and if so, how are these applicable? Traditional principles, developed over a long period, are facing a new dimension of challenges due to the rise of cyber warfare. This paper argues that to overcome this new issue, it is critical that new humanity-oriented approaches are developed with regard to cyber warfare. The challenge is to establish a legal regime for cyber-attacks, successfully addressing human rights norms and standards. While clarifying this from a legal perspective, the authors can redesign the sensitive equilibrium between humanity and military necessity, weighing the humanitarian aims of IHL and the protection of civilians, in combination with international human rights law and other relevant legal regimes, in a different manner than before.
Argument mining on twitter
(2021)
In the last decade, the field of argument mining has grown notably. However, only relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide the, to our knowledge, first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the structure of arguments in the context of tweet corpus annotation, and we review current progress in the task of detecting argument components and their relations in tweets. We also survey the intersection of argument mining and stance detection, before we conclude with an outlook.
We systematically explore the effect of calibration data length on the performance of a conceptual hydrological model, GR4H, in comparison to two Artificial Neural Network (ANN) architectures: Long Short-Term Memory Networks (LSTM) and Gated Recurrent Units (GRU), which have only recently been introduced to the field of hydrology. We implemented a case study for six river basins across the contiguous United States, with 25 years of meteorological and discharge data. Nine years were reserved for independent validation; two years were used as a warm-up period, one year for each of the calibration and validation periods, respectively; from the remaining 14 years, we sampled increasing amounts of data for model calibration, and found pronounced differences in model performance. While GR4H required less data to converge, LSTM and GRU caught up at a remarkable rate, considering their number of parameters. Also, LSTM and GRU exhibited higher calibration instability in comparison to GR4H. These findings confirm the potential of modern deep-learning architectures in rainfall-runoff modelling, but also highlight the noticeable differences between them in regard to the effect of calibration data length.
ATIB
(2021)
Identity management is a principal component of securing online services. In the advancement of traditional identity management patterns, the identity provider remained a Trusted Third Party (TTP). The service provider and the user need to trust a particular identity provider for correct attributes amongst other demands. This paradigm changed with the invention of blockchain-based Self-Sovereign Identity (SSI) solutions that primarily focus on the users. SSI reduces the functional scope of the identity provider to an attribute provider while enabling attribute aggregation. Besides that, the development of new protocols, the disregard of established protocols, and a significantly fragmented landscape of SSI solutions pose considerable challenges for adoption by service providers. We propose an Attribute Trust-enhancing Identity Broker (ATIB) to leverage the potential of SSI for trust-enhancing attribute aggregation. Furthermore, ATIB abstracts from a dedicated SSI solution and offers standard protocols. Therefore, it facilitates the adoption by service providers. Despite the brokered integration approach, we show that ATIB provides a high security posture. Additionally, ATIB does not compromise the ten foundational SSI principles for the users.
Cyber-physical systems often encompass complex concurrent behavior with timing constraints and probabilistic failures on demand. Analyzing whether such systems with probabilistic timed behavior adhere to a given specification is essential. When the states of the system can be represented by graphs, the rule-based formalism of Probabilistic Timed Graph Transformation Systems (PTGTSs) can be used to suitably capture structure dynamics as well as probabilistic and timed behavior of the system. Model checking support for PTGTSs w.r.t. properties specified using Probabilistic Timed Computation Tree Logic (PTCTL) has already been presented. Moreover, for timed graph-based runtime monitoring, Metric Temporal Graph Logic (MTGL) has been developed for stating metric temporal properties on identified subgraphs and their structural changes over time. In this paper, we (a) extend MTGL to the Probabilistic Metric Temporal Graph Logic (PMTGL) by allowing for the specification of probabilistic properties, (b) adapt our MTGL satisfaction checking approach to PTGTSs, and (c) combine the approaches for PTCTL model checking and MTGL satisfaction checking to obtain a Bounded Model Checking (BMC) approach for PMTGL. In our evaluation, we apply an implementation of our BMC approach in AutoGraph to a running example.
Gene expression data provide the expression levels of tens of thousands of genes from several hundred samples. These data are analyzed to detect biomarkers that can be of prognostic or diagnostic use. Traditionally, biomarker detection for gene expression data is the task of gene selection. The vast number of genes is reduced to a few relevant ones that achieve the best performance for the respective use case. Traditional approaches select genes based on their statistical significance in the data set. This results in issues of robustness, redundancy and true biological relevance of the selected genes. Integrative analyses typically address these shortcomings by integrating multiple data artifacts from the same objects, e.g. gene expression and methylation data. When only gene expression data are available, integrative analyses instead use curated information on biological processes from public knowledge bases. With knowledge bases providing an ever-increasing amount of curated biological knowledge, such prior knowledge approaches become more powerful. This paper provides a thorough overview on the status quo of biomarker detection on gene expression data with prior biological knowledge. We discuss current shortcomings of traditional approaches, review recent external knowledge bases, provide a classification and qualitative comparison of existing prior knowledge approaches and discuss open challenges for this kind of gene selection.
Many important graph-theoretic notions can be encoded as counting graph homomorphism problems, such as partition functions in statistical physics, in particular independent sets and colourings. In this article, we study the complexity of #pHOMSTOH, the problem of counting graph homomorphisms from an input graph to a graph H modulo a prime number p. Dyer and Greenhill proved a dichotomy stating that the tractability of non-modular counting graph homomorphisms depends on the structure of the target graph. Many intractable cases in non-modular counting become tractable in modular counting due to the common phenomenon of cancellation. In subsequent studies on counting modulo 2, however, the influence of the structure of H on the tractability was shown to persist, which yields similar dichotomies. Our main result states that for every tree H and every prime p the problem #pHOMSTOH is either polynomial time computable or #P-p-complete. This relates to the conjecture of Faben and Jerrum stating that this dichotomy holds for every graph H when counting modulo 2. In contrast to previous results on modular counting, the tractable cases of #pHOMSTOH are essentially the same for all values of the modulus when H is a tree. To prove this result, we study the structural properties of a homomorphism. As an important interim result, our study yields a dichotomy for the problem of counting weighted independent sets in a bipartite graph modulo some prime p. These results are the first suggesting that such dichotomies hold not only for the modulo 2 case but also for the modular counting functions of all primes p.
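To make the counting problem concrete, the following brute-force sketch enumerates all vertex maps from an input graph G to a target graph H, keeps those that send every edge of G to an edge of H, and reduces the count modulo p. It is exponential in the size of G and is meant only to illustrate the problem statement, not the techniques behind the dichotomy; the function name `count_homs_mod_p` and the edge-list representation are assumptions made for this example.

```python
from itertools import product
from typing import Hashable, Iterable, Sequence, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def count_homs_mod_p(g_vertices: Sequence[Vertex], g_edges: Iterable[Edge],
                     h_vertices: Sequence[Vertex], h_edges: Iterable[Edge],
                     p: int) -> int:
    """Count homomorphisms from undirected G to undirected H, modulo p."""
    # Store both orientations so H is treated as undirected; a pair (v, v) is a loop.
    h_adj = set()
    for u, v in h_edges:
        h_adj.add((u, v))
        h_adj.add((v, u))

    g_edges = list(g_edges)
    count = 0
    # Try every assignment of G's vertices to H's vertices.
    for images in product(h_vertices, repeat=len(g_vertices)):
        mapping = dict(zip(g_vertices, images))
        if all((mapping[u], mapping[v]) in h_adj for u, v in g_edges):
            count += 1
    return count % p


# G: a single edge. H: the path 0 - 1 - 2, which is a tree.
edge_graph = (["a", "b"], [("a", "b")])
path_tree = ([0, 1, 2], [(0, 1), (1, 2)])

# The edge maps onto either tree edge in either direction: 4 homomorphisms.
print(count_homs_mod_p(*edge_graph, *path_tree, p=3))  # 4 mod 3 = 1
print(count_homs_mod_p(*edge_graph, *path_tree, p=2))  # 4 mod 2 = 0
```

The second call shows cancellation in a small case: the plain count is 4, but modulo 2 it vanishes. Choosing H with vertices 0 and 1, a loop at 0 and an edge between 0 and 1 turns the same function into a counter of independent sets of G, matching the encoding mentioned in the abstract.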
Spreadsheets are among the most commonly used file formats for data management, distribution, and analysis. Their widespread use makes it easy to gather large collections of data, but their flexible canvas-based structure makes automated analysis difficult without heavy preparation. One of the common problems that practitioners face is the presence of multiple, independent regions in a single spreadsheet, possibly separated by repeated empty cells. We define such files as "multiregion" files. In collections of various spreadsheets, we can observe that some share the same layout. We present the Mondrian approach to automatically identify layout templates across multiple files and systematically extract the corresponding regions. Our approach is composed of three phases: first, each file is rendered as an image and inspected for elements that could form regions; then, using a clustering algorithm, the identified elements are grouped to form regions; finally, every file layout is represented as a graph and compared with others to find layout templates. We compare our method to state-of-the-art table recognition algorithms on two corpora of real-world enterprise spreadsheets. Our approach shows the best performance in detecting reliable region boundaries within each file and can correctly identify recurring layouts across files.
The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies as fast as possible on potentially very large time series. In other words, the detection needs to be effective, efficient and scalable w.r.t. the input size. Series2Graph (S2G) is an effective solution based on graph embeddings that is robust against re-occurring anomalies, can discover sequential anomalies of arbitrary length, and works without training data. Yet, Series2Graph is not scalable due to its single-threaded approach; in particular, it cannot process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, DADS for short, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, intermediate state and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than S2G, scales almost linearly with the number of processors in the cluster and can process much larger input sequences due to its scale-out property.
The integration of multiple data sources is a common problem in a large variety of applications. Traditionally, handcrafted similarity measures are used to discover, merge, and integrate multiple representations of the same entity (duplicates) into a large homogeneous collection of data. Often, these similarity measures do not cope well with the heterogeneity of the underlying dataset. In addition, domain experts are needed to manually design and configure such measures, which is both time-consuming and requires extensive domain expertise. We propose a deep Siamese neural network, capable of learning a similarity measure that is tailored to the characteristics of a particular dataset. With the properties of deep learning methods, we are able to eliminate the manual feature engineering process and thus considerably reduce the effort required for model construction. In addition, we show that it is possible to transfer knowledge acquired during the deduplication of one dataset to another, and thus significantly reduce the amount of data required to train a similarity measure. We evaluated our method on multiple datasets and compared our approach to state-of-the-art deduplication methods. Our approach outperforms competitors by up to +26 percent F-measure, depending on task and dataset. In addition, we show that knowledge transfer is not only feasible, but in our experiments led to an improvement in F-measure of up to +4.7 percent.