publish.UP Search

A distributed data exchange engine for polystores (2020)

Kaitoua, Abdulrahman ; Rabl, Tilmann ; Markl, Volker

There is an increasing interest in fusing data from heterogeneous sources. Combining data sources increases the utility of existing datasets, generating new information and creating services of higher quality. A central issue in working with heterogeneous sources is data migration: In order to share and process data in different engines, resource intensive and complex movements and transformations between computing engines, services, and stores are necessary. Muses is a distributed, high-performance data migration engine that is able to interconnect distributed data stores by forwarding, transforming, repartitioning, or broadcasting data among distributed engines' instances in a resource-, cost-, and performance-adaptive manner. As such, it performs seamless information sharing across all participating resources in a standard, modular manner. We show an overall improvement of 30 % for pipelining jobs across multiple engines, even when we count the overhead of Muses in the execution time. This performance gain implies that Muses can be used to optimise large pipelines that leverage multiple engines.

A flexible tool to correct superimposed mass isotopologue distributions in GC-APCI-MS flux experiments (2022)

Langenhan, Jennifer ; Jaeger, Carsten ; Baum, Katharina ; Simon, Mareike ; Lisec, Jan

The investigation of metabolic fluxes and metabolite distributions within cells by means of tracer molecules is a valuable tool to unravel the complexity of biological systems. Technological advances in mass spectrometry (MS) technology such as atmospheric pressure chemical ionization (APCI) coupled with high resolution (HR), not only allows for highly sensitive analyses but also broadens the usefulness of tracer-based experiments, as interesting signals can be annotated de novo when not yet present in a compound library. However, several effects in the APCI ion source, i.e., fragmentation and rearrangement, lead to superimposed mass isotopologue distributions (MID) within the mass spectra, which need to be corrected during data evaluation as they will impair enrichment calculation otherwise. Here, we present and evaluate a novel software tool to automatically perform such corrections. We discuss the different effects, explain the implemented algorithm, and show its application on several experimental datasets. This adjustable tool is available as an R package from CRAN.

A framework for improved video text detection and recognition (2014)

Yang, Haojin ; Quehl, Bernhard ; Sack, Harald

Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.

A logic-based incremental approach to graph repair featuring delta preservation (2021)

Schneider, Sven ; Lambers, Leen ; Orejas, Fernando

We introduce a logic-based incremental approach to graph repair, generating a sound and complete (upon termination) overview of least-changing graph repairs from which a user may select a graph repair based on non-formalized further requirements. This incremental approach features delta preservation as it allows to restrict the generation of graph repairs to delta-preserving graph repairs, which do not revert the additions and deletions of the most recent consistency-violating graph update. We specify consistency of graphs using the logic of nested graph conditions, which is equivalent to first-order logic on graphs. Technically, the incremental approach encodes if and how the graph under repair satisfies a graph condition using the novel data structure of satisfaction trees, which are adapted incrementally according to the graph updates applied. In addition to the incremental approach, we also present two state-based graph repair algorithms, which restore consistency of a graph independent of the most recent graph update and which generate additional graph repairs using a global perspective on the graph under repair. We evaluate the developed algorithms using our prototypical implementation in the tool AutoGraph and illustrate our incremental approach using a case study from the graph database domain.

A mixed transaction processing and operational reporting benchmark (2011)

Bog, Anja ; Plattner, Hasso ; Zeier, Alexander

The importance of reporting is ever increasing in today's fast-paced market environments and the availability of up-to-date information for reporting has become indispensable. Current reporting systems are separated from the online transaction processing systems (OLTP) with periodic updates pushed in. A pre-defined and aggregated subset of the OLTP data, however, does not provide the flexibility, detail, and timeliness needed for today's operational reporting. As technology advances, this separation has to be re-evaluated and means to study and evaluate new trends in data storage management have to be provided. This article proposes a benchmark for combined OLTP and operational reporting, providing means to evaluate the performance of enterprise data management systems for mixed workloads of OLTP and operational reporting queries. Such systems offer up-to-date information and the flexibility of the entire data set for reporting. We describe how the benchmark provokes the conflicts that are the reason for separating the two workloads on different systems. In this article, we introduce the concepts, logical data schema, transactions and queries of the benchmark, which are entirely based on the original data sets and real workloads of existing, globally operating enterprises.

A quantifiable trustmModel for Blockchain-based identity management (2019)

Grüner, Andreas ; Mühle, Alexander ; Gayvoronskaya, Tatiana ; Meinel, Christoph

A service-oriented approach for classifying 3D points clouds by example of office furniture classification (2018)

Stojanovic, Vladeta ; Trapp, Matthias ; Richter, Rico ; Döllner, Jürgen Roland Friedrich

The rapid digitalization of the Facility Management (FM) sector has increased the demand for mobile, interactive analytics approaches concerning the operational state of a building. These approaches provide the key to increasing stakeholder engagement associated with Operation and Maintenance (O&M) procedures of living and working areas, buildings, and other built environment spaces. We present a generic and fast approach to process and analyze given 3D point clouds of typical indoor office spaces to create corresponding up-to-date approximations of classified segments and object-based 3D models that can be used to analyze, record and highlight changes of spatial configurations. The approach is based on machine-learning methods used to classify the scanned 3D point cloud data using 2D images. This approach can be used to primarily track changes of objects over time for comparison, allowing for routine classification, and presentation of results used for decision making. We specifically focus on classification, segmentation, and reconstruction of multiple different object types in a 3D point-cloud scene. We present our current research and describe the implementation of these technologies as a web-based application using a services-oriented methodology.

A simple distributed mechanism for accounting system self-configuration in next-generation charging and billing (2011)

Kühne, Ralph ; Huitema, George ; Carle, Georg

Modern communication systems are becoming increasingly dynamic and complex. In this article a novel mechanism for next generation charging and billing is presented that enables self-configurability for accounting systems consisting of heterogeneous components. The mechanism is required to be simple, effective, efficient, scalable and fault-tolerant. Based on simulation results it is shown that the proposed simple distributed mechanism is competitive with usual cost-based or random mechanisms under realistic assumptions and up to non-extreme workload situations as well as fulfilling the posed requirements.

A simplified run time analysis of the univariate marginal distribution algorithm on LeadingOnes (2021)

Doerr, Benjamin ; Krejca, Martin Stefan

With elementary means, we prove a stronger run time guarantee for the univariate marginal distribution algorithm (UMDA) optimizing the LEADINGONES benchmark function in the desirable regime with low genetic drift. If the population size is at least quasilinear, then, with high probability, the UMDA samples the optimum in a number of iterations that is linear in the problem size divided by the logarithm of the UMDA's selection rate. This improves over the previous guarantee, obtained by Dang and Lehre (2015) via the deep level-based population method, both in terms of the run time and by demonstrating further run time gains from small selection rates. Under similar assumptions, we prove a lower bound that matches our upper bound up to constant factors.

A software reference architecture for service-oriented 3D geovisualization systems (2014)

Hildebrandt, Dieter

A survey and comparison of transformation tools based on the transformation tool contest (2014)

Buchwald, Sebastian ; Wagelaar, Dennis ; Dan, Li ; Hegedues, Abel ; Herrmannsdoerfer, Markus ; Horn, Tassilo ; Kalnina, Elina ; Krause, Christian ; Lano, Kevin ; Lepper, Markus ; Rensink, Arend ; Rose, Louis ; Waetzoldt, Sebastian ; Mazanek, Steffen

Model transformation is one of the key tasks in model-driven engineering and relies on the efficient matching and modification of graph-based data structures; its sibling graph rewriting has been used to successfully model problems in a variety of domains. Over the last years, a wide range of graph and model transformation tools have been developed all of them with their own particular strengths and typical application domains. In this paper, we give a survey and a comparison of the model and graph transformation tools that participated at the Transformation Tool Contest 2011. The reader gains an overview of the field and its tools, based on the illustrative solutions submitted to a Hello World task, and a comparison alongside a detailed taxonomy. The article is of interest to researchers in the field of model and graph transformation, as well as to software engineers with a transformation task at hand who have to choose a tool fitting to their needs. All solutions referenced in this article provide a SHARE demo. It supported the peer-review process for the contest, and now allows the reader to test the tools online.

A systematic review of diagnostic accuracy and clinical applications of wearable movement sensors for knee joint rehabilitation (2021)

Prill, Robert ; Walter, Marina ; Królikowska, Aleksandra ; Becker, Roland

In clinical practice, only a few reliable measurement instruments are available for monitoring knee joint rehabilitation. Advances to replace motion capturing with sensor data measurement have been made in the last years. Thus, a systematic review of the literature was performed, focusing on the implementation, diagnostic accuracy, and facilitators and barriers of integrating wearable sensor technology in clinical practices based on a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. For critical appraisal, the COSMIN Risk of Bias tool for reliability and measurement of error was used. PUBMED, Prospero, Cochrane database, and EMBASE were searched for eligible studies. Six studies reporting reliability aspects in using wearable sensor technology at any point after knee surgery in humans were included. All studies reported excellent results with high reliability coefficients, high limits of agreement, or a few detectable errors. They used different or partly inappropriate methods for estimating reliability or missed reporting essential information. Therefore, a moderate risk of bias must be considered. Further quality criterion studies in clinical settings are needed to synthesize the evidence for providing transparent recommendations for the clinical use of wearable movement sensors in knee joint rehabilitation.

A variant of genetic algorithm for non-homogeneous population (2017)

Alibabaie, Najmeh ; Ghasemzadeh, Mohammad ; Meinel, Christoph

Selection of initial points, the number of clusters and finding proper clusters centers are still the main challenge in clustering processes. In this paper, we suggest genetic algorithm based method which searches several solution spaces simultaneously. The solution spaces are population groups consisting of elements with similar structure. Elements in a group have the same size, while elements in different groups are of different sizes. The proposed algorithm processes the population in groups of chromosomes with one gene, two genes to k genes. These genes hold corresponding information about the cluster centers. In the proposed method, the crossover and mutation operators can accept parents with different sizes; this can lead to versatility in population and information transfer among sub-populations. We implemented the proposed method and evaluated its performance against some random datasets and the Ruspini dataset as well. The experimental results show that the proposed method could effectively determine the appropriate number of clusters and recognize their centers. Overall this research implies that using heterogeneous population in the genetic algorithm can lead to better results.

Abstract representations for interactive visualization of virtual 3D city models (2009)

Glander, Tassilo ; Döllner, Jürgen Roland Friedrich

Virtual 3D city models increasingly cover whole city areas; hence, the perception of complex urban structures becomes increasingly difficult. Using abstract visualization, complexity of these models can be hidden where its visibility is unnecessary, while important features are maintained and highlighted for better comprehension and communication. We present a technique to automatically generalize a given virtual 3D city model consisting of building models, an infrastructure network and optional land coverage data; this technique creates several representations of increasing levels of abstraction. Using the infrastructure network, our technique groups building models and replaces them with cell blocks, while preserving local landmarks. By computing a landmark hierarchy, we reduce the set of initial landmarks in a spatially balanced manner for use in higher levels of abstraction. In four application examples, we demonstrate smooth visualization of transitions between precomputed representations; dynamic landmark highlighting according to virtual camera distance; an implementation of a cognitively enhanced route representation, and generalization lenses to combine precomputed representations in focus + context visualization.

Acute kidney injury in patients hospitalized with COVID-19 in New York City (2021)

Dellepiane, Sergio ; Vaid, Akhil ; Jaladanki, Suraj K. ; Coca, Steven ; Fayad, Zahi A. ; Charney, Alexander W. ; Böttinger, Erwin ; He, John Cijiang ; Glicksberg, Benjamin S. ; Chan, Lili ; Nadkarni, Girish

Ad hoc learning of peptide fragmentation from mass spectra enables an interpretable detection of phosphorylated and cross-linked peptides (2022)

Altenburg, Tom ; Giese, Sven Hans-Joachim ; Wang, Shengbo ; Muth, Thilo ; Renard, Bernhard Y.

Fragmentation of peptides leaves characteristic patterns in mass spectrometry data, which can be used to identify protein sequences, but this method is challenging for mutated or modified sequences for which limited information exist. Altenburg et al. use an ad hoc learning approach to learn relevant patterns directly from unannotated fragmentation spectra. Mass spectrometry-based proteomics provides a holistic snapshot of the entire protein set of living cells on a molecular level. Currently, only a few deep learning approaches exist that involve peptide fragmentation spectra, which represent partial sequence information of proteins. Commonly, these approaches lack the ability to characterize less studied or even unknown patterns in spectra because of their use of explicit domain knowledge. Here, to elevate unrestricted learning from spectra, we introduce 'ad hoc learning of fragmentation' (AHLF), a deep learning model that is end-to-end trained on 19.2 million spectra from several phosphoproteomic datasets. AHLF is interpretable, and we show that peak-level feature importance values and pairwise interactions between peaks are in line with corresponding peptide fragments. We demonstrate our approach by detecting post-translational modifications, specifically protein phosphorylation based on only the fragmentation spectrum without a database search. AHLF increases the area under the receiver operating characteristic curve (AUC) by an average of 9.4% on recent phosphoproteomic data compared with the current state of the art on this task. Furthermore, use of AHLF in rescoring search results increases the number of phosphopeptide identifications by a margin of up to 15.1% at a constant false discovery rate. To show the broad applicability of AHLF, we use transfer learning to also detect cross-linked peptides, as used in protein structure analysis, with an AUC of up to 94%.

Adding Value by Combining Business and Sensor Data (2019)

Hesse, Günter ; Matthies, Christoph ; Sinzig, Werner ; Uflacker, Matthias

Industry 4.0 and the Internet of Things are recent developments that have lead to the creation of new kinds of manufacturing data. Linking this new kind of sensor data to traditional business information is crucial for enterprises to take advantage of the data’s full potential. In this paper, we present a demo which allows experiencing this data integration, both vertically between technical and business contexts and horizontally along the value chain. The tool simulates a manufacturing company, continuously producing both business and sensor data, and supports issuing ad-hoc queries that answer specific questions related to the business. In order to adapt to different environments, users can configure sensor characteristics to their needs.

Affect-aware word clouds (2020)

Kulahcioglu, Tugba ; Melo, Gerard de

Word clouds are widely used for non-analytic purposes, such as introducing a topic to students, or creating a gift with personally meaningful text. Surveys show that users prefer tools that yield word clouds with a stronger emotional impact. Fonts and color palettes are powerful typographical signals that may determine this impact. Typically, these signals are assigned randomly, or expected to be chosen by the users. We present an affect-aware font and color palette selection methodology that aims to facilitate more informed choices. We infer associations of fonts with a set of eight affects, and evaluate the resulting data in a series of user studies both on individual words as well as in word clouds. Relying on a recent study to procure affective color palettes, we carry out a similar user study to understand the impact of color choices on word clouds. Our findings suggest that both fonts and color palettes are powerful tools contributing to the affects evoked by a word cloud. The experiments further confirm that the novel datasets we propose are successful in enabling this. We also find that, for the majority of the affects, both signals need to be congruent to create a stronger impact. Based on this data, we implement a prototype that allows users to specify a desired affect and recommends congruent fonts and color palettes for the word.

AKI in hospitalized patients with COVID-19 (2021)

Background: Early reports indicate that AKI is common among patients with coronavirus disease 2019 (COVID-19) and associatedwith worse outcomes. However, AKI among hospitalized patients with COVID19 in the United States is not well described. Methods: This retrospective, observational study involved a review of data from electronic health records of patients aged >= 18 years with laboratory-confirmed COVID-19 admitted to the Mount Sinai Health System from February 27 to May 30, 2020. We describe the frequency of AKI and dialysis requirement, AKI recovery, and adjusted odds ratios (aORs) with mortality. Results: Of 3993 hospitalized patients with COVID-19, AKI occurred in 1835 (46%) patients; 347 (19%) of the patientswith AKI required dialysis. The proportionswith stages 1, 2, or 3 AKIwere 39%, 19%, and 42%, respectively. A total of 976 (24%) patients were admitted to intensive care, and 745 (76%) experienced AKI. Of the 435 patients with AKI and urine studies, 84% had proteinuria, 81% had hematuria, and 60% had leukocyturia. Independent predictors of severe AKI were CKD, men, and higher serum potassium at admission. In-hospital mortality was 50% among patients with AKI versus 8% among those without AKI (aOR, 9.2; 95% confidence interval, 7.5 to 11.3). Of survivors with AKI who were discharged, 35% had not recovered to baseline kidney function by the time of discharge. An additional 28 of 77 (36%) patients who had not recovered kidney function at discharge did so on posthospital follow-up. Conclusions: AKI is common among patients hospitalized with COVID-19 and is associated with high mortality. Of all patients with AKI, only 30% survived with recovery of kidney function by the time of discharge.

An alert correlation platform for memory-supported techniques (2012)

Roschke, Sebastian ; Cheng, Feng ; Meinel, Christoph

Intrusion Detection Systems (IDS) have been widely deployed in practice for detecting malicious behavior on network communication and hosts. False-positive alerts are a popular problem for most IDS approaches. The solution to address this problem is to enhance the detection process by correlation and clustering of alerts. To meet the practical requirements, this process needs to be finished fast, which is a challenging task as the amount of alerts in large-scale IDS deployments is significantly high. We identifytextitdata storage and processing algorithms to be the most important factors influencing the performance of clustering and correlation. We propose and implement a highly efficient alert correlation platform. For storage, a column-based database, an In-Memory alert storage, and memory-based index tables lead to significant improvements of the performance. For processing, algorithms are designed and implemented which are optimized for In-Memory databases, e.g. an attack graph-based correlation algorithm. The platform can be distributed over multiple processing units to share memory and processing power. A standardized interface is designed to provide a unified view of result reports for end users. The efficiency of the platform is tested by practical experiments with several alert storage approaches, multiple algorithms, as well as a local and a distributed deployment.

An assisting, constrained 3D navigation technique for multiscale virtual 3D city models (2014)

Hildebrandt, Dieter ; Timm, Robert

Virtual 3D city models serve as integration platforms for complex geospatial and georeferenced information and as medium for effective communication of spatial information. In order to explore these information spaces, navigation techniques for controlling the virtual camera are required to facilitate wayfinding and movement. However, navigation is not a trivial task and many available navigation techniques do not support users effectively and efficiently with their respective skills and tasks. In this article, we present an assisting, constrained navigation technique for multiscale virtual 3D city models that is based on three basic principles: users point to navigate, users are lead by suggestions, and the exploitation of semantic, multiscale, hierarchical structurings of city models. The technique particularly supports users with low navigation and virtual camera control skills but is also valuable for experienced users. It supports exploration, search, inspection, and presentation tasks, is easy to learn and use, supports orientation, is efficient, and yields effective view properties. In particular, the technique is suitable for interactive kiosks and mobile devices with a touch display and low computing resources and for use in mobile situations where users only have restricted resources for operating the application. We demonstrate the validity of the proposed navigation technique by presenting an implementation and evaluation results. The implementation is based on service-oriented architectures, standards, and image-based representations and allows exploring massive virtual 3D city models particularly on mobile devices with limited computing resources. Results of a user study comparing the proposed navigation technique with standard techniques suggest that the proposed technique provides the targeted properties, and that it is more advantageous to novice than to expert users.

An exception handling framework for case management (2022)

Andree, Kerstin ; Ihde, Sven ; Weske, Mathias ; Pufahl, Luise

In order to achieve their business goals, organizations heavily rely on the operational excellence of their business processes. In traditional scenarios, business processes are usually well-structured, clearly specifying when and how certain tasks have to be executed. Flexible and knowledge-intensive processes are gathering momentum, where a knowledge worker drives the execution of a process case and determines the exact process path at runtime. In the case of an exception, the knowledge worker decides on an appropriate handling. While there is initial work on exception handling in well-structured business processes, exceptions in case management have not been sufficiently researched. This paper proposes an exception handling framework for stage-oriented case management languages, namely Guard Stage Milestone Model, Case Management Model and Notation, and Fragment-based Case Management. The effectiveness of the framework is evaluated with two real-world use cases showing that it covers all relevant exceptions and proposed handling strategies.

An Exploratory Study to Detect Temporal Orientation Using Bluetooth's sensor (2019)

Hernandez, Netzahualcoyotl ; Demiray, Burcu ; Arnrich, Bert ; Favela, Jesus

Mobile sensing technology allows us to investigate human behaviour on a daily basis. In the study, we examined temporal orientation, which refers to the capacity of thinking or talking about personal events in the past and future. We utilise the mksense platform that allows us to use the experience-sampling method. Individual's thoughts and their relationship with smartphone's Bluetooth data is analysed to understand in which contexts people are influenced by social environments, such as the people they spend the most time with. As an exploratory study, we analyse social condition influence through a collection of Bluetooth data and survey information from participant's smartphones. Preliminary results show that people are likely to focus on past events when interacting with close-related people, and focus on future planning when interacting with strangers. Similarly, people experience present temporal orientation when accompanied by known people. We believe that these findings are linked to emotions since, in its most basic state, emotion is a state of physiological arousal combined with an appropriated cognition. In this contribution, we envision a smartphone application for automatically inferring human emotions based on user's temporal orientation by using Bluetooth sensors, we briefly elaborate on the influential factor of temporal orientation episodes and conclude with a discussion and lessons learned.

An interactive platform to simulate dynamic pricing competition on online marketplaces (2017)

Serth, Sebastian ; Podlesny, Nikolai ; Bornstein, Marvin ; Lindemann, Jan ; Latt, Johanna ; Selke, Jan ; Schlosser, Rainer ; Boissier, Martin ; Uflacker, Matthias

E-commerce marketplaces are highly dynamic with constant competition. While this competition is challenging for many merchants, it also provides plenty of opportunities, e.g., by allowing them to automatically adjust prices in order to react to changing market situations. For practitioners however, testing automated pricing strategies is time-consuming and potentially hazardously when done in production. Researchers, on the other side, struggle to study how pricing strategies interact under heavy competition. As a consequence, we built an open continuous time framework to simulate dynamic pricing competition called Price Wars. The microservice-based architecture provides a scalable platform for large competitions with dozens of merchants and a large random stream of consumers. Our platform stores each event in a distributed log. This allows to provide different performance measures enabling users to compare profit and revenue of various repricing strategies in real-time. For researchers, price trajectories are shown which ease evaluating mutual price reactions of competing strategies. Furthermore, merchants can access historical marketplace data and apply machine learning. By providing a set of customizable, artificial merchants, users can easily simulate both simple rule-based strategies as well as sophisticated data-driven strategies using demand learning to optimize their pricing strategies.

An open implementation for context-oriented layer composition in ContextJS (2011)

Lincke, Jens ; Appeltauer, Malte ; Steinert, Bastian ; Hirschfeld, Robert

Context-oriented programming (COP) provides dedicated support for defining and composing variations to a basic program behavior. A variation, which is defined within a layer, can be de-/activated for the dynamic extent of a code block. While this mechanism allows for control flow-specific scoping, expressing behavior adaptations can demand alternative scopes. For instance, adaptations can depend on dynamic object structure rather than control flow. We present scenarios for behavior adaptation and identify the need for new scoping mechanisms. The increasing number of scoping mechanisms calls for new language abstractions representing them. We suggest to open the implementation of scoping mechanisms so that developers can extend the COP language core according to their specific needs. Our open implementation moves layer composition into objects to be affected and with that closer to the method dispatch to be changed. We discuss the implementation of established COP scoping mechanisms using our approach and present new scoping mechanisms developed for our enhancements to Lively Kernel.

Are we there yet? Simple language implementation techniques for the 21st century (2014)

Marr, Stefan ; Pape, Tobias ; De Meuter, Wolfgang

Assessing NUMA performance based on hardware event counters (2017)

Plauth, Max ; Sterz, Christoph ; Eberhardt, Felix ; Feinbube, Frank ; Polze, Andreas

Cost models play an important role for the efficient implementation of software systems. These models can be embedded in operating systems and execution environments to optimize execution at run time. Even though non-uniform memory access (NUMA) architectures are dominating today's server landscape, there is still a lack of parallel cost models that represent NUMA system sufficiently. Therefore, the existing NUMA models are analyzed, and a two-step performance assessment strategy is proposed that incorporates low-level hardware counters as performance indicators. To support the two-step strategy, multiple tools are developed, all accumulating and enriching specific hardware event counter information, to explore, measure, and visualize these low-overhead performance indicators. The tools are showcased and discussed alongside specific experiments in the realm of performance assessment.

ATIB (2021)

Grüner, Andreas ; Mühle, Alexander ; Meinel, Christoph

Identity management is a principle component of securing online services. In the advancement of traditional identity management patterns, the identity provider remained a Trusted Third Party (TTP). The service provider and the user need to trust a particular identity provider for correct attributes amongst other demands. This paradigm changed with the invention of blockchain-based Self-Sovereign Identity (SSI) solutions that primarily focus on the users. SSI reduces the functional scope of the identity provider to an attribute provider while enabling attribute aggregation. Besides that, the development of new protocols, disregarding established protocols and a significantly fragmented landscape of SSI solutions pose considerable challenges for an adoption by service providers. We propose an Attribute Trust-enhancing Identity Broker (ATIB) to leverage the potential of SSI for trust-enhancing attribute aggregation. Furthermore, ATIB abstracts from a dedicated SSI solution and offers standard protocols. Therefore, it facilitates the adoption by service providers. Despite the brokered integration approach, we show that ATIB provides a high security posture. Additionally, ATIB does not compromise the ten foundational SSI principles for the users.

Attribute Compartmentation and Greedy UCC Discovery for High-Dimensional Data Anonymisation (2019)

Podlesny, Nikolai Jannik ; Kayem, Anne V. D. M. ; Meinel, Christoph

High-dimensional data is particularly useful for data analytics research. In the healthcare domain, for instance, high-dimensional data analytics has been used successfully for drug discovery. Yet, in order to adhere to privacy legislation, data analytics service providers must guarantee anonymity for data owners. In the context of high-dimensional data, ensuring privacy is challenging because increased data dimensionality must be matched by an exponential growth in the size of the data to avoid sparse datasets. Syntactically, anonymising sparse datasets with methods that rely of statistical significance, makes obtaining sound and reliable results, a challenge. As such, strong privacy is only achievable at the cost of high information loss, rendering the data unusable for data analytics. In this paper, we make two contributions to addressing this problem from both the privacy and information loss perspectives. First, we show that by identifying dependencies between attribute subsets we can eliminate privacy violating attributes from the anonymised dataset. Second, to minimise information loss, we employ a greedy search algorithm to determine and eliminate maximal partial unique attribute combinations. Thus, one only needs to find the minimal set of identifying attributes to prevent re-identification. Experiments on a health cloud based on the SAP HANA platform using a semi-synthetic medical history dataset comprised of 109 attributes, demonstrate the effectiveness of our approach.

Automated reasoning for attributed graph properties (2018)

Schneider, Sven ; Lambers, Leen ; Orejas, Fernando

Graphs are ubiquitous in computer science. Moreover, in various application fields, graphs are equipped with attributes to express additional information such as names of entities or weights of relationships. Due to the pervasiveness of attributed graphs, it is highly important to have the means to express properties on attributed graphs to strengthen modeling capabilities and to enable analysis. Firstly, we introduce a new logic of attributed graph properties, where the graph part and attribution part are neatly separated. The graph part is equivalent to first-order logic on graphs as introduced by Courcelle. It employs graph morphisms to allow the specification of complex graph patterns. The attribution part is added to this graph part by reverting to the symbolic approach to graph attribution, where attributes are represented symbolically by variables whose possible values are specified by a set of constraints making use of algebraic specifications. Secondly, we extend our refutationally complete tableau-based reasoning method as well as our symbolic model generation approach for graph properties to attributed graph properties. Due to the new logic mentioned above, neatly separating the graph and attribution parts, and the categorical constructions employed only on a more abstract level, we can leave the graph part of the algorithms seemingly unchanged. For the integration of the attribution part into the algorithms, we use an oracle, allowing for flexible adoption of different available SMT solvers in the actual implementation. Finally, our automated reasoning approach for attributed graph properties is implemented in the tool AutoGraph integrating in particular the SMT solver Z3 for the attribute part of the properties. We motivate and illustrate our work with a particular application scenario on graph database query validation.

Automating data exchange in process choreographies (2015)

Meyer, Andreas ; Pufahl, Luise ; Batoulis, Kimon ; Fahland, Dirk ; Weske, Mathias

Communication between organizations is formalized as process choreographies in daily business. While the correct ordering of exchanged messages can be modeled and enacted with current choreography techniques, no approach exists to describe and automate the exchange of data between processes in a choreography using messages. This paper describes an entirely model-driven approach for BPMN introducing a few concepts that suffice to model data retrieval, data transformation, message exchange, and correlation four aspects of data exchange. For automation, this work utilizes a recent concept to enact data dependencies in internal processes. We present a modeling guideline to derive local process models from a given choreography; their operational semantics allows to correctly enact the entire choreography from the derived models only including the exchange of data. Targeting on successful interactions, we discuss means to ensure correct process choreography modeling. Finally, we implemented our approach by extending the camunda BPM platform with our approach and show its feasibility by realizing all service interaction patterns using only model-based concepts. (C) 2015 Elsevier Ltd. All rights reserved.

Avoiding irreducible CSC conflicts by internal communication (2009)

Wist, Dominic ; Wollowski, Ralf ; Schaefer, Mark ; Vogler, Walter

Resynthesis of handshake specifications obtained e. g. from BALSA or TANGRAM with speed-independent logic synthesis from STGs is a promising approach. To deal with state-space explosion, we suggested STG decomposition; a problem is that decomposition can lead to irreducible CSC conflicts. Here, we present a new approach to solve such conflicts by introducing internal communication between the components. We give some first, very encouraging results for very large STGs concerning synthesis time and circuit area.

Behaviour equivalence and compatibility of business process models with complex correspondences (2012)

Weidlich, Matthias ; Dijkman, Remco ; Weske, Mathias

Once multiple models of a business process are created for different purposes or to capture different variants, verification of behaviour equivalence or compatibility is needed. Equivalence verification ensures that two business process models specify the same behaviour. Since different process models are likely to differ with respect to their assumed level of abstraction and the actions that they take into account, equivalence notions have to cope with correspondences between sets of actions and actions that exist in one process but not in the other. In this paper, we present notions of equivalence and compatibility that can handle these problems. In essence, we present a notion of equivalence that works on correspondences between sets of actions rather than single actions. We then integrate our equivalence notion with work on behaviour inheritance that copes with actions that exist in one process but not in the other, leading to notions of behaviour compatibility. Compatibility notions verify that two models have the same behaviour with respect to the actions that they have in common. As such, our contribution is a collection of behaviour equivalence and compatibility notions that are applicable in more general settings than existing ones.

Behavioural Models (2016)

Kunze, Matthias ; Weske, Mathias

This textbook introduces the basis for modelling and analysing discrete dynamic systems, such as computer programmes, soft- and hardware systems, and business processes. The underlying concepts are introduced and concrete modelling techniques are described, such as finite automata, state machines, and Petri nets. The concepts are related to concrete application scenarios, among which business processes play a prominent role. The book consists of three parts, the first of which addresses the foundations of behavioural modelling. After a general introduction to modelling, it introduces transition systems as a basic formalism for representing the behaviour of discrete dynamic systems. This section also discusses causality, a fundamental concept for modelling and reasoning about behaviour. In turn, Part II forms the heart of the book and is devoted to models of behaviour. It details both sequential and concurrent systems and introduces finite automata, state machines and several different types of Petri nets. One chapter is especially devoted to business process models, workflow patterns and BPMN, the industry standard for modelling business processes. Lastly, Part III investigates how the behaviour of systems can be analysed. To this end, it introduces readers to the concept of state spaces. Further chapters cover the comparison of behaviour and the formal analysis and verification of behavioural models. The book was written for students of computer science and software engineering, as well as for programmers and system analysts interested in the behaviour of the systems they work on. It takes readers on a journey from the fundamentals of behavioural modelling to advanced techniques for modelling and analysing sequential and concurrent systems, and thus provides them a deep understanding of the concepts and techniques introduced and how they can be applied to concrete application scenarios.

Blockchains for Business Process Management (2018)

Blockchain technology offers a sizable promise to rethink the way interorganizational business processes are managed because of its potential to realize execution without a central party serving as a single point of trust (and failure). To stimulate research on this promise and the limits thereof, in this article, we outline the challenges and opportunities of blockchain for business process management (BPM). We first reflect how blockchains could be used in the context of the established BPM lifecycle and second how they might become relevant beyond. We conclude our discourse with a summary of seven research directions for investigating the application of blockchain technology in the context of BPM.

Bridge damage (2020)

Isailović, Dušan ; Stojanovic, Vladeta ; Trapp, Matthias ; Richter, Rico ; Hajdin, Rade ; Döllner, Jürgen Roland Friedrich

Building Information Modeling (BIM) representations of bridges enriched by inspection data will add tremendous value to future Bridge Management Systems (BMSs). This paper presents an approach for point cloud-based detection of spalling damage, as well as integrating damage components into a BIM via semantic enrichment of an as-built Industry Foundation Classes (IFC) model. An approach for generating the as-built BIM, geometric reconstruction of detected damage point clusters and semantic-enrichment of the corresponding IFC model is presented. Multiview-classification is used and evaluated for the detection of spalling damage features. The semantic enrichment of as-built IFC models is based on injecting classified and reconstructed damage clusters back into the as-built IFC, thus generating an accurate as-is IFC model compliant to the BMS inspection requirements.

Bridging the Gap (2019)

Herzog, Benedict ; Hönig, Timo ; Schröder-Preikschat, Wolfgang ; Plauth, Max ; Köhler, Sven ; Polze, Andreas

The recent restructuring of the electricity grid (i.e., smart grid) introduces a number of challenges for today's large-scale computing systems. To operate reliable and efficient, computing systems must adhere not only to technical limits (i.e., thermal constraints) but they must also reduce operating costs, for example, by increasing their energy efficiency. Efforts to improve the energy efficiency, however, are often hampered by inflexible software components that hardly adapt to underlying hardware characteristics. In this paper, we propose an approach to bridge the gap between inflexible software and heterogeneous hardware architectures. Our proposal introduces adaptive software components that dynamically adapt to heterogeneous processing units (i.e., accelerators) during runtime to improve the energy efficiency of computing systems.

Cartography-Oriented Design of 3D Geospatial Information Visualization - Overview and Techniques (2015)

Semmo, Amir ; Trapp, Matthias ; Jobst, Markus ; Döllner, Jürgen Roland Friedrich

In economy, society and personal life map-based interactive geospatial visualization becomes a natural element of a growing number of applications and systems. The visualization of 3D geospatial information, however, raises the question how to represent the information in an effective way. Considerable research has been done in technology-driven directions in the fields of cartography and computer graphics (e.g., design principles, visualization techniques). Here, non-photorealistic rendering (NPR) represents a promising visualization category - situated between both fields - that offers a large number of degrees for the cartography-oriented visual design of complex 2D and 3D geospatial information for a given application context. Still today, however, specifications and techniques for mapping cartographic design principles to the state-of-the-art rendering pipeline of 3D computer graphics remain to be explored. This paper revisits cartographic design principles for 3D geospatial visualization and introduces an extended 3D semiotic model that complies with the general, interactive visualization pipeline. Based on this model, we propose NPR techniques to interactively synthesize cartographic renditions of basic feature types, such as terrain, water, and buildings. In particular, it includes a novel iconification concept to seamlessly interpolate between photorealistic and cartographic representations of 3D landmarks. Our work concludes with a discussion of open challenges in this field of research, including topics, such as user interaction and evaluation.

Causal behavioural profiles - efficient computation, applications, and evaluation (2011)

Weidlich, Matthias ; Polyvyanyy, Artem ; Mendling, Jan ; Weske, Mathias

Analysis of behavioural consistency is an important aspect of software engineering. In process and service management, consistency verification of behavioural models has manifold applications. For instance, a business process model used as system specification and a corresponding workflow model used as implementation have to be consistent. Another example would be the analysis to what degree a process log of executed business operations is consistent with the corresponding normative process model. Typically, existing notions of behaviour equivalence, such as bisimulation and trace equivalence, are applied as consistency notions. Still, these notions are exponential in computation and yield a Boolean result. In many cases, however, a quantification of behavioural deviation is needed along with concepts to isolate the source of deviation. In this article, we propose causal behavioural profiles as the basis for a consistency notion. These profiles capture essential behavioural information, such as order, exclusiveness, and causality between pairs of activities of a process model. Consistency based on these profiles is weaker than trace equivalence, but can be computed efficiently for a broad class of models. In this article, we introduce techniques for the computation of causal behavioural profiles using structural decomposition techniques for sound free-choice workflow systems if unstructured net fragments are acyclic or can be traced back to S-or T-nets. We also elaborate on the findings of applying our technique to three industry model collections.

Causal inference in developmental medicine and neurology (2021)

Konigorski, Stefan

Certainty in QRS detection with artificial neural networks (2021)

Chromik, Jonas ; Pirl, Lukas ; Beilharz, Jossekin Jakob ; Arnrich, Bert ; Polze, Andreas

Detection of the QRS complex is a long-standing topic in the context of electrocardiography and many algorithms build upon the knowledge of the QRS positions. Although the first solutions to this problem were proposed in the 1970s and 1980s, there is still potential for improvements. Advancements in neural network technology made in recent years also lead to the emergence of enhanced QRS detectors based on artificial neural networks. In this work, we propose a method for assessing the certainty that is in each of the detected QRS complexes, i.e. how confident the QRS detector is that there is, in fact, a QRS complex in the position where it was detected. We further show how this metric can be utilised to distinguish correctly detected QRS complexes from false detections.

Charging and billing in modern communications networks a comprehensive survey of the state of the art and future requirements (2012)

Kühne, Ralph ; Huitema, George ; Carle, George

In mobile telecommunication networks the trend for an increasing heterogeneity of access networks, the convergence with fixed networks as well as with the Internet are apparent. The resulting future converged network with an expected wide variety of services and a possibly stiff competition between the different market participants as well as legal issues will bring about requirements for charging systems that demand for more flexibility, scalability and efficiency than is available in today's systems. This article surveys recent developments in charging and billing architectures comprising both standardisation work as well as research projects. The second main contribution of this article is a comparison of key features of these developments thus giving a list of essential charging and billing ingredients for tomorrow's communication and service environments.

CloudStrike (2020)

Torkura, Kennedy A. ; Sukmana, Muhammad Ihsan Haikal ; Cheng, Feng ; Meinel, Christoph

Most cyber-attacks and data breaches in cloud infrastructure are due to human errors and misconfiguration vulnerabilities. Cloud customer-centric tools are imperative for mitigating these issues, however existing cloud security models are largely unable to tackle these security challenges. Therefore, novel security mechanisms are imperative, we propose Risk-driven Fault Injection (RDFI) techniques to address these challenges. RDFI applies the principles of chaos engineering to cloud security and leverages feedback loops to execute, monitor, analyze and plan security fault injection campaigns, based on a knowledge-base. The knowledge-base consists of fault models designed from secure baselines, cloud security best practices and observations derived during iterative fault injection campaigns. These observations are helpful for identifying vulnerabilities while verifying the correctness of security attributes (integrity, confidentiality and availability). Furthermore, RDFI proactively supports risk analysis and security hardening efforts by sharing security information with security mechanisms. We have designed and implemented the RDFI strategies including various chaos engineering algorithms as a software tool: CloudStrike. Several evaluations have been conducted with CloudStrike against infrastructure deployed on two major public cloud infrastructure: Amazon Web Services and Google Cloud Platform. The time performance linearly increases, proportional to increasing attack rates. Also, the analysis of vulnerabilities detected via security fault injection has been used to harden the security of cloud resources to demonstrate the effectiveness of the security information provided by CloudStrike. Therefore, we opine that our approaches are suitable for overcoming contemporary cloud security issues.

Concepts and techniques for integration, analysis and visualization of massive 3D point clouds (2014)

Richter, Rico ; Döllner, Jürgen Roland Friedrich

Remote sensing methods, such as LiDAR and image-based photogrammetry, are established approaches for capturing the physical world. Professional and low-cost scanning devices are capable of generating dense 3D point clouds. Typically, these 3D point clouds are preprocessed by GIS and are then used as input data in a variety of applications such as urban planning, environmental monitoring, disaster management, and simulation. The availability of area-wide 3D point clouds will drastically increase in the future due to the availability of novel capturing methods (e.g., driver assistance systems) and low-cost scanning devices. Applications, systems, and workflows will therefore face large collections of redundant, up-to-date 3D point clouds and have to cope with massive amounts of data. Hence, approaches are required that will efficiently integrate, update, manage, analyze, and visualize 3D point clouds. In this paper, we define requirements for a system infrastructure that enables the integration of 3D point clouds from heterogeneous capturing devices and different timestamps. Change detection and update strategies for 3D point clouds are presented that reduce storage requirements and offer new insights for analysis purposes. We also present an approach that attributes 3D point clouds with semantic information (e.g., object class category information), which enables more effective data processing, analysis, and visualization. Out-of-core real-time rendering techniques then allow for an interactive exploration of the entire 3D point cloud and the corresponding analysis results. Web-based visualization services are utilized to make 3D point clouds available to a large community. The proposed concepts and techniques are designed to establish 3D point clouds as base datasets, as well as rendering primitives for analysis and visualization tasks, which allow operations to be performed directly on the point data. Finally, we evaluate the presented system, report on its applications, and discuss further research challenges.

Concepts and techniques for web-based visualization and processing of massive 3D point clouds with semantics (2019)

Discher, Sören ; Richter, Rico ; Döllner, Jürgen Roland Friedrich

3D point cloud technology facilitates the automated and highly detailed acquisition of real-world environments such as assets, sites, and countries. We present a web-based system for the interactive exploration and inspection of arbitrary large 3D point clouds. Our approach is able to render 3D point clouds with billions of points using spatial data structures and level-of-detail representations. Point-based rendering techniques and post-processing effects are provided to enable task-specific and data-specific filtering, e.g., based on semantics. A set of interaction techniques allows users to collaboratively work with the data (e.g., measuring distances and annotating). Additional value is provided by the system’s ability to display additional, context-providing geodata alongside 3D point clouds and to integrate processing and analysis operations. We have evaluated the presented techniques and in case studies and with different data sets from aerial, mobile, and terrestrial acquisition with up to 120 billion points to show their practicality and feasibility.

Concepts for cartography-oriented visualization of virtual 3D city models (2012)

Semmo, Amir ; Hildebrandt, Dieter ; Trapp, Matthias ; Döllner, Jürgen Roland Friedrich

Virtual 3D city models serve as an effective medium with manifold applications in geoinformation systems and services. To date, most 3D city models are visualized using photorealistic graphics. But an effective communication of geoinformation significantly depends on how important information is designed and cognitively processed in the given application context. One possibility to visually emphasize important information is based on non-photorealistic rendering, which comprehends artistic depiction styles and is characterized by its expressiveness and communication aspects. However, a direct application of non-photorealistic rendering techniques primarily results in monotonic visualization that lacks cartographic design aspects. In this work, we present concepts for cartography-oriented visualization of virtual 3D city models. These are based on coupling non-photorealistic rendering techniques and semantics-based information for a user, context, and media-dependent representation of thematic information. This work highlights challenges for cartography-oriented visualization of 3D geovirtual environments, presents stylization techniques and discusses their applications and ideas for a standardized visualization. In particular, the presented concepts enable a real-time and dynamic visualization of thematic geoinformation.

Controlling strokes in fast neural style transfer using content transforms (2022)

Reimann, Max ; Buchheim, Benito ; Semmo, Amir ; Döllner, Jürgen ; Trapp, Matthias

Fast style transfer methods have recently gained popularity in art-related applications as they make a generalized real-time stylization of images practicable. However, they are mostly limited to one-shot stylizations concerning the interactive adjustment of style elements. In particular, the expressive control over stroke sizes or stroke orientations remains an open challenge. To this end, we propose a novel stroke-adjustable fast style transfer network that enables simultaneous control over the stroke size and intensity, and allows a wider range of expressive editing than current approaches by utilizing the scale-variance of convolutional neural networks. Furthermore, we introduce a network-agnostic approach for style-element editing by applying reversible input transformations that can adjust strokes in the stylized output. At this, stroke orientations can be adjusted, and warping-based effects can be applied to stylistic elements, such as swirls or waves. To demonstrate the real-world applicability of our approach, we present StyleTune, a mobile app for interactive editing of neural style transfers at multiple levels of control. Our app allows stroke adjustments on a global and local level. It furthermore implements an on-device patch-based upsampling step that enables users to achieve results with high output fidelity and resolutions of more than 20 megapixels. Our approach allows users to art-direct their creations and achieve results that are not possible with current style transfer applications.

Coronavirus 2019 and people living with human immunodeficiency virus (2020)

Background: There are limited data regarding the clinical impact of coronavirus disease 2019 (COVID-19) on people living with human immunodeficiency virus (PLWH). In this study, we compared outcomes for PLWH with COVID-19 to a matched comparison group. Methods: We identified 88 PLWH hospitalized with laboratory-confirmed COVID-19 in our hospital system in New York City between 12 March and 23 April 2020. We collected data on baseline clinical characteristics, laboratory values, HIV status, treatment, and outcomes from this group and matched comparators (1 PLWH to up to 5 patients by age, sex, race/ethnicity, and calendar week of infection). We compared clinical characteristics and outcomes (death, mechanical ventilation, hospital discharge) for these groups, as well as cumulative incidence of death by HIV status. Results: Patients did not differ significantly by HIV status by age, sex, or race/ethnicity due to the matching algorithm. PLWH hospitalized with COVID-19 had high proportions of HIV virologic control on antiretroviral therapy. PLWH had greater proportions of smoking (P < .001) and comorbid illness than uninfected comparators. There was no difference in COVID-19 severity on admission by HIV status (P = .15). Poor outcomes for hospitalized PLWH were frequent but similar to proportions in comparators; 18% required mechanical ventilation and 21% died during follow-up (compared with 23% and 20%, respectively). There was similar cumulative incidence of death over time by HIV status (P = .94). Conclusions: We found no differences in adverse outcomes associated with HIV infection for hospitalized COVID-19 patients compared with a demographically similar patient group.

Correction to: Knowledge bases and software support for variant interpretation in precision oncology (2021)

Borchert, Florian ; Mock, Andreas ; Tomczak, Aurelie ; Hügel, Jonas ; Alkarkoukly, Samer ; Knurr, Alexander ; Volckmar, Anna-Lena ; Stenzinger, Albrecht ; Schirmacher, Peter ; Debus, Jürgen ; Jäger, Dirk ; Longerich, Thomas ; Fröhling, Stefan ; Eils, Roland ; Bougatf, Nina ; Sax, Ulrich ; Schapranow, Matthieu-Patrick

Counting homomorphisms to trees modulo a prime (2021)

Göbel, Andreas ; Lagodzinski, Gregor J. A. ; Seidel, Karen

Many important graph-theoretic notions can be encoded as counting graph homomorphism problems, such as partition functions in statistical physics, in particular independent sets and colourings. In this article, we study the complexity of #(p) HOMSTOH, the problem of counting graph homomorphisms from an input graph to a graph H modulo a prime number p. Dyer and Greenhill proved a dichotomy stating that the tractability of non-modular counting graph homomorphisms depends on the structure of the target graph. Many intractable cases in non-modular counting become tractable in modular counting due to the common phenomenon of cancellation. In subsequent studies on counting modulo 2, however, the influence of the structure of H on the tractability was shown to persist, which yields similar dichotomies. Our main result states that for every tree H and every prime p the problem #pHOMSTOH is either polynomial time computable or #P-p-complete. This relates to the conjecture of Faben and Jerrum stating that this dichotomy holds for every graph H when counting modulo 2. In contrast to previous results on modular counting, the tractable cases of #pHOMSTOH are essentially the same for all values of the modulo when H is a tree. To prove this result, we study the structural properties of a homomorphism. As an important interim result, our study yields a dichotomy for the problem of counting weighted independent sets in a bipartite graph modulo some prime p. These results are the first suggesting that such dichotomies hold not only for the modulo 2 case but also for the modular counting functions of all primes p.

Cultural Theory’s contributions to climate science (2022)

Verweij, Marco ; Ney, Steven ; Thompson, Michael

In his article, 'Social constructionism and climate science denial', Hansson claims to present empirical evidence that the cultural theory developed by Dame Mary Douglas, Aaron Wildavsky and ourselves (among others) leads to (climate) science denial. In this reply, we show that there is no validity to these claims. First, we show that Hansson's empirical evidence that cultural theory has led to climate science denial falls apart under closer inspection. Contrary to Hansson's claims, cultural theory has made significant contributions to understanding and addressing climate change. Second, we discuss various features of Douglas' cultural theory that differentiate it from other constructivist approaches and make it compatible with the scientific method. Thus, we also demonstrate that cultural theory cannot be accused of epistemic relativism.

Cyber security threats to bitcoin exchanges (2021)

Oosthoek, Kris ; Dörr, Christian

Bitcoin is gaining traction as an alternative store of value. Its market capitalization transcends all other cryptocurrencies in the market. But its high monetary value also makes it an attractive target to cyber criminal actors. Hacking campaigns usually target an ecosystem's weakest points. In Bitcoin, the exchange platforms are one of them. Each exchange breach is a threat not only to direct victims, but to the credibility of Bitcoin's entire ecosystem. Based on an extensive analysis of 36 breaches of Bitcoin exchanges, we show the attack patterns used to exploit Bitcoin exchange platforms using an industry standard for reporting intelligence on cyber security breaches. Based on this we are able to provide an overview of the most common attack vectors, showing that all except three hacks were possible due to relatively lax security. We show that while the security regimen of Bitcoin exchanges is subpar compared to other financial service providers, the use of stolen credentials, which does not require any hacking, is decreasing. We also show that the amount of BTC taken during a breach is decreasing, as well as the exchanges that terminate after being breached. Furthermore we show that overall security posture has improved, but still has major flaws. To discover adversarial methods post-breach, we have analyzed two cases of BTC laundering. Through this analysis we provide insight into how exchange platforms with lax cyber security even further increase the intermediary risk introduced by them into the Bitcoin ecosystem.

Cyber threat intelligence: A product without a process? (2020)

Oosthoek, Kris ; Doerr, Christian

Data dependencies for query optimization (2021)

Koßmann, Jan ; Papenbrock, Thorsten ; Naumann, Felix

Effective query optimization is a core feature of any database management system. While most query optimization techniques make use of simple metadata, such as cardinalities and other basic statistics, other optimization techniques are based on more advanced metadata including data dependencies, such as functional, uniqueness, order, or inclusion dependencies. This survey provides an overview, intuitive descriptions, and classifications of query optimization and execution strategies that are enabled by data dependencies. We consider the most popular types of data dependencies and focus on optimization strategies that target the optimization of relational database queries. The survey supports database vendors to identify optimization opportunities as well as DBMS researchers to find related work and open research questions.

Data Preparation (2020)

Hameed, Mazhar ; Naumann, Felix

Raw data are often messy: they follow different encodings, records are not well structured, values do not adhere to patterns, etc. Such data are in general not fit to be ingested by downstream applications, such as data analytics tools, or even by data management systems. The act of obtaining information from raw data relies on some data preparation process. Data preparation is integral to advanced data analysis and data management, not only for data science but for any data-driven applications. Existing data preparation tools are operational and useful, but there is still room for improvement and optimization. With increasing data volume and its messy nature, the demand for prepared data increases day by day. To cater to this demand, companies and researchers are developing techniques and tools for data preparation. To better understand the available data preparation systems, we have conducted a survey to investigate (1) prominent data preparation tools, (2) distinctive tool features, (3) the need for preliminary data processing even for these tools and, (4) features and abilities that are still lacking. We conclude with an argument in support of automatic and intelligent data preparation beyond traditional and simplistic techniques.

Data preparation for duplicate detection (2020)

Koumarelas, Ioannis ; Jiang, Lan ; Naumann, Felix

Data errors represent a major issue in most application workflows. Before any important task can take place, a certain data quality has to be guaranteed by eliminating a number of different errors that may appear in data. Typically, most of these errors are fixed with data preparation methods, such as whitespace removal. However, the particular error of duplicate records, where multiple records refer to the same entity, is usually eliminated independently with specialized techniques. Our work is the first to bring these two areas together by applying data preparation operations under a systematic approach prior to performing duplicate detection. Our process workflow can be summarized as follows: It begins with the user providing as input a sample of the gold standard, the actual dataset, and optionally some constraints to domain-specific data preparations, such as address normalization. The preparation selection operates in two consecutive phases. First, to vastly reduce the search space of ineffective data preparations, decisions are made based on the improvement or worsening of pair similarities. Second, using the remaining data preparations an iterative leave-one-out classification process removes preparations one by one and determines the redundant preparations based on the achieved area under the precision-recall curve (AUC-PR). Using this workflow, we manage to improve the results of duplicate detection up to 19% in AUC-PR.

Data Profiling ()

Abedjan, Ziawasch ; Golab, Lukasz ; Naumann, Felix ; Papenbrock, Thorsten

data4life - Eine nutzerkontrollierte Gesundheitsdaten-Infrastruktu (2019)

von Schorlemer, Stephan ; Weiß, Christian-Cornelius

Deep En-Route Filtering of Constrained Application Protocol (CoAP) Messages on 6LoWPAN Border Routers (2019)

Seidel, Felix ; Krentz, Konrad-Felix ; Meinel, Christoph

Devices on the Internet of Things (IoT) are usually battery-powered and have limited resources. Hence, energy-efficient and lightweight protocols were designed for IoT devices, such as the popular Constrained Application Protocol (CoAP). Yet, CoAP itself does not include any defenses against denial-of-sleep attacks, which are attacks that aim at depriving victim devices of entering low-power sleep modes. For example, a denial-of-sleep attack against an IoT device that runs a CoAP server is to send plenty of CoAP messages to it, thereby forcing the IoT device to expend energy for receiving and processing these CoAP messages. All current security solutions for CoAP, namely Datagram Transport Layer Security (DTLS), IPsec, and OSCORE, fail to prevent such attacks. To fill this gap, Seitz et al. proposed a method for filtering out inauthentic and replayed CoAP messages "en-route" on 6LoWPAN border routers. In this paper, we expand on Seitz et al.'s proposal in two ways. First, we revise Seitz et al.'s software architecture so that 6LoWPAN border routers can not only check the authenticity and freshness of CoAP messages, but can also perform a wide range of further checks. Second, we propose a couple of such further checks, which, as compared to Seitz et al.'s original checks, more reliably protect IoT devices that run CoAP servers from remote denial-of-sleep attacks, as well as from remote exploits. We prototyped our solution and successfully tested its compatibility with Contiki-NG's CoAP implementation.

Deep Learning of Multimodal Representations (2016)

Wang, Cheng

Destructiveness of lexicographic parsimony pressure and alleviation by a concatenation crossover in genetic programming (2020)

Kötzing, Timo ; Lagodzinski, Gregor J. A. ; Lengler, Johannes ; Melnichenko, Anna

For theoretical analyses there are two specifics distinguishing GP from many other areas of evolutionary computation: the variable size representations, in particular yielding a possible bloat (i.e. the growth of individuals with redundant parts); and also the role and the realization of crossover, which is particularly central in GP due to the tree-based representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover had surprisingly little share in this work. We analyze a simple crossover operator in combination with randomized local search, where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); we denote the resulting algorithm Concatenation Crossover GP. We consider three variants of the well-studied MAJORITY test function, adding large plateaus in different ways to the fitness landscape and thus giving a test bed for analyzing the interplay of variation operators and bloat control mechanisms in a setting with local optima. We show that the Concatenation Crossover GP can efficiently optimize these test functions, while local search cannot be efficient for all three variants independent of employing bloat control. (C) 2019 Elsevier B.V. All rights reserved.

Detecting layout templates in complex multiregion files (2021)

Vitagliano, Gerardo ; Jiang, Lan ; Naumann, Felix

Spreadsheets are among the most commonly used file formats for data management, distribution, and analysis. Their widespread employment makes it easy to gather large collections of data, but their flexible canvas-based structure makes automated analysis difficult without heavy preparation. One of the common problems that practitioners face is the presence of multiple, independent regions in a single spreadsheet, possibly separated by repeated empty cells. We define such files as "multiregion" files. In collections of various spreadsheets, we can observe that some share the same layout. We present the Mondrian approach to automatically identify layout templates across multiple files and systematically extract the corresponding regions. Our approach is composed of three phases: first, each file is rendered as an image and inspected for elements that could form regions; then, using a clustering algorithm, the identified elements are grouped to form regions; finally, every file layout is represented as a graph and compared with others to find layout templates. We compare our method to state-of-the-art table recognition algorithms on two corpora of real-world enterprise spreadsheets. Our approach shows the best performances in detecting reliable region boundaries within each file and can correctly identify recurring layouts across files.

Die Zukunft der Medizin (2019)

Die Medizin im 21. Jahrhundert wird sich so schnell verändern wie nie zuvor – und mit ihr das Gesundheitswesen. Bahnbrechende Entwicklungen in Forschung und Digitalisierung werden die Auswertung und Nutzung riesiger Datenmengen in kurzer Zeit ermöglichen. Das wird unsere Kenntnisse über Gesundheit und gesund sein, sowie über die Entstehung, Prävention und Heilung von Krankheiten vollkommen verändern. Gleichzeitig wird sich die Art und Weise, wie Medizin praktiziert wird, fundamental verändern. Das Selbstverständnis nahezu aller Akteure wird sich rasch weiterentwickeln müssen. Das Gesundheitssystem wird in allen Bereichen umgebaut und teilweise neu erfunden werden. Digitale Transformation, Personalisierung und Prävention sind die Treiber der neuen Medizin. Deutschland darf den Anschluss nicht verpassen. Im Vergleich mit anderen Ländern ist das deutsche Gesundheitswesen in vielen Punkten bedrohlich rückständig und fragmentiert. Um die Medizin und das Gesundheitswesen in Deutschland langfristig zukunftsfest zu machen, bedarf es vieler Anstrengungen – vor allem aber Offenheit gegenüber Veränderungen, sowie einen regulatorischen Rahmen, der ermöglicht, dass die medizinischen und digitalen Innovationen beim Patienten ankommen. DIE ZUKUNFT DER MEDIZIN beschreibt Entwicklungen und Technologien, die die Medizin und das Gesundheitswesen im 21. Jahrhundert prägen werden. Das Buch informiert über die zum Teil dramatischen, disruptiven Innovationen in der Forschung, die durch Big Data, Künstliche Intelligenz und Robotik möglich werden. Die Autoren sind führende Vordenker ihres Fachs und beschreiben aus langjähriger Erfahrung im In- und Ausland zukünftige Entwicklungen, die jetzt bereits greifbar sind.

Die Zukunftspotenziale der Blockchain-Technologie (2019)

Meinel, Christoph ; Gayvoronskaya, Tatiana ; Mühle, Alexander

Digital therapeutic care apps with decision-support interventions for people with low back pain in Germany (2022)

Lewkowicz, Daniel ; Wohlbrandt, Attila M. ; Böttinger, Erwin

Background: Digital therapeutic care apps provide a new effective and scalable approach for people with nonspecific low back pain (LBP). Digital therapeutic care apps are also driven by personalized decision-support interventions that support the user in self-managing LBP, and may induce prolonged behavior change to reduce the frequency and intensity of pain episodes. However, these therapeutic apps are associated with high attrition rates, and the initial prescription cost is higher than that of face-to-face physiotherapy. In Germany, digital therapeutic care apps are now being reimbursed by statutory health insurance; however, price targets and cost-driving factors for the formation of the reimbursement rate remain unexplored. Objective: The aim of this study was to evaluate the cost-effectiveness of a digital therapeutic care app compared to treatment as usual (TAU) in Germany. We further aimed to explore under which circumstances the reimbursement rate could be modified to consider value-based pricing. Methods: We developed a state-transition Markov model based on a best-practice analysis of prior LBP-related decision-analytic models, and evaluated the cost utility of a digital therapeutic care app compared to TAU in Germany. Based on a 3-year time horizon, we simulated the incremental cost and quality-adjusted life years (QALYs) for people with nonacute LBP from the societal perspective. In the deterministic sensitivity and scenario analyses, we focused on diverging attrition rates and app cost to assess our model's robustness and conditions for changing the reimbursement rate. All costs are reported in Euro (euro1=US $1.12). Results: Our base case results indicated that the digital therapeutic care strategy led to an incremental cost of euro121.59, but also generated 0.0221 additional QALYs compared to the TAU strategy, with an estimated incremental cost-effectiveness ratio (ICER) of euro5486 per QALY. The sensitivity analysis revealed that the reimbursement rate and the capability of digital therapeutic care to prevent reoccurring LBP episodes have a significant impact on the ICER. At the same time, the other parameters remained unaffected and thus supported the robustness of our model. In the scenario analysis, the different model time horizons and attrition rates strongly influenced the economic outcome. Reducing the cost of the app to euro99 per 3 months or decreasing the app's attrition rate resulted in digital therapeutic care being significantly less costly with more generated QALYs, and is thus considered to be the dominant strategy over TAU. Conclusions: The current reimbursement rate for a digital therapeutic care app in the statutory health insurance can be considered a cost-effective measure compared to TAU. The app's attrition rate and effect on the patient's prolonged behavior change essentially influence the settlement of an appropriate reimbursement rate. Future value-based pricing targets should focus on additional outcome parameters besides pain intensity and functional disability by including attrition rates and the app's long-term effect on quality of life.

Discovering commute patterns via process mining (2019)

Yousfi, Alaaeddine ; Weske, Mathias

Ubiquitous computing has proven its relevance and efficiency in improving the user experience across a myriad of situations. It is now the ineluctable solution to keep pace with the ever-changing environments in which current systems operate. Despite the achievements of ubiquitous computing, this discipline is still overlooked in business process management. This is surprising, since many of today’s challenges, in this domain, can be addressed by methods and techniques from ubiquitous computing, for instance user context and dynamic aspects of resource locations. This paper takes a first step to integrate methods and techniques from ubiquitous computing in business process management. To do so, we propose discovering commute patterns via process mining. Through our proposition, we can deduce the users’ significant locations, routes, travel times and travel modes. This information can be a stepping-stone toward helping the business process management community embrace the latest achievements in ubiquitous computing, mainly in location-based service. To corroborate our claims, a user study was conducted. The significant places, routes, travel modes and commuting times of our test subjects were inferred with high accuracies. All in all, ubiquitous computing can enrich the processes with new capabilities that go beyond what has been established in business process management so far.

Discovering relaxed functional dependencies based on multi-attribute dominance (2021)

Caruccio, Loredana ; Deufemia, Vincenzo ; Naumann, Felix ; Polese, Giuseppe

With the advent of big data and data lakes, data are often integrated from multiple sources. Such integrated data are often of poor quality, due to inconsistencies, errors, and so forth. One way to check the quality of data is to infer functional dependencies (fds). However, in many modern applications it might be necessary to extract properties and relationships that are not captured through fds, due to the necessity to admit exceptions, or to consider similarity rather than equality of data values. Relaxed fds (rfds) have been introduced to meet these needs, but their discovery from data adds further complexity to an already complex problem, also due to the necessity of specifying similarity and validity thresholds. We propose Domino, a new discovery algorithm for rfds that exploits the concept of dominance in order to derive similarity thresholds of attribute values while inferring rfds. An experimental evaluation on real datasets demonstrates the discovery performance and the effectiveness of the proposed algorithm.

Disentangling virtual machine architecture (2009)

Haupt, Michael ; Adams, Bram ; Timbermont, Stijn ; Gibbs, Celina ; Coady, Yvonne ; Hirschfeld, Robert

Virtual machine (VM) implementations are made of intricately intertwined subsystems, interacting largely through implicit dependencies. As the degree of crosscutting present in VMs is very high, VM implementations exhibit significant internal complexity. This study proposes an architecture approach for VMs that regards a VM as a composite of service modules coordinated through explicit bidirectional interfaces. Aspect-oriented programming techniques are used to establish these interfaces, to coordinate module interaction, and to declaratively express concrete VM architectures. A VM architecture description language is presented in a case study, illustrating the application of the proposed architectural principles.

Distributed detection of sequential anomalies in univariate time series (2021)

Schneider, Johannes ; Wenig, Phillip ; Papenbrock, Thorsten

The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies possibly fast on potentially very large time series. In other words, the detection needs to be effective, efficient and scalable w.r.t. the input size. Series2Graph is an effective solution based on graph embeddings that are robust against re-occurring anomalies and can discover sequential anomalies of arbitrary length and works without training data. Yet, Series2Graph is no t scalable due to its single-threaded approach; it cannot, in particular, process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, short DADS, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, intermediate state and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than S2G, scales almost linearly with the number of processors in the cluster and can process much larger input sequences due to its scale-out property.

DualPanto (2018)

Schneider, Oliver ; Shigeyama, Jotaro ; Kovacs, Robert ; Roumen, Thijs Jan ; Marwecki, Sebastian ; Böckhoff, Nico ; Glöckner, Daniel Amadeus Johannes ; Bounama, Jonas ; Baudisch, Patrick

We present a new haptic device that enables blind users to continuously track the absolute position of moving objects in spatial virtual environments, as is the case in sports or shooter games. Users interact with DualPanto by operating the me handle with one hand and by holding on to the it handle with the other hand. Each handle is connected to a pantograph haptic input/output device. The key feature is that the two handles are spatially registered with respect to each other. When guiding their avatar through a virtual world using the me handle, spatial registration enables users to track moving objects by having the device guide the output hand. This allows blind players of a 1-on-1 soccer game to race for the ball or evade an opponent; it allows blind players of a shooter game to aim at an opponent and dodge shots. In our user study, blind participants reported very high enjoyment when using the device to play (6.5/7).

Dynamic hierarchical mega models : comprehensive traceability and its efficient maintenance (2010)

Seibel, Andreas ; Neumann, Stefan ; Giese, Holger

In the world of model-driven engineering (MDE) support for traceability and maintenance of traceability information is essential. On the one hand, classical traceability approaches for MDE address this need by supporting automated creation of traceability information on the model element level. On the other hand, global model management approaches manually capture traceability information on the model level. However, there is currently no approach that supports comprehensive traceability, comprising traceability information on both levels, and efficient maintenance of traceability information, which requires a high-degree of automation and scalability. In this article, we present a comprehensive traceability approach that combines classical traceability approaches for MDE and global model management in form of dynamic hierarchical mega models. We further integrate efficient maintenance of traceability information based on top of dynamic hierarchical mega models. The proposed approach is further outlined by using an industrial case study and by presenting an implementation of the concepts in form of a prototype.

Dynamic pricing and advertising of perishable products with inventory holding costs (2015)

Schlosser, Rainer

We examine a special class of dynamic pricing and advertising models for the sale of perishable goods, including marginal unit costs and inventory holding costs. The time horizon is assumed to be finite and we allow several model parameters to be dependent on time. For the stochastic version of the model, we derive closed-form expressions of the value function as well as of the optimal pricing and advertising policy in feedback form. Moreover, we show that for small unit shares, the model converges to a deterministic version of the problem, whose explicit solution is characterized by an overage and an underage case. We quantify the close relationship between the open-loop solution of the deterministic model and the expected evolution of optimally controlled stochastic sales processes. For both models, we derive sensitivity results. We find that in the case of positive holding costs, on average, optimal prices increase in time and advertising rates decrease. Furthermore, we analytically verify the excellent quality of optimal feedback policies of deterministic models applied in stochastic models. (C) 2015 Elsevier B.V. All rights reserved.

Dynamic pricing under competition using reinforcement learning (2022)

Kastius, Alexander ; Schlosser, Rainer

Dynamic pricing is considered a possibility to gain an advantage over competitors in modern online markets. The past advancements in Reinforcement Learning (RL) provided more capable algorithms that can be used to solve pricing problems. In this paper, we study the performance of Deep Q-Networks (DQN) and Soft Actor Critic (SAC) in different market models. We consider tractable duopoly settings, where optimal solutions derived by dynamic programming techniques can be used for verification, as well as oligopoly settings, which are usually intractable due to the curse of dimensionality. We find that both algorithms provide reasonable results, while SAC performs better than DQN. Moreover, we show that under certain conditions, RL algorithms can be forced into collusion by their competitors without direct communication.

Eatomics (2021)

Kraus, Sara Milena ; Mathew-Stephen, Mariet ; Schapranow, Matthieu-Patrick

Quantitative proteomics data are becoming increasingly more available, and as a consequence are being analyzed and interpreted by a larger group of users. However, many of these users have less programming experience. Furthermore, experimental designs and setups are getting more complicated, especially when tissue biopsies are analyzed. Luckily, the proteomics community has already established some best practices on how to conduct quality control, differential abundance analysis and enrichment analysis. However, an easy-to-use application that wraps together all steps for the exploration and flexible analysis of quantitative proteomics data is not yet available. For Eatomics, we utilize the R Shiny framework to implement carefully chosen parts of established analysis workflows to (i) make them accessible in a user-friendly way, (ii) add a multitude of interactive exploration possibilities, and (iii) develop a unique experimental design setup module, which interactively translates a given research hypothesis into a differential abundance and enrichment analysis formula. In this, we aim to fulfill the needs of a growing group of inexperienced quantitative proteomics data analysts. Eatomics may be tested with demo data directly online via https://we.analyzegenomes.com/now/eatomics/or with the user's own data by installation from the Github repository at https://github.com/Millchmaedchen/Eatomics.

Editorial (2019)

Björk, Jennie ; Hölze, Katharina

Editorial for the Special Issue on Theory of Evolutionary Algorithms 2014 (2015)

Oliveto, Pietro S. ; Sutton, Andrew M.

Efficient discovery of matching dependencies (2020)

Schirmer, Philipp ; Papenbrock, Thorsten ; Koumarelas, Ioannis ; Naumann, Felix

Matching dependencies (MDs) are data profiling results that are often used for data integration, data cleaning, and entity matching. They are a generalization of functional dependencies (FDs) matching similar rather than same elements. As their discovery is very difficult, existing profiling algorithms find either only small subsets of all MDs or their scope is limited to only small datasets. We focus on the efficient discovery of all interesting MDs in real-world datasets. For this purpose, we propose HyMD, a novel MD discovery algorithm that finds all minimal, non-trivial MDs within given similarity boundaries. The algorithm extracts the exact similarity thresholds for the individual MDs from the data instead of using predefined similarity thresholds. For this reason, it is the first approach to solve the MD discovery problem in an exact and truly complete way. If needed, the algorithm can, however, enforce certain properties on the reported MDs, such as disjointness and minimum support, to focus the discovery on such results that are actually required by downstream use cases. HyMD is technically a hybrid approach that combines the two most popular dependency discovery strategies in related work: lattice traversal and inference from record pairs. Despite the additional effort of finding exact similarity thresholds for all MD candidates, the algorithm is still able to efficiently process large datasets, e.g., datasets larger than 3 GB.

Efficient distributed discovery of bidirectional order dependencies (2022)

Schmidl, Sebastian ; Papenbrock, Thorsten

Bidirectional order dependencies (bODs) capture order relationships between lists of attributes in a relational table. They can express that, for example, sorting books by publication date in ascending order also sorts them by age in descending order. The knowledge about order relationships is useful for many data management tasks, such as query optimization, data cleaning, or consistency checking. Because the bODs of a specific dataset are usually not explicitly given, they need to be discovered. The discovery of all minimal bODs (in set-based canonical form) is a task with exponential complexity in the number of attributes, though, which is why existing bOD discovery algorithms cannot process datasets of practically relevant size in a reasonable time. In this paper, we propose the distributed bOD discovery algorithm DISTOD, whose execution time scales with the available hardware. DISTOD is a scalable, robust, and elastic bOD discovery approach that combines efficient pruning techniques for bOD candidates in set-based canonical form with a novel, reactive, and distributed search strategy. Our evaluation on various datasets shows that DISTOD outperforms both single-threaded and distributed state-of-the-art bOD discovery algorithms by up to orders of magnitude; it can, in particular, process much larger datasets.

Efficient Shortest Paths in Scale-Free Networks with Underlying Hyperbolic Geometry (2022)

Bläsius, Thomas ; Freiberger, Cedric ; Friedrich, Tobias ; Katzmann, Maximilian ; Montenegro-Retana, Felix ; Thieffry, Marianne

A standard approach to accelerating shortest path algorithms on networks is the bidirectional search, which explores the graph from the start and the destination, simultaneously. In practice this strategy performs particularly well on scale-free real-world networks. Such networks typically have a heterogeneous degree distribution (e.g., a power-law distribution) and high clustering (i.e., vertices with a common neighbor are likely to be connected themselves). These two properties can be obtained by assuming an underlying hyperbolic geometry. To explain the observed behavior of the bidirectional search, we analyze its running time on hyperbolic random graphs and prove that it is (O) over tilde (n(2-1/alpha) + n(1/(2 alpha)) + delta(max)) with high probability, where alpha is an element of (1/2, 1) controls the power-law exponent of the degree distribution, and dmax is the maximum degree. This bound is sublinear, improving the obvious worst-case linear bound. Although our analysis depends on the underlying geometry, the algorithm itself is oblivious to it.

Embedded smart home (2017)

Malchow, Martin ; Renz, Jan ; Bauer, Matthias ; Meinel, Christoph

The popularity of MOOCs has increased considerably in the last years. A typical MOOC course consists of video content, self tests after a video and homework, which is normally in multiple choice format. After solving this homeworks for every week of a MOOC, the final exam certificate can be issued when the student has reached a sufficient score. There are also some attempts to include practical tasks, such as programming, in MOOCs for grading. Nevertheless, until now there is no known possibility to teach embedded system programming in a MOOC course where the programming can be done in a remote lab and where grading of the tasks is additionally possible. This embedded programming includes communication over GPIO pins to control LEDs and measure sensor values. We started a MOOC course called "Embedded Smart Home" as a pilot to prove the concept to teach real hardware programming in a MOOC environment under real life MOOC conditions with over 6000 students. Furthermore, also students with real hardware have the possibility to program on their own real hardware and grade their results in the MOOC course. Finally, we evaluate our approach and analyze the student acceptance of this approach to offer a course on embedded programming. We also analyze the hardware usage and working time of students solving tasks to find out if real hardware programming is an advantage and motivating achievement to support students learning success.

Evaluation of self-healing systems (2020)

Ghahremani, Sona ; Giese, Holger

Evaluating the performance of self-adaptive systems is challenging due to their interactions with often highly dynamic environments. In the specific case of self-healing systems, the performance evaluations of self-healing approaches and their parameter tuning rely on the considered characteristics of failure occurrences and the resulting interactions with the self-healing actions. In this paper, we first study the state-of-the-art for evaluating the performances of self-healing systems by means of a systematic literature review. We provide a classification of different input types for such systems and analyse the limitations of each input type. A main finding is that the employed inputs are often not sophisticated regarding the considered characteristics for failure occurrences. To further study the impact of the identified limitations, we present experiments demonstrating that wrong assumptions regarding the characteristics of the failure occurrences can result in large performance prediction errors, disadvantageous design-time decisions concerning the selection of alternative self-healing approaches, and disadvantageous deployment-time decisions concerning parameter tuning. Furthermore, the experiments indicate that employing multiple alternative input characteristics can help with reducing the risk of premature disadvantageous design-time decisions.

Event Driven Network Topology Discovery and Inventory Listing Using REAMS (2015)

Azodi, Amir ; Cheng, Feng ; Meinel, Christoph

Network Topology Discovery and Inventory Listing are two of the primary features of modern network monitoring systems (NMS). Current NMSs rely heavily on active scanning techniques for discovering and mapping network information. Although this approach works, it introduces some major drawbacks such as the performance impact it can exact, specially in larger network environments. As a consequence, scans are often run less frequently which can result in stale information being presented and used by the network monitoring system. Alternatively, some NMSs rely on their agents being deployed on the hosts they monitor. In this article, we present a new approach to Network Topology Discovery and Network Inventory Listing using only passive monitoring and scanning techniques. The proposed techniques rely solely on the event logs produced by the hosts and network devices present within a network. Finally, we discuss some of the advantages and disadvantages of our approach.

Evolutionary algorithms and submodular functions (2021)

Quinzan, Francesco ; Göbel, Andreas ; Wagner, Markus ; Friedrich, Tobias

A core operator of evolutionary algorithms (EAs) is the mutation. Recently, much attention has been devoted to the study of mutation operators with dynamic and non-uniform mutation rates. Following up on this area of work, we propose a new mutation operator and analyze its performance on the (1 + 1) Evolutionary Algorithm (EA). Our analyses show that this mutation operator competes with pre-existing ones, when used by the (1 + 1) EA on classes of problems for which results on the other mutation operators are available. We show that the (1 + 1) EA using our mutation operator finds a (1/3)-approximation ratio on any non-negative submodular function in polynomial time. We also consider the problem of maximizing a symmetric submodular function under a single matroid constraint and show that the (1 + 1) EA using our operator finds a (1/3)-approximation within polynomial time. This performance matches that of combinatorial local search algorithms specifically designed to solve these problems and outperforms them with constant probability. Finally, we evaluate the performance of the (1 + 1) EA using our operator experimentally by considering two applications: (a) the maximum directed cut problem on real-world graphs of different origins, with up to 6.6 million vertices and 56 million edges and (b) the symmetric mutual information problem using a four month period air pollution data set. In comparison with uniform mutation and a recently proposed dynamic scheme, our operator comes out on top on these instances.

Experience: Enhancing address matching with geocoding and similarity measure selection (2018)

Koumarelas, Ioannis ; Kroschk, Axel ; Mosley, Clifford ; Naumann, Felix

Given a query record, record matching is the problem of finding database records that represent the same real-world object. In the easiest scenario, a database record is completely identical to the query. However, in most cases, problems do arise, for instance, as a result of data errors or data integrated from multiple sources or received from restrictive form fields. These problems are usually difficult, because they require a variety of actions, including field segmentation, decoding of values, and similarity comparisons, each requiring some domain knowledge. In this article, we study the problem of matching records that contain address information, including attributes such as Street-address and City. To facilitate this matching process, we propose a domain-specific procedure to, first, enrich each record with a more complete representation of the address information through geocoding and reverse-geocoding and, second, to select the best similarity measure per each address attribute that will finally help the classifier to achieve the best f-measure. We report on our experience in selecting geocoding services and discovering similarity measures for a concrete but common industry use-case.

Experimenting with robotic intra-logistics domains (2018)

Gebser, Martin ; Obermeier, Philipp ; Otto, Thomas ; Schaub, Torsten H. ; Sabuncu, Orkunt ; Van Nguyen ; Tran Cao Son

We introduce the asprilo1 framework to facilitate experimental studies of approaches addressing complex dynamic applications. For this purpose, we have chosen the domain of robotic intra-logistics. This domain is not only highly relevant in the context of today's fourth industrial revolution but it moreover combines a multitude of challenging issues within a single uniform framework. This includes multi-agent planning, reasoning about action, change, resources, strategies, etc. In return, asprilo allows users to study alternative solutions as regards effectiveness and scalability. Although asprilo relies on Answer Set Programming and Python, it is readily usable by any system complying with its fact-oriented interface format. This makes it attractive for benchmarking and teaching well beyond logic programming. More precisely, asprilo consists of a versatile benchmark generator, solution checker and visualizer as well as a bunch of reference encodings featuring various ASP techniques. Importantly, the visualizer's animation capabilities are indispensable for complex scenarios like intra-logistics in order to inspect valid as well as invalid solution candidates. Also, it allows for graphically editing benchmark layouts that can be used as a basis for generating benchmark suites.

Explainable AI under contract and tort law (2020)

Hacker, Philipp ; Krestel, Ralf ; Grundmann, Stefan ; Naumann, Felix

This paper shows that the law, in subtle ways, may set hitherto unrecognized incentives for the adoption of explainable machine learning applications. In doing so, we make two novel contributions. First, on the legal side, we show that to avoid liability, professional actors, such as doctors and managers, may soon be legally compelled to use explainable ML models. We argue that the importance of explainability reaches far beyond data protection law, and crucially influences questions of contractual and tort liability for the use of ML models. To this effect, we conduct two legal case studies, in medical and corporate merger applications of ML. As a second contribution, we discuss the (legally required) trade-off between accuracy and explainability and demonstrate the effect in a technical case study in the context of spam classification.

Explicit use-case representation in object-oriented programming languages (2012)

Hirschfeld, Robert ; Perscheid, Michael ; Haupt, Michael

Use-cases are considered an integral part of most contemporary development processes since they describe a software system's expected behavior from the perspective of its prospective users. However, the presence of and traceability to use-cases is increasingly lost in later more code-centric development activities. Use-cases, being well-encapsulated at the level of requirements descriptions, eventually lead to crosscutting concerns in system design and source code. Tracing which parts of the system contribute to which use-cases is therefore hard and so limits understandability. In this paper, we propose an approach to making use-cases first-class entities in both the programming language and the runtime environment. Having use-cases present in the code and the running system will allow developers, maintainers, and operators to easily associate their units of work with what matters to the users. We suggest the combination of use-cases, acceptance tests, and dynamic analysis to automatically associate source code with use-cases. We present UseCasePy, an implementation of our approach to use-case-centered development in Python, and its application to the Django Web framework.

FaVe: Modeling IPv6 firewalls for fast formal verification (2017)

Lorenz, Claas ; Kiekheben, Sebastian ; Schnor, Bettina

As virtualization drives the automation of networking, the validation of security properties becomes more and more challenging eventually ruling out manual inspections. While formal verification in Software Defined Networks is provided by comprehensive tools with high speed reverification capabilities like NetPlumber for instance, the presence of middlebox functionality like firewalls is not considered. Also, they lack the ability to handle dynamic protocol elements like IPv6 extension header chains. In this work, we provide suitable modeling abstractions to enable both - the inclusion of firewalls and dynamic protocol elements. We exemplarily model the Linux ip6tables/netfilter packet filter and also provide abstractions for an application layer gateway. Finally, we present a prototype of our formal verification system FaVe.

Federated learning in a medical context (2021)

Pfitzner, Bjarne ; Steckhan, Nico ; Arnrich, Bert

Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients' anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.

FIBER (2021)

Datta, Suparno ; Sachs, Jan Philipp ; Freitas da Cruz, Harry ; Martensen, Tom ; Bode, Philipp ; Morassi Sasso, Ariane ; Glicksberg, Benjamin S. ; Böttinger, Erwin

Objectives: The development of clinical predictive models hinges upon the availability of comprehensive clinical data. Tapping into such resources requires considerable effort from clinicians, data scientists, and engineers. Specifically, these efforts are focused on data extraction and preprocessing steps required prior to modeling, including complex database queries. A handful of software libraries exist that can reduce this complexity by building upon data standards. However, a gap remains concerning electronic health records (EHRs) stored in star schema clinical data warehouses, an approach often adopted in practice. In this article, we introduce the FlexIBle EHR Retrieval (FIBER) tool: a Python library built on top of a star schema (i2b2) clinical data warehouse that enables flexible generation of modeling-ready cohorts as data frames. Materials and Methods: FIBER was developed on top of a large-scale star schema EHR database which contains data from 8 million patients and over 120 million encounters. To illustrate FIBER's capabilities, we present its application by building a heart surgery patient cohort with subsequent prediction of acute kidney injury (AKI) with various machine learning models. Results: Using FIBER, we were able to build the heart surgery cohort (n = 12 061), identify the patients that developed AKI (n = 1005), and automatically extract relevant features (n = 774). Finally, we trained machine learning models that achieved area under the curve values of up to 0.77 for this exemplary use case. Conclusion: FIBER is an open-source Python library developed for extracting information from star schema clinical data warehouses and reduces time-to-modeling, helping to streamline the clinical modeling process.

First-class concepts (2022)

Mattis, Toni ; Beckmann, Tom ; Rein, Patrick ; Hirschfeld, Robert

Ideally, programs are partitioned into independently maintainable and understandable modules. As a system grows, its architecture gradually loses the capability to accommodate new concepts in a modular way. While refactoring is expensive and not always possible, and the programming language might lack dedicated primary language constructs to express certain cross-cutting concerns, programmers are still able to explain and delineate convoluted concepts through secondary means: code comments, use of whitespace and arrangement of code, documentation, or communicating tacit knowledge. Secondary constructs are easy to change and provide high flexibility in communicating cross-cutting concerns and other concepts among programmers. However, such secondary constructs usually have no reified representation that can be explored and manipulated as first-class entities through the programming environment. In this exploratory work, we discuss novel ways to express a wide range of concepts, including cross-cutting concerns, patterns, and lifecycle artifacts independently of the dominant decomposition imposed by an existing architecture. We propose the representation of concepts as first-class objects inside the programming environment that retain the capability to change as easily as code comments. We explore new tools that allow programmers to view, navigate, and change programs based on conceptual perspectives. In a small case study, we demonstrate how such views can be created and how the programming experience changes from draining programmers' attention by stretching it across multiple modules toward focusing it on cohesively presented concepts. Our designs are geared toward facilitating multiple secondary perspectives on a system to co-exist in symbiosis with the original architecture, hence making it easier to explore, understand, and explain complex contexts and narratives that are hard or impossible to express using primary modularity constructs.

Formal framework for checking compliance of data-driven case management (2021)

Haarmann, Stephan ; Holfter, Adrian ; Pufahl, Luise ; Weske, Mathias

Business processes are often specified in descriptive or normative models. Both types of models should adhere to internal and external regulations, such as company guidelines or laws. Employing compliance checking techniques, it is possible to verify process models against rules. While traditionally compliance checking focuses on well-structured processes, we address case management scenarios. In case management, knowledge workers drive multi-variant and adaptive processes. Our contribution is based on the fragment-based case management approach, which splits a process into a set of fragments. The fragments are synchronized through shared data but can, otherwise, be dynamically instantiated and executed. We formalize case models using Petri nets. We demonstrate the formalization for design-time and run-time compliance checking and present a proof-of-concept implementation. The application of the implemented compliance checking approach to a use case exemplifies its effectiveness while designing a case model. The empirical evaluation on a set of case models for measuring the performance of the approach shows that rules can often be checked in less than a second.

Formal models and analysis for self-adaptive cyber-physical systems (2017)

Giese, Holger

In this extended abstract, we will analyze the current challenges for the envisioned Self-Adaptive CPS. In addition, we will outline our results to approach these challenges with SMARTSOS [10] a generic approach based on extensions of graph transformation systems employing open and adaptive collaborations and models at runtime for trustworthy self-adaptation, self-organization, and evolution of the individual systems and the system-of-systems level taking the independent development, operation, management, and evolution of these systems into account.

From face to face (2019)

Drimalla, Hanna ; Landwehr, Niels ; Hess, Ursula ; Dziobek, Isabel

Despite advances in the conceptualisation of facial mimicry, its role in the processing of social information is a matter of debate. In the present study, we investigated the relationship between mimicry and cognitive and emotional empathy. To assess mimicry, facial electromyography was recorded for 70 participants while they completed the Multifaceted Empathy Test, which presents complex context-embedded emotional expressions. As predicted, inter-individual differences in emotional and cognitive empathy were associated with the level of facial mimicry. For positive emotions, the intensity of the mimicry response scaled with the level of state emotional empathy. Mimicry was stronger for the emotional empathy task compared to the cognitive empathy task. The specific empathy condition could be successfully detected from facial muscle activity at the level of single individuals using machine learning techniques. These results support the view that mimicry occurs depending on the social context as a tool to affiliate and it is involved in cognitive as well as emotional empathy.

From model transformation to incremental bidirectional model synchronization (2009)

Giese, Holger ; Wagner, Robert

The model-driven software development paradigm requires that appropriate model transformations are applicable in different stages of the development process. The transformations have to consistently propagate changes between the different involved models and thus ensure a proper model synchronization. However, most approaches today do not fully support the requirements for model synchronization and focus only on classical one-way batch-oriented transformations. In this paper, we present our approach for an incremental model transformation which supports model synchronization. Our approach employs the visual, formal, and bidirectional transformation technique of triple graph grammars. Using this declarative specification formalism, we focus on the efficient execution of the transformation rules and how to achieve an incremental model transformation for synchronization purposes. We present an evaluation of our approach and demonstrate that due to the speedup for the incremental processing in the average case even larger models can be tackled.

Full Lecture Recording Watching Behavior, or Why Students Watch 90-Min Lectures in 5 Min (2019)

Bauer, Matthias ; Malchow, Martin ; Meinel, Christoph

Many universities record the lectures being held in their facilities to preserve knowledge and to make it available to their students and, at least for some universities and classes, to the broad public. The way with the least effort is to record the whole lecture, which in our case usually is 90 min long. This saves the labor and time of cutting and rearranging lectures scenes to provide short learning videos as known from Massive Open Online Courses (MOOCs), etc. Many lecturers fear that recording their lectures and providing them via an online platform might lead to less participation in the actual lecture. Also, many teachers fear that the lecture recordings are not used with the same focus and dedication as lectures in a lecture hall. In this work, we show that in our experience, full lectures have an average watching duration of just a few minutes and explain the reasons for that and why, in most cases, teachers do not have to worry about that.

Generalized aggregate quality of service computation for composite services (2012)

Yang, Yong ; Dumas, Marlon ; Garcia-Banuelos, Luciano ; Polyvyanyy, Artem ; Zhang, Liang

This article addresses the problem of estimating the Quality of Service (QoS) of a composite service given the QoS of the services participating in the composition. Previous solutions to this problem impose restrictions on the topology of the orchestration models, limiting their applicability to well-structured orchestration models for example. This article lifts these restrictions by proposing a method for aggregate QoS computation that deals with more general types of unstructured orchestration models. The applicability and scalability of the proposed method are validated using a collection of models from industrial practice.

Generative multi-adversarial network for striking the right balance in abdominal image segmentation (2020)

Rezaei, Mina ; Näppi, Janne J. ; Lippert, Christoph ; Meinel, Christoph ; Yoshida, Hiroyuki

Purpose: The identification of abnormalities that are relatively rare within otherwise normal anatomy is a major challenge for deep learning in the semantic segmentation of medical images. The small number of samples of the minority classes in the training data makes the learning of optimal classification challenging, while the more frequently occurring samples of the majority class hamper the generalization of the classification boundary between infrequently occurring target objects and classes. In this paper, we developed a novel generative multi-adversarial network, called Ensemble-GAN, for mitigating this class imbalance problem in the semantic segmentation of abdominal images. Method: The Ensemble-GAN framework is composed of a single-generator and a multi-discriminator variant for handling the class imbalance problem to provide a better generalization than existing approaches. The ensemble model aggregates the estimates of multiple models by training from different initializations and losses from various subsets of the training data. The single generator network analyzes the input image as a condition to predict a corresponding semantic segmentation image by use of feedback from the ensemble of discriminator networks. To evaluate the framework, we trained our framework on two public datasets, with different imbalance ratios and imaging modalities: the Chaos 2019 and the LiTS 2017. Result: In terms of the F1 score, the accuracies of the semantic segmentation of healthy spleen, liver, and left and right kidneys were 0.93, 0.96, 0.90 and 0.94, respectively. The overall F1 scores for simultaneous segmentation of the lesions and liver were 0.83 and 0.94, respectively. Conclusion: The proposed Ensemble-GAN framework demonstrated outstanding performance in the semantic segmentation of medical images in comparison with other approaches on popular abdominal imaging benchmarks. The Ensemble-GAN has the potential to segment abdominal images more accurately than human experts.

Geospatial artificial intelligence (2020)

Döllner, Jürgen Roland Friedrich

Artificial intelligence (AI) is changing fundamentally the way how IT solutions are implemented and operated across all application domains, including the geospatial domain. This contribution outlines AI-based techniques for 3D point clouds and geospatial digital twins as generic components of geospatial AI. First, we briefly reflect on the term "AI" and outline technology developments needed to apply AI to IT solutions, seen from a software engineering perspective. Next, we characterize 3D point clouds as key category of geodata and their role for creating the basis for geospatial digital twins; we explain the feasibility of machine learning (ML) and deep learning (DL) approaches for 3D point clouds. In particular, we argue that 3D point clouds can be seen as a corpus with similar properties as natural language corpora and formulate a "Naturalness Hypothesis" for 3D point clouds. In the main part, we introduce a workflow for interpreting 3D point clouds based on ML/DL approaches that derive domain-specific and application-specific semantics for 3D point clouds without having to create explicit spatial 3D models or explicit rule sets. Finally, examples are shown how ML/DL enables us to efficiently build and maintain base data for geospatial digital twins such as virtual 3D city models, indoor models, or building information models.

Greed is good for deterministic scale-free networks (2020)

Chauhan, Ankit ; Friedrich, Tobias ; Rothenberger, Ralf

Large real-world networks typically follow a power-law degree distribution. To study such networks, numerous random graph models have been proposed. However, real-world networks are not drawn at random. Therefore, Brach et al. (27th symposium on discrete algorithms (SODA), pp 1306-1325, 2016) introduced two natural deterministic conditions: (1) a power-law upper bound on the degree distribution (PLB-U) and (2) power-law neighborhoods, that is, the degree distribution of neighbors of each vertex is also upper bounded by a power law (PLB-N). They showed that many real-world networks satisfy both properties and exploit them to design faster algorithms for a number of classical graph problems. We complement their work by showing that some well-studied random graph models exhibit both of the mentioned PLB properties. PLB-U and PLB-N hold with high probability for Chung-Lu Random Graphs and Geometric Inhomogeneous Random Graphs and almost surely for Hyperbolic Random Graphs. As a consequence, all results of Brach et al. also hold with high probability or almost surely for those random graph classes. In the second part we study three classical NP-hard optimization problems on PLB networks. It is known that on general graphs with maximum degree Delta, a greedy algorithm, which chooses nodes in the order of their degree, only achieves a Omega (ln Delta)-approximation forMinimum Vertex Cover and Minimum Dominating Set, and a Omega(Delta)-approximation forMaximum Independent Set. We prove that the PLB-U property with beta>2 suffices for the greedy approach to achieve a constant-factor approximation for all three problems. We also show that these problems are APX-hard even if PLB-U, PLB-N, and an additional power-law lower bound on the degree distribution hold. Hence, a PTAS cannot be expected unless P = NP. Furthermore, we prove that all three problems are in MAX SNP if the PLB-U property holds.

Refine

Has Fulltext

Author

Year of publication

Document Type

Language

Is part of the Bibliography

Keywords

Institute

234 search hits