publish.UP Search

173 search hits

1 to 100

Sort by

3D from 2D touch (2013)

While interaction with computers used to be dominated by mice and keyboards, new types of sensors now allow users to interact through touch, speech, or using their whole body in 3D space. These new interaction modalities are often referred to as "natural user interfaces" or "NUIs." While 2D NUIs have experienced major success on billions of mobile touch devices sold, 3D NUI systems have so far been unable to deliver a mobile form factor, mainly due to their use of cameras. The fact that cameras require a certain distance from the capture volume has prevented 3D NUI systems from reaching the flat form factor mobile users expect. In this dissertation, we address this issue by sensing 3D input using flat 2D sensors. The systems we present observe the input from 3D objects as 2D imprints upon physical contact. By sampling these imprints at very high resolutions, we obtain the objects' textures. In some cases, a texture uniquely identifies a biometric feature, such as the user's fingerprint. In other cases, an imprint stems from the user's clothing, such as when walking on multitouch floors. By analyzing from which part of the 3D object the 2D imprint results, we reconstruct the object's pose in 3D space. While our main contribution is a general approach to sensing 3D input on 2D sensors upon physical contact, we also demonstrate three applications of our approach. (1) We present high-accuracy touch devices that allow users to reliably touch targets that are a third of the size of those on current touch devices. We show that different users and 3D finger poses systematically affect touch sensing, which current devices perceive as random input noise. We introduce a model for touch that compensates for this systematic effect by deriving the 3D finger pose and the user's identity from each touch imprint. We then investigate this systematic effect in detail and explore how users conceptually touch targets. Our findings indicate that users aim by aligning visual features of their fingers with the target. We present a visual model for touch input that eliminates virtually all systematic effects on touch accuracy. (2) From each touch, we identify users biometrically by analyzing their fingerprints. Our prototype Fiberio integrates fingerprint scanning and a display into the same flat surface, solving a long-standing problem in human-computer interaction: secure authentication on touchscreens. Sensing 3D input and authenticating users upon touch allows Fiberio to implement a variety of applications that traditionally require the bulky setups of current 3D NUI systems. (3) To demonstrate the versatility of 3D reconstruction on larger touch surfaces, we present a high-resolution pressure-sensitive floor that resolves the texture of objects upon touch. Using the same principles as before, our system GravitySpace analyzes all imprints and identifies users based on their shoe soles, detects furniture, and enables accurate touch input using feet. By classifying all imprints, GravitySpace detects the users' body parts that are in contact with the floor and then reconstructs their 3D body poses using inverse kinematics. GravitySpace thus enables a range of applications for future 3D NUI systems based on a flat sensor, such as smart rooms in future homes. We conclude this dissertation by projecting into the future of mobile devices. Focusing on the mobility aspect of our work, we explore how NUI devices may one day augment users directly in the form of implanted devices.

A compliance management framework for business process models (2010)

Awad, Ahmed Mahmoud Hany Aly

Companies develop process models to explicitly describe their business operations. In the same time, business operations, business processes, must adhere to various types of compliance requirements. Regulations, e.g., Sarbanes Oxley Act of 2002, internal policies, best practices are just a few sources of compliance requirements. In some cases, non-adherence to compliance requirements makes the organization subject to legal punishment. In other cases, non-adherence to compliance leads to loss of competitive advantage and thus loss of market share. Unlike the classical domain-independent behavioral correctness of business processes, compliance requirements are domain-specific. Moreover, compliance requirements change over time. New requirements might appear due to change in laws and adoption of new policies. Compliance requirements are offered or enforced by different entities that have different objectives behind these requirements. Finally, compliance requirements might affect different aspects of business processes, e.g., control flow and data flow. As a result, it is infeasible to hard-code compliance checks in tools. Rather, a repeatable process of modeling compliance rules and checking them against business processes automatically is needed. This thesis provides a formal approach to support process design-time compliance checking. Using visual patterns, it is possible to model compliance requirements concerning control flow, data flow and conditional flow rules. Each pattern is mapped into a temporal logic formula. The thesis addresses the problem of consistency checking among various compliance requirements, as they might stem from divergent sources. Also, the thesis contributes to automatically check compliance requirements against process models using model checking. We show that extra domain knowledge, other than expressed in compliance rules, is needed to reach correct decisions. In case of violations, we are able to provide a useful feedback to the user. The feedback is in the form of parts of the process model whose execution causes the violation. In some cases, our approach is capable of providing automated remedy of the violation.

A framework for incremental view graph maintenance (2017)

Beyhl, Thomas

Nowadays, graph data models are employed, when relationships between entities have to be stored and are in the scope of queries. For each entity, this graph data model locally stores relationships to adjacent entities. Users employ graph queries to query and modify these entities and relationships. These graph queries employ graph patterns to lookup all subgraphs in the graph data that satisfy certain graph structures. These subgraphs are called graph pattern matches. However, this graph pattern matching is NP-complete for subgraph isomorphism. Thus, graph queries can suffer a long response time, when the number of entities and relationships in the graph data or the graph patterns increases. One possibility to improve the graph query performance is to employ graph views that keep ready graph pattern matches for complex graph queries for later retrieval. However, these graph views must be maintained by means of an incremental graph pattern matching to keep them consistent with the graph data from which they are derived, when the graph data changes. This maintenance adds subgraphs that satisfy a graph pattern to the graph views and removes subgraphs that do not satisfy a graph pattern anymore from the graph views. Current approaches for incremental graph pattern matching employ Rete networks. Rete networks are discrimination networks that enumerate and maintain all graph pattern matches of certain graph queries by employing a network of condition tests, which implement partial graph patterns that together constitute the overall graph query. Each condition test stores all subgraphs that satisfy the partial graph pattern. Thus, Rete networks suffer high memory consumptions, because they store a large number of partial graph pattern matches. But, especially these partial graph pattern matches enable Rete networks to update the stored graph pattern matches efficiently, because the network maintenance exploits the already stored partial graph pattern matches to find new graph pattern matches. However, other kinds of discrimination networks exist that can perform better in time and space than Rete networks. Currently, these other kinds of networks are not used for incremental graph pattern matching. This thesis employs generalized discrimination networks for incremental graph pattern matching. These discrimination networks permit a generalized network structure of condition tests to enable users to steer the trade-off between memory consumption and execution time for the incremental graph pattern matching. For that purpose, this thesis contributes a modeling language for the effective definition of generalized discrimination networks. Furthermore, this thesis contributes an efficient and scalable incremental maintenance algorithm, which updates the (partial) graph pattern matches that are stored by each condition test. Moreover, this thesis provides a modeling evaluation, which shows that the proposed modeling language enables the effective modeling of generalized discrimination networks. Furthermore, this thesis provides a performance evaluation, which shows that a) the incremental maintenance algorithm scales, when the graph data becomes large, and b) the generalized discrimination network structures can outperform Rete network structures in time and space at the same time for incremental graph pattern matching.

A quantitative evaluation of the enhanced topic-based vector space model (2007)

Polyvyanyy, Artem ; Kuropka, Dominik

This contribution presents a quantitative evaluation procedure for Information Retrieval models and the results of this procedure applied on the enhanced Topic-based Vector Space Model (eTVSM). Since the eTVSM is an ontology-based model, its effectiveness heavily depends on the quality of the underlaying ontology. Therefore the model has been tested with different ontologies to evaluate the impact of those ontologies on the effectiveness of the eTVSM. On the highest level of abstraction, the following results have been observed during our evaluation: First, the theoretically deduced statement that the eTVSM has a similar effecitivity like the classic Vector Space Model if a trivial ontology (every term is a concept and it is independet of any other concepts) is used has been approved. Second, we were able to show that the effectiveness of the eTVSM raises if an ontology is used which is only able to resolve synonyms. We were able to derive such kind of ontology automatically from the WordNet ontology. Third, we observed that more powerful ontologies automatically derived from the WordNet, dramatically dropped the effectiveness of the eTVSM model even clearly below the effectiveness level of the Vector Space Model. Fourth, we were able to show that a manually created and optimized ontology is able to raise the effectiveness of the eTVSM to a level which is clearly above the best effectiveness levels we have found in the literature for the Latent Semantic Index model with compareable document sets.

A software reference architecture for service-oriented 3D geovisualization systems (2014)

Hildebrandt, Dieter

Modern 3D geovisualization systems (3DGeoVSs) are complex and evolving systems that are required to be adaptable and leverage distributed resources, including massive geodata. This article focuses on 3DGeoVSs built based on the principles of service-oriented architectures, standards and image-based representations (SSI) to address practically relevant challenges and potentials. Such systems facilitate resource sharing and agile and efficient system construction and change in an interoperable manner, while exploiting images as efficient, decoupled and interoperable representations. The software architecture of a 3DGeoVS and its underlying visualization model have strong effects on the system's quality attributes and support various system life cycle activities. This article contributes a software reference architecture (SRA) for 3DGeoVSs based on SSI that can be used to design, describe and analyze concrete software architectures with the intended primary benefit of an increase in effectiveness and efficiency in such activities. The SRA integrates existing, proven technology and novel contributions in a unique manner. As the foundation for the SRA, we propose the generalized visualization pipeline model that generalizes and overcomes expressiveness limitations of the prevalent visualization pipeline model. To facilitate exploiting image-based representations (IReps), the SRA integrates approaches for the representation, provisioning and styling of and interaction with IReps. Five applications of the SRA provide proofs of concept for the general applicability and utility of the SRA. A qualitative evaluation indicates the overall suitability of the SRA, its applications and the general approach of building 3DGeoVSs based on SSI.

A virtual machine architecture for creating IT-security laboratories (2006)

Hu, Ji ; Cordel, Dirk ; Meinel, Christoph

E-learning is a flexible and personalized alternative to traditional education. Nonetheless, existing e-learning systems for IT security education have difficulties in delivering hands-on experience because of the lack of proximity. Laboratory environments and practical exercises are indispensable instruction tools to IT security education, but security education in con-ventional computer laboratories poses the problem of immobility as well as high creation and maintenance costs. Hence, there is a need to effectively transform security laboratories and practical exercises into e-learning forms. This report introduces the Tele-Lab IT-Security architecture that allows students not only to learn IT security principles, but also to gain hands-on security experience by exercises in an online laboratory environment. In this architecture, virtual machines are used to provide safe user work environments instead of real computers. Thus, traditional laboratory environments can be cloned onto the Internet by software, which increases accessibilities to laboratory resources and greatly reduces investment and maintenance costs. Under the Tele-Lab IT-Security framework, a set of technical solutions is also proposed to provide effective functionalities, reliability, security, and performance. The virtual machines with appropriate resource allocation, software installation, and system configurations are used to build lightweight security laboratories on a hosting computer. Reliability and availability of laboratory platforms are covered by the virtual machine management framework. This management framework provides necessary monitoring and administration services to detect and recover critical failures of virtual machines at run time. Considering the risk that virtual machines can be misused for compromising production networks, we present security management solutions to prevent misuse of laboratory resources by security isolation at the system and network levels. This work is an attempt to bridge the gap between e-learning/tele-teaching and practical IT security education. It is not to substitute conventional teaching in laboratories but to add practical features to e-learning. This report demonstrates the possibility to implement hands-on security laboratories on the Internet reliably, securely, and economically.

Action patterns in business process models (2009)

Smirnov, Sergey ; Weidlich, Matthias ; Mendling, Jan ; Weske, Mathias

Business process management experiences a large uptake by the industry, and process models play an important role in the analysis and improvement of processes. While an increasing number of staff becomes involved in actual modeling practice, it is crucial to assure model quality and homogeneity along with providing suitable aids for creating models. In this paper we consider the problem of offering recommendations to the user during the act of modeling. Our key contribution is a concept for defining and identifying so-called action patterns - chunks of actions often appearing together in business processes. In particular, we specify action patterns and demonstrate how they can be identified from existing process model repositories using association rule mining techniques. Action patterns can then be used to suggest additional actions for a process model. Our approach is challenged by applying it to the collection of process models from the SAP Reference Model.

Acute kidney injury in patients hospitalized with COVID-19 in New York City (2021)

Dellepiane, Sergio ; Vaid, Akhil ; Jaladanki, Suraj K. ; Coca, Steven ; Fayad, Zahi A. ; Charney, Alexander W. ; Böttinger, Erwin ; He, John Cijiang ; Glicksberg, Benjamin S. ; Chan, Lili ; Nadkarni, Girish

Adaptive windows for duplicate detection (2012)

Draisbach, Uwe ; Naumann, Felix ; Szott, Sascha ; Wonneberg, Oliver

Duplicate detection is the task of identifying all groups of records within a data set that represent the same real-world entity, respectively. This task is difficult, because (i) representations might differ slightly, so some similarity measure must be defined to compare pairs of records and (ii) data sets might have a high volume making a pair-wise comparison of all records infeasible. To tackle the second problem, many algorithms have been suggested that partition the data set and compare all record pairs only within each partition. One well-known such approach is the Sorted Neighborhood Method (SNM), which sorts the data according to some key and then advances a window over the data comparing only records that appear within the same window. We propose several variations of SNM that have in common a varying window size and advancement. The general intuition of such adaptive windows is that there might be regions of high similarity suggesting a larger window size and regions of lower similarity suggesting a smaller window size. We propose and thoroughly evaluate several adaption strategies, some of which are provably better than the original SNM in terms of efficiency (same results with fewer comparisons).

Advancing the discovery of unique column combinations (2011)

Abedjan, Ziawasch ; Naumann, Felix

Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem and has many different data management and knowledge discovery applications. Existing discovery algorithms are either brute force or have a high memory load and can thus be applied only to small datasets or samples. In this paper, the wellknown GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution HCAGORDIAN combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations.

Akzeptanz und Nutzerfreundlichkeit der AusweisApp : eine qualitative Untersuchung ; eine Studie am Hasso-Plattner-Institut für Softwaresystemtechnik im Auftrag des Bundesministeriums des Innern (2013)

Asheuer, Susanne ; Belgassem, Joy ; Eichorn, Wiete ; Leipold, Rio ; Licht, Lucas ; Meinel, Christoph ; Schanz, Anne ; Schnjakin, Maxim

Für die vorliegende Studie »Qualitative Untersuchung zur Akzeptanz des neuen Personalausweises und Erarbeitung von Vorschlägen zur Verbesserung der Usability der Software AusweisApp« arbeitete ein Innovationsteam mit Hilfe der Design Thinking Methode an der Aufgabenstellung »Wie können wir die AusweisApp für Nutzer intuitiv und verständlich gestalten?« Zunächst wurde die Akzeptanz des neuen Personalausweises getestet. Bürger wurden zu ihrem Wissensstand und ihren Erwartungen hinsichtlich des neuen Personalausweises befragt, darüber hinaus zur generellen Nutzung des neuen Personalausweises, der Nutzung der Online-Ausweisfunktion sowie der Usability der AusweisApp. Weiterhin wurden Nutzer bei der Verwendung der aktuellen AusweisApp beobachtet und anschließend befragt. Dies erlaubte einen tiefen Einblick in ihre Bedürfnisse. Die Ergebnisse aus der qualitativen Untersuchung wurden verwendet, um Verbesserungsvorschläge für die AusweisApp zu entwickeln, die den Bedürfnissen der Bürger entsprechen. Die Vorschläge zur Optimierung der AusweisApp wurden prototypisch umgesetzt und mit potentiellen Nutzern getestet. Die Tests haben gezeigt, dass die entwickelten Neuerungen den Bürgern den Zugang zur Nutzung der Online-Ausweisfunktion deutlich vereinfachen. Im Ergebnis konnte festgestellt werden, dass der Akzeptanzgrad des neuen Personalausweises stark divergiert. Die Einstellung der Befragten reichte von Skepsis bis hin zu Befürwortung. Der neue Personalausweis ist ein Thema, das den Bürger polarisiert. Im Rahmen der Nutzertests konnten zahlreiche Verbesserungspotenziale des bestehenden Service Designs sowohl rund um den neuen Personalausweis, als auch im Zusammenhang mit der verwendeten Software aufgedeckt werden. Während der Nutzertests, die sich an die Ideen- und Prototypenphase anschlossen, konnte das Innovtionsteam seine Vorschläge iterieren und auch verifizieren. Die ausgearbeiteten Vorschläge beziehen sich auf die AusweisApp. Die neuen Funktionen umfassen im Wesentlichen: · den direkten Zugang zu den Diensteanbietern, · umfangreiche Hilfestellungen (Tooltips, FAQ, Wizard, Video), · eine Verlaufsfunktion, · einen Beispieldienst, der die Online-Ausweisfunktion erfahrbar macht. Insbesondere gilt es, den Nutzern mit der neuen Version der AusweisApp Anwendungsfelder für ihren neuen Personalausweis und einen Mehrwert zu bieten. Die Ausarbeitung von weiteren Funktionen der AusweisApp kann dazu beitragen, dass der neue Personalausweis sein volles Potenzial entfalten kann.

An abstraction for version control systems (2011)

Kleine, Matthias ; Hirschfeld, Robert ; Bracha, Gilad

Version Control Systems (VCS) allow developers to manage changes to software artifacts. Developers interact with VCSs through a variety of client programs, such as graphical front-ends or command line tools. It is desirable to use the same version control client program against different VCSs. Unfortunately, no established abstraction over VCS concepts exists. Instead, VCS client programs implement ad-hoc solutions to support interaction with multiple VCSs. This thesis presents Pur, an abstraction over version control concepts that allows building rich client programs that can interact with multiple VCSs. We provide an implementation of this abstraction and validate it by implementing a client application.

An e-librarian service : natural language interface for an efficient semantic search within multimedia resources (2005)

Linckels, Serge ; Meinel, Christoph

1 Introduction 1.1 Project formulation 1.2 Our contribution 2 Pedagogical Aspect 4 2.1 Modern teaching 2.2 Our Contribution 2.2.1 Autonomous and exploratory learning 2.2.2 Human machine interaction 2.2.3 Short multimedia clips 3 Ontology Aspect 3.1 Ontology driven expert systems 3.2 Our contribution 3.2.1 Ontology language 3.2.2 Concept Taxonomy 3.2.3 Knowledge base annotation 3.2.4 Description Logics 4 Natural language approach 4.1 Natural language processing in computer science 4.2 Our contribution 4.2.1 Explored strategies 4.2.2 Word equivalence 4.2.3 Semantic interpretation 4.2.4 Various problems 5 Information Retrieval Aspect 5.1 Modern information retrieval 5.2 Our contribution 5.2.1 Semantic query generation 5.2.2 Semantic relatedness 6 Implementation 6.1 Prototypes 6.2 Semantic layer architecture 6.3 Development 7 Experiments 7.1 Description of the experiments 7.2 General characteristics of the three sessions, instructions and procedure 7.3 First Session 7.4 Second Session 7.5 Third Session 7.6 Discussion and conclusion 8 Conclusion and future work 8.1 Conclusion 8.2 Open questions A Description Logics B Probabilistic context-free grammars

Anbieter von Cloud Speicherdiensten im Überblick (2014)

Meinel, Christoph ; Schnjakin, Maxim ; Metzke, Tobias ; Freitag, Markus

Durch die immer stärker werdende Flut an digitalen Informationen basieren immer mehr Anwendungen auf der Nutzung von kostengünstigen Cloud Storage Diensten. Die Anzahl der Anbieter, die diese Dienste zur Verfügung stellen, hat sich in den letzten Jahren deutlich erhöht. Um den passenden Anbieter für eine Anwendung zu finden, müssen verschiedene Kriterien individuell berücksichtigt werden. In der vorliegenden Studie wird eine Auswahl an Anbietern etablierter Basic Storage Diensten vorgestellt und miteinander verglichen. Für die Gegenüberstellung werden Kriterien extrahiert, welche bei jedem der untersuchten Anbieter anwendbar sind und somit eine möglichst objektive Beurteilung erlauben. Hierzu gehören unter anderem Kosten, Recht, Sicherheit, Leistungsfähigkeit sowie bereitgestellte Schnittstellen. Die vorgestellten Kriterien können genutzt werden, um Cloud Storage Anbieter bezüglich eines konkreten Anwendungsfalles zu bewerten.

Architectural modelling and verification of open service-oriented systems of systems (2013)

Becker, Basil

Systems of Systems (SoS) have received a lot of attention recently. In this thesis we will focus on SoS that are built atop the techniques of Service-Oriented Architectures and thus combine the benefits and challenges of both paradigms. For this thesis we will understand SoS as ensembles of single autonomous systems that are integrated to a larger system, the SoS. The interesting fact about these systems is that the previously isolated systems are still maintained, improved and developed on their own. Structural dynamics is an issue in SoS, as at every point in time systems can join and leave the ensemble. This and the fact that the cooperation among the constituent systems is not necessarily observable means that we will consider these systems as open systems. Of course, the system has a clear boundary at each point in time, but this can only be identified by halting the complete SoS. However, halting a system of that size is practically impossible. Often SoS are combinations of software systems and physical systems. Hence a failure in the software system can have a serious physical impact what makes an SoS of this kind easily a safety-critical system. The contribution of this thesis is a modelling approach that extends OMG's SoaML and basically relies on collaborations and roles as an abstraction layer above the components. This will allow us to describe SoS at an architectural level. We will also give a formal semantics for our modelling approach which employs hybrid graph-transformation systems. The modelling approach is accompanied by a modular verification scheme that will be able to cope with the complexity constraints implied by the SoS' structural dynamics and size. Building such autonomous systems as SoS without evolution at the architectural level --- i. e. adding and removing of components and services --- is inadequate. Therefore our approach directly supports the modelling and verification of evolution.

AspectKE*: Security aspects with program analysis for distributed systems (2010)

Fan, Yang ; Masuhara, Hidehiko ; Aotani, Tomoyuki ; Nielson, Flemming ; Nielson, Hanne Riis

Enforcing security policies to distributed systems is difficult, in particular, when a system contains untrusted components. We designed AspectKE*, a distributed AOP language based on a tuple space, to tackle this issue. In AspectKE*, aspects can enforce access control policies that depend on future behavior of running processes. One of the key language features is the predicates and functions that extract results of static program analysis, which are useful for defining security aspects that have to know about future behavior of a program. AspectKE* also provides a novel variable binding mechanism for pointcuts, so that pointcuts can uniformly specify join points based on both static and dynamic information about the program. Our implementation strategy performs fundamental static analysis at load-time, so as to retain runtime overheads minimal. We implemented a compiler for AspectKE*, and demonstrate usefulness of AspectKE* through a security aspect for a distributed chat system.

Aspektorientierte Programmierung : Überblick über Techniken und Werkzeuge (2006)

Adam, Christian ; Brehmer, Bastian ; Hüttenrauch, Stefan ; Jeske, Janin ; Polze, Andreas ; Rasche, Andreas ; Schüler, Benjamin ; Schult, Wolfgang

Inhaltsverzeichnis 1 Einführung 2 Aspektorientierte Programmierung 2.1 Ein System als Menge von Eigenschaften 2.2 Aspekte 2.3 Aspektweber 2.4 Vorteile Aspektorientierter Programmierung 2.5 Kategorisierung der Techniken und Werkzeuge f ¨ ur Aspektorientierte Programmierung 3 Techniken und Werkzeuge zur Analyse Aspektorientierter Softwareprogramme 3.1 Virtual Source File 3.2 FEAT 3.3 JQuery 3.4 Aspect Mining Tool 4 Techniken und Werkzeuge zum Entwurf Aspektorientierter Softwareprogramme 4.1 Concern Space Modeling Schema 4.2 Modellierung von Aspekten mit UML 4.3 CoCompose 4.4 Codagen Architect 5 Techniken und Werkzeuge zur Implementierung Aspektorientierter Softwareprogramme 5.1 Statische Aspektweber 5.2 Dynamische Aspektweber 6 Zusammenfassung

Attacking complexity in logic synthesis of asynchronous circuits (2011)

Wist, Dominic

Most of the microelectronic circuits fabricated today are synchronous, i.e. they are driven by one or several clock signals. Synchronous circuit design faces several fundamental challenges such as high-speed clock distribution, integration of multiple cores operating at different clock rates, reduction of power consumption and dealing with voltage, temperature, manufacturing and runtime variations. Asynchronous or clockless design plays a key role in alleviating these challenges, however the design and test of asynchronous circuits is much more difficult in comparison to their synchronous counterparts. A driving force for a widespread use of asynchronous technology is the availability of mature EDA (Electronic Design Automation) tools which provide an entire automated design flow starting from an HDL (Hardware Description Language) specification yielding the final circuit layout. Even though there was much progress in developing such EDA tools for asynchronous circuit design during the last two decades, the maturity level as well as the acceptance of them is still not comparable with tools for synchronous circuit design. In particular, logic synthesis (which implies the application of Boolean minimisation techniques) for the entire system's control path can significantly improve the efficiency of the resulting asynchronous implementation, e.g. in terms of chip area and performance. However, logic synthesis, in particular for asynchronous circuits, suffers from complexity problems. Signal Transitions Graphs (STGs) are labelled Petri nets which are a widely used to specify the interface behaviour of speed independent (SI) circuits - a robust subclass of asynchronous circuits. STG decomposition is a promising approach to tackle complexity problems like state space explosion in logic synthesis of SI circuits. The (structural) decomposition of STGs is guided by a partition of the output signals and generates a usually much smaller component STG for each partition member, i.e. a component STG with a much smaller state space than the initial specification. However, decomposition can result in component STGs that in isolation have so-called irreducible CSC conflicts (i.e. these components are not SI synthesisable anymore) even if the specification has none of them. A new approach is presented to avoid such conflicts by introducing internal communication between the components. So far, STG decompositions are guided by the finest output partitions, i.e. one output per component. However, this might not yield optimal circuit implementations. Efficient heuristics are presented to determine coarser partitions leading to improved circuits in terms of chip area. For the new algorithms correctness proofs are given and their implementations are incorporated into the decomposition tool DESIJ. The presented techniques are successfully applied to some benchmarks - including 'real-life' specifications arising in the context of control resynthesis - which delivered promising results.

Auf dem Weg zu einem Softwareingenieurwesen (2004)

Wendt, Siegfried

(1) Über die Notwendigkeit, die bisherige Informatik in eine Grundlagenwissenschaft und eine Ingenieurwissenschaft aufzuspalten (2) Was ist Ingenieurskultur? (3) Das Kommunikationsproblem der Informatiker und ihre Unfähigkeit, es wahrzunehmen (4) Besonderheiten des Softwareingenieurwesens im Vergleich mit den klassischen Ingenieurdisziplinen (5) Softwareingenieurspläne können auch für Nichtfachleute verständlich sein (6) Principles for Planning Curricula in Software Engineering

Automatic verification of behavior preservation at the transformation level for relational model transformation (2017)

Dyck, Johannes ; Giese, Holger ; Lambers, Leen

The correctness of model transformations is a crucial element for model-driven engineering of high quality software. In particular, behavior preservation is the most important correctness property avoiding the introduction of semantic errors during the model-driven engineering process. Behavior preservation verification techniques either show that specific properties are preserved, or more generally and complex, they show some kind of behavioral equivalence or refinement between source and target model of the transformation. Both kinds of behavior preservation verification goals have been presented with automatic tool support for the instance level, i.e. for a given source and target model specified by the model transformation. However, up until now there is no automatic verification approach available at the transformation level, i.e. for all source and target models specified by the model transformation. In this report, we extend our results presented in [27] and outline a new sophisticated approach for the automatic verification of behavior preservation captured by bisimulation resp. simulation for model transformations specified by triple graph grammars and semantic definitions given by graph transformation rules. In particular, we show that the behavior preservation problem can be reduced to invariant checking for graph transformation and that the resulting checking problem can be addressed by our own invariant checker even for a complex example where a sequence chart is transformed into communicating automata. We further discuss today's limitations of invariant checking for graph transformation and motivate further lines of future work in this direction.

Babelsberg : specifying and solving constraints on object behavior (2013)

Felgentreff, Tim ; Borning, Alan ; Hirschfeld, Robert

Constraints allow developers to specify desired properties of systems in a number of domains, and have those properties be maintained automatically. This results in compact, declarative code, avoiding scattered code to check and imperatively re-satisfy invariants. Despite these advantages, constraint programming is not yet widespread, with standard imperative programming still the norm. There is a long history of research on integrating constraint programming with the imperative paradigm. However, this integration typically does not unify the constructs for encapsulation and abstraction from both paradigms. This impedes re-use of modules, as client code written in one paradigm can only use modules written to support that paradigm. Modules require redundant definitions if they are to be used in both paradigms. We present a language – Babelsberg – that unifies the constructs for en- capsulation and abstraction by using only object-oriented method definitions for both declarative and imperative code. Our prototype – Babelsberg/R – is an extension to Ruby, and continues to support Ruby’s object-oriented se- mantics. It allows programmers to add constraints to existing Ruby programs in incremental steps by placing them on the results of normal object-oriented message sends. It is implemented by modifying a state-of-the-art Ruby virtual machine. The performance of standard object-oriented code without con- straints is only modestly impacted, with typically less than 10% overhead compared with the unmodified virtual machine. Furthermore, our architec- ture for adding multiple constraint solvers allows Babelsberg to deal with constraints in a variety of domains. We argue that our approach provides a useful step toward making con- straint solving a generic tool for object-oriented programmers. We also provide example applications, written in our Ruby-based implementation, which use constraints in a variety of application domains, including interactive graphics, circuit simulations, data streaming with both hard and soft constraints on performance, and configuration file Management.

Babelsberg/RML (2015)

Felgentreff, Tim ; Hirschfeld, Robert ; Millstein, Todd ; Borning, Alan

New programming language designs are often evaluated on concrete implementations. However, in order to draw conclusions about the language design from the evaluation of concrete programming languages, these implementations need to be verified against the formalism of the design. To that end, we also have to ensure that the design actually meets its stated goals. A useful tool for the latter has been to create an executable semantics from a formalism that can execute a test suite of examples. However, this mechanism so far did not allow to verify an implementation against the design. Babelsberg is a new design for a family of object-constraint languages. Recently, we have developed a formal semantics to clarify some issues in the design of those languages. Supplementing this work, we report here on how this formalism is turned into an executable operational semantics using the RML system. Furthermore, we show how we extended the executable semantics to create a framework that can generate test suites for the concrete Babelsberg implementations that provide traceability from the design to the language. Finally, we discuss how these test suites helped us find and correct mistakes in the Babelsberg implementation for JavaScript.

Batch regions : process instance synchronization based on data (2013)

Pufahl, Luise ; Meyer, Andreas ; Weske, Mathias

Die Automatisierung von Geschäftsprozessen unterstützt Unternehmen, die Ausführung ihrer Prozesse effizienter zu gestalten. In existierenden Business Process Management Systemen, werden die Instanzen eines Prozesses völlig unabhängig voneinander ausgeführt. Jedoch kann das Synchronisieren von Instanzen mit ähnlichen Charakteristiken wie z.B. den gleichen Daten zu reduzierten Ausführungskosten führen. Zum Beispiel, wenn ein Onlinehändler zwei Bestellungen vom selben Kunden mit der gleichen Lieferanschrift erhält, können diese zusammen verpackt und versendet werden, um Versandkosten zu sparen. In diesem Papier verwenden wir Konzepte aus dem Datenbankbereich und führen Datensichten für Geschäftsprozesse ein, um Instanzen zu identifizieren, welche synchronisiert werden können. Auf Grundlage der Datensichten führen wir das Konzept der Batch-Regionen ein. Eine Batch-Region ermöglicht eine kontext-bewusste Instanzen-Synchronisierung über mehrere verbundene Aktivitäten. Das eingeführte Konzept wird mit einer Fallstudie evaluiert, bei der ein Kostenvergleich zwischen der normalen Prozessausführung und der Batchverarbeitung durchgeführt wird.

Behavioural profiles : a relational approach to behaviour consistency (2011)

Weidlich, Matthias

Business Process Management (BPM) emerged as a means to control, analyse, and optimise business operations. Conceptual models are of central importance for BPM. Most prominently, process models define the behaviour that is performed to achieve a business value. In essence, a process model is a mapping of properties of the original business process to the model, created for a purpose. Different modelling purposes, therefore, result in different models of a business process. Against this background, the misalignment of process models often observed in the field of BPM is no surprise. Even if the same business scenario is considered, models created for strategic decision making differ in content significantly from models created for process automation. Despite their differences, process models that refer to the same business process should be consistent, i.e., free of contradictions. Apparently, there is a trade-off between strictness of a notion of consistency and appropriateness of process models serving different purposes. Existing work on consistency analysis builds upon behaviour equivalences and hierarchical refinements between process models. Hence, these approaches are computationally hard and do not offer the flexibility to gradually relax consistency requirements towards a certain setting. This thesis presents a framework for the analysis of behaviour consistency that takes a fundamentally different approach. As a first step, an alignment between corresponding elements of related process models is constructed. Then, this thesis conducts behavioural analysis grounded on a relational abstraction of the behaviour of a process model, its behavioural profile. Different variants of these profiles are proposed, along with efficient computation techniques for a broad class of process models. Using behavioural profiles, consistency of an alignment between process models is judged by different notions and measures. The consistency measures are also adjusted to assess conformance of process logs that capture the observed execution of a process. Further, this thesis proposes various complementary techniques to support consistency management. It elaborates on how to implement consistent change propagation between process models, addresses the exploration of behavioural commonalities and differences, and proposes a model synthesis for behavioural profiles.

Blockchain (2018)

Gayvoronskaya, Tatiana ; Meinel, Christoph ; Schnjakin, Maxim

Der Begriff Blockchain ist in letzter Zeit zu einem Schlagwort geworden, aber nur wenige wissen, was sich genau dahinter verbirgt. Laut einer Umfrage, die im ersten Quartal 2017 veröffentlicht wurde, ist der Begriff nur bei 35 Prozent der deutschen Mittelständler bekannt. Dabei ist die Blockchain-Technologie durch ihre rasante Entwicklung und die globale Eroberung unterschiedlicher Märkte für Massenmedien sehr interessant. So sehen viele die Blockchain-Technologie entweder als eine Allzweckwaffe, zu der aber nur wenige einen Zugang haben, oder als eine Hacker-Technologie für geheime Geschäfte im Darknet. Dabei liegt die Innovation der Blockchain-Technologie in ihrer erfolgreichen Zusammensetzung bereits vorhandener Ansätze: dezentrale Netzwerke, Kryptographie, Konsensfindungsmodelle. Durch das innovative Konzept wird ein Werte-Austausch in einem dezentralen System möglich. Dabei wird kein Vertrauen zwischen dessen Knoten (z.B. Nutzer) vorausgesetzt. Mit dieser Studie möchte das Hasso-Plattner-Institut den Lesern helfen, ihren eigenen Standpunkt zur Blockchain-Technologie zu finden und dabei dazwischen unterscheiden zu können, welche Eigenschaften wirklich innovativ und welche nichts weiter als ein Hype sind. Die Autoren der vorliegenden Arbeit analysieren positive und negative Eigenschaften, welche die Blockchain-Architektur prägen, und stellen mögliche Anpassungs- und Lösungsvorschläge vor, die zu einem effizienten Einsatz der Technologie beitragen können. Jedem Unternehmen, bevor es sich für diese Technologie entscheidet, wird dabei empfohlen, für den geplanten Anwendungszweck zunächst ein klares Ziel zu definieren, das mit einem angemessenen Kosten-Nutzen-Verhältnis angestrebt werden kann. Dabei sind sowohl die Möglichkeiten als auch die Grenzen der Blockchain-Technologie zu beachten. Die relevanten Schritte, die es in diesem Zusammenhang zu beachten gilt, fasst die Studie für die Leser übersichtlich zusammen. Es wird ebenso auf akute Fragestellungen wie Skalierbarkeit der Blockchain, geeigneter Konsensalgorithmus und Sicherheit eingegangen, darunter verschiedene Arten möglicher Angriffe und die entsprechenden Gegenmaßnahmen zu deren Abwehr. Neue Blockchains etwa laufen Gefahr, geringere Sicherheit zu bieten, da Änderungen an der bereits bestehenden Technologie zu Schutzlücken und Mängeln führen können. Nach Diskussion der innovativen Eigenschaften und Probleme der Blockchain-Technologie wird auf ihre Umsetzung eingegangen. Interessierten Unternehmen stehen viele Umsetzungsmöglichkeiten zur Verfügung. Die zahlreichen Anwendungen haben entweder eine eigene Blockchain als Grundlage oder nutzen bereits bestehende und weitverbreitete Blockchain-Systeme. Zahlreiche Konsortien und Projekte bieten „Blockchain-as-a-Service“ an und unterstützen andere Unternehmen beim Entwickeln, Testen und Bereitstellen von Anwendungen. Die Studie gibt einen detaillierten Überblick über zahlreiche relevante Einsatzbereiche und Projekte im Bereich der Blockchain-Technologie. Dadurch, dass sie noch relativ jung ist und sich schnell entwickelt, fehlen ihr noch einheitliche Standards, die Zusammenarbeit der verschiedenen Systeme erlauben und an die sich alle Entwickler halten können. Aktuell orientieren sich Entwickler an Bitcoin-, Ethereum- und Hyperledger-Systeme, diese dienen als Grundlage für viele weitere Blockchain-Anwendungen. Ziel ist, den Lesern einen klaren und umfassenden Überblick über die Blockchain-Technologie und deren Möglichkeiten zu vermitteln.

Building a columnar database on shared main memory-based storage (2014)

Tinnefeld, Christian

In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this unilateral development is going to cease due to the combination of the following three trends: a) Nowadays network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory inside a server and of a remote server to and even below a single order of magnitude. b) Modern storage systems scale gracefully, are elastic, and provide high-availability. c) A modern storage system such as Stanford's RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main-memory parallel database management system is desirable. The advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible. This thesis describes building a columnar database on shared main memory-based storage. The thesis discusses the resulting architecture (Part I), the implications on query processing (Part II), and presents an evaluation of the resulting solution in terms of performance, high-availability, and elasticity (Part III). In our architecture, we use Stanford's RAMCloud as shared-storage, and the self-designed and developed in-memory AnalyticsDB as relational query processor on top. AnalyticsDB encapsulates data access and operator execution via an interface which allows seamless switching between local and remote main memory, while RAMCloud provides not only storage capacity, but also processing power. Combining both aspects allows pushing-down the execution of database operators into the storage system. We describe how the columnar data processed by AnalyticsDB is mapped to RAMCloud's key-value data model and how the performance advantages of columnar data storage can be preserved. The combination of fast network technology and the possibility to execute database operators in the storage system opens the discussion for site selection. We construct a system model that allows the estimation of operator execution costs in terms of network transfer, data processed in memory, and wall time. This can be used for database operators that work on one relation at a time - such as a scan or materialize operation - to discuss the site selection problem (data pull vs. operator push). Since a database query translates to the execution of several database operators, it is possible that the optimal site selection varies per operator. For the execution of a database operator that works on two (or more) relations at a time, such as a join, the system model is enriched by additional factors such as the chosen algorithm (e.g. Grace- vs. Distributed Block Nested Loop Join vs. Cyclo-Join), the data partitioning of the respective relations, and their overlapping as well as the allowed resource allocation. We present an evaluation on a cluster with 60 nodes where all nodes are connected via RDMA-enabled network equipment. We show that query processing performance is about 2.4x slower if everything is done via the data pull operator execution strategy (i.e. RAMCloud is being used only for data access) and about 27% slower if operator execution is also supported inside RAMCloud (in comparison to operating only on main memory inside a server without any network communication at all). The fast-crash recovery feature of RAMCloud can be leveraged to provide high-availability, e.g. a server crash during query execution only delays the query response for about one second. Our solution is elastic in a way that it can adapt to changing workloads a) within seconds, b) without interruption of the ongoing query processing, and c) without manual intervention.

Built-in recovery support for explorative programming (2014)

Steinert, Bastian

This work introduces concepts and corresponding tool support to enable a complementary approach in dealing with recovery. Programmers need to recover a development state, or a part thereof, when previously made changes reveal undesired implications. However, when the need arises suddenly and unexpectedly, recovery often involves expensive and tedious work. To avoid tedious work, literature recommends keeping away from unexpected recovery demands by following a structured and disciplined approach, which consists of the application of various best practices including working only on one thing at a time, performing small steps, as well as making proper use of versioning and testing tools. However, the attempt to avoid unexpected recovery is both time-consuming and error-prone. On the one hand, it requires disproportionate effort to minimize the risk of unexpected situations. On the other hand, applying recommended practices selectively, which saves time, can hardly avoid recovery. In addition, the constant need for foresight and self-control has unfavorable implications. It is exhaustive and impedes creative problem solving. This work proposes to make recovery fast and easy and introduces corresponding support called CoExist. Such dedicated support turns situations of unanticipated recovery from tedious experiences into pleasant ones. It makes recovery fast and easy to accomplish, even if explicit commits are unavailable or tests have been ignored for some time. When mistakes and unexpected insights are no longer associated with tedious corrective actions, programmers are encouraged to change source code as a means to reason about it, as opposed to making changes only after structuring and evaluating them mentally. This work further reports on an implementation of the proposed tool support in the Squeak/Smalltalk development environment. The development of the tools has been accompanied by regular performance and usability tests. In addition, this work investigates whether the proposed tools affect programmers’ performance. In a controlled lab study, 22 participants improved the design of two different applications. Using a repeated measurement setup, the study examined the effect of providing CoExist on programming performance. The result of analyzing 88 hours of programming suggests that built-in recovery support as provided with CoExist positively has a positive effect on programming performance in explorative programming tasks.

Business process architectures (2015)

Eid-Sabbagh, Rami-Habib

Business Process Management has become an integral part of modern organizations in the private and public sector for improving their operations. In the course of Business Process Management efforts, companies and organizations assemble large process model repositories with many hundreds and thousands of business process models bearing a large amount of information. With the advent of large business process model collections, new challenges arise as structuring and managing a large amount of process models, their maintenance, and their quality assurance. This is covered by business process architectures that have been introduced for organizing and structuring business process model collections. A variety of business process architecture approaches have been proposed that align business processes along aspects of interest, e. g., goals, functions, or objects. They provide a high level categorization of single processes ignoring their interdependencies, thus hiding valuable information. The production of goods or the delivery of services are often realized by a complex system of interdependent business processes. Hence, taking a holistic view at business processes interdependencies becomes a major necessity to organize, analyze, and assess the impact of their re-/design. Visualizing business processes interdependencies reveals hidden and implicit information from a process model collection. In this thesis, we present a novel Business Process Architecture approach for representing and analyzing business process interdependencies on an abstract level. We propose a formal definition of our Business Process Architecture approach, design correctness criteria, and develop analysis techniques for assessing their quality. We describe a methodology for applying our Business Process Architecture approach top-down and bottom-up. This includes techniques for Business Process Architecture extraction from, and decomposition to process models while considering consistency issues between business process architecture and process model level. Using our extraction algorithm, we present a novel technique to identify and visualize data interdependencies in Business Process Data Architectures. Our Business Process Architecture approach provides business process experts,managers, and other users of a process model collection with an overview that allows reasoning about a large set of process models, understanding, and analyzing their interdependencies in a facilitated way. In this regard we evaluated our Business Process Architecture approach in an experiment and provide implementations of selected techniques.

Business process architectures with multiplicities : transformation and correctness (2013)

Eid-Sabbagh, Rami-Habib ; Hewelt, Marcin ; Weske, Mathias

Business processes are instrumental to manage work in organisations. To study the interdependencies between business processes, Business Process Architectures have been introduced. These express trigger and message ow relations between business processes. When we investigate real world Business Process Architectures, we find complex interdependencies, involving multiple process instances. These aspects have not been studied in detail so far, especially concerning correctness properties. In this paper, we propose a modular transformation of BPAs to open nets for the analysis of behavior involving multiple business processes with multiplicities. For this purpose we introduce intermediary nets to portray semantics of multiplicity specifications. We evaluate our approach on a use case from the public sector.

Business process model abstraction (2011)

Smirnov, Sergey

Business process models are used within a range of organizational initiatives, where every stakeholder has a unique perspective on a process and demands the respective model. As a consequence, multiple process models capturing the very same business process coexist. Keeping such models in sync is a challenge within an ever changing business environment: once a process is changed, all its models have to be updated. Due to a large number of models and their complex relations, model maintenance becomes error-prone and expensive. Against this background, business process model abstraction emerged as an operation reducing the number of stored process models and facilitating model management. Business process model abstraction is an operation preserving essential process properties and leaving out insignificant details in order to retain information relevant for a particular purpose. Process model abstraction has been addressed by several researchers. The focus of their studies has been on particular use cases and model transformations supporting these use cases. This thesis systematically approaches the problem of business process model abstraction shaping the outcome into a framework. We investigate the current industry demand in abstraction summarizing it in a catalog of business process model abstraction use cases. The thesis focuses on one prominent use case where the user demands a model with coarse-grained activities and overall process ordering constraints. We develop model transformations that support this use case starting with the transformations based on process model structure analysis. Further, abstraction methods considering the semantics of process model elements are investigated. First, we suggest how semantically related activities can be discovered in process models-a barely researched challenge. The thesis validates the designed abstraction methods against sets of industrial process models and discusses the method implementation aspects. Second, we develop a novel model transformation, which combined with the related activity discovery allows flexible non-hierarchical abstraction. In this way this thesis advocates novel model transformations that facilitate business process model management and provides the foundations for innovative tool support.

Business process model abstraction : theory and practice (2010)

Smirnov, Sergey ; Reijers, Hajo A. ; Nugteren, Thijs ; Weske, Mathias

Business process management aims at capturing, understanding, and improving work in organizations. The central artifacts are process models, which serve different purposes. Detailed process models are used to analyze concrete working procedures, while high-level models show, for instance, handovers between departments. To provide different views on process models, business process model abstraction has emerged. While several approaches have been proposed, a number of abstraction use case that are both relevant for industry and scientifically challenging are yet to be addressed. In this paper we systematically develop, classify, and consolidate different use cases for business process model abstraction. The reported work is based on a study with BPM users in the health insurance sector and validated with a BPM consultancy company and a large BPM vendor. The identified fifteen abstraction use cases reflect the industry demand. The related work on business process model abstraction is evaluated against the use cases, which leads to a research agenda.

Cache conscious column organization in in-memory column stores (2013)

Schwalb, David ; Krüger, Jens ; Plattner, Hasso

Cost models are an essential part of database systems, as they are the basis of query performance optimization. Based on predictions made by cost models, the fastest query execution plan can be chosen and executed or algorithms can be tuned and optimised. In-memory databases shifts the focus from disk to main memory accesses and CPU costs, compared to disk based systems where input and output costs dominate the overall costs and other processing costs are often neglected. However, modelling memory accesses is fundamentally different and common models do not apply anymore. This work presents a detailed parameter evaluation for the plan operators scan with equality selection, scan with range selection, positional lookup and insert in in-memory column stores. Based on this evaluation, a cost model based on cache misses for estimating the runtime of the considered plan operators using different data structures is developed. Considered are uncompressed columns, bit compressed and dictionary encoded columns with sorted and unsorted dictionaries. Furthermore, tree indices on the columns and dictionaries are discussed. Finally, partitioned columns consisting of one partition with a sorted and one with an unsorted dictionary are investigated. New values are inserted in the unsorted dictionary partition and moved periodically by a merge process to the sorted partition. An efficient attribute merge algorithm is described, supporting the update performance required to run enterprise applications on read-optimised databases. Further, a memory traffic based cost model for the merge process is provided.

Cloud security mechanisms (2014)

Cloud computing has brought great benefits in cost and flexibility for provisioning services. The greatest challenge of cloud computing remains however the question of security. The current standard tools in access control mechanisms and cryptography can only partly solve the security challenges of cloud infrastructures. In the recent years of research in security and cryptography, novel mechanisms, protocols and algorithms have emerged that offer new ways to create secure services atop cloud infrastructures. This report provides introductions to a selection of security mechanisms that were part of the "Cloud Security Mechanisms" seminar in summer term 2013 at HPI.

Conceptual architecture patterns : FMC–based representations (2004)

This document presents the results of the seminar "Coneptual Arachitecture Patterns" of the winter term 2002 in the Hasso-Plattner-Institute. It is a compilation of the student's elaborations dealing with some conceptual architecture patterns which can be found in literature. One important focus laid on the runtime structures and the presentation of the patterns. 1. Introduction 1.1. The Seminar 1.2. Literature 2 Pipes and Filters (André Langhorst and Martin Steinle) 3 Broker (Konrad Hübner and Einar Lück) 4 Microkernel (Eiko Büttner and Stefan Richter) 5 Component Configurator (Stefan Röck and Alexander Gierak) 6 Interceptor (Marc Förster and Peter Aschenbrenner) 7 Reactor (Nikolai Cieslak and Dennis Eder) 8 Half–Sync/Half–Async (Robert Mitschke and Harald Schubert) 9 Leader/Followers (Dennis Klemann and Steffen Schmidt)

Context-aware semantic analysis of video metadata (2013)

Steinmetz, Nadine

The Semantic Web provides information contained in the World Wide Web as machine-readable facts. In comparison to a keyword-based inquiry, semantic search enables a more sophisticated exploration of web documents. By clarifying the meaning behind entities, search results are more precise and the semantics simultaneously enable an exploration of semantic relationships. However, unlike keyword searches, a semantic entity-focused search requires that web documents are annotated with semantic representations of common words and named entities. Manual semantic annotation of (web) documents is time-consuming; in response, automatic annotation services have emerged in recent years. These annotation services take continuous text as input, detect important key terms and named entities and annotate them with semantic entities contained in widely used semantic knowledge bases, such as Freebase or DBpedia. Metadata of video documents require special attention. Semantic analysis approaches for continuous text cannot be applied, because information of a context in video documents originates from multiple sources possessing different reliabilities and characteristics. This thesis presents a semantic analysis approach consisting of a context model and a disambiguation algorithm for video metadata. The context model takes into account the characteristics of video metadata and derives a confidence value for each metadata item. The confidence value represents the level of correctness and ambiguity of the textual information of the metadata item. The lower the ambiguity and the higher the prospective correctness, the higher the confidence value. The metadata items derived from the video metadata are analyzed in a specific order from high to low confidence level. Previously analyzed metadata are used as reference points in the context for subsequent disambiguation. The contextually most relevant entity is identified by means of descriptive texts and semantic relationships to the context. The context is created dynamically for each metadata item, taking into account the confidence value and other characteristics. The proposed semantic analysis follows two hypotheses: metadata items of a context should be processed in descendent order of their confidence value, and the metadata that pertains to a context should be limited by content-based segmentation boundaries. The evaluation results support the proposed hypotheses and show increased recall and precision for annotated entities, especially for metadata that originates from sources with low reliability. The algorithms have been evaluated against several state-of-the-art annotation approaches. The presented semantic analysis process is integrated into a video analysis framework and has been successfully applied in several projects for the purpose of semantic video exploration of videos.

Correct dynamic service-oriented architectures : modeling and compositional verification with dynamic collaborations (2009)

Becker, Basil ; Giese, Holger ; Neumann, Stefan

Service-oriented modeling employs collaborations to capture the coordination of multiple roles in form of service contracts. In case of dynamic collaborations the roles may join and leave the collaboration at runtime and therefore complex structural dynamics can result, which makes it very hard to ensure their correct and safe operation. We present in this paper our approach for modeling and verifying such dynamic collaborations. Modeling is supported using a well-defined subset of UML class diagrams, behavioral rules for the structural dynamics, and UML state machines for the role behavior. To be also able to verify the resulting service-oriented systems, we extended our former results for the automated verification of systems with structural dynamics [7, 8] and developed a compositional reasoning scheme, which enables the reuse of verification results. We outline our approach using the example of autonomous vehicles that use such dynamic collaborations via ad-hoc networking to coordinate and optimize their joint behavior.

Covering or complete? : Discovering conditional inclusion dependencies (2012)

Bauckmann, Jana ; Abedjan, Ziawasch ; Leser, Ulf ; Müller, Heiko ; Naumann, Felix

Data dependencies, or integrity constraints, are used to improve the quality of a database schema, to optimize queries, and to ensure consistency in a database. In the last years conditional dependencies have been introduced to analyze and improve data quality. In short, a conditional dependency is a dependency with a limited scope defined by conditions over one or more attributes. Only the matching part of the instance must adhere to the dependency. In this paper we focus on conditional inclusion dependencies (CINDs). We generalize the definition of CINDs, distinguishing covering and completeness conditions. We present a new use case for such CINDs showing their value for solving complex data quality tasks. Further, we define quality measures for conditions inspired by precision and recall. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. Our algorithms choose not only the condition values but also the condition attributes automatically. Finally, we show that our approach efficiently provides meaningful and helpful results for our use case.

CSOM/PL : a virtual machine product line (2011)

Haupt, Michael ; Marr, Stefan ; Hirschfeld, Robert

CSOM/PL is a software product line (SPL) derived from applying multi-dimensional separation of concerns (MDSOC) techniques to the domain of high-level language virtual machine (VM) implementations. For CSOM/PL, we modularised CSOM, a Smalltalk VM implemented in C, using VMADL (virtual machine architecture description language). Several features of the original CSOM were encapsulated in VMADL modules and composed in various combinations. In an evaluation of our approach, we show that applying MDSOC and SPL principles to a domain as complex as that of VMs is not only feasible but beneficial, as it improves understandability, maintainability, and configurability of VM implementations without harming performance.

Cyber-physical systems with dynamic structure : towards modeling and verification of inductive invariants (2012)

Becker, Basil ; Giese, Holger

Cyber-physical systems achieve sophisticated system behavior exploring the tight interconnection of physical coupling present in classical engineering systems and information technology based coupling. A particular challenging case are systems where these cyber-physical systems are formed ad hoc according to the specific local topology, the available networking capabilities, and the goals and constraints of the subsystems captured by the information processing part. In this paper we present a formalism that permits to model the sketched class of cyber-physical systems. The ad hoc formation of tightly coupled subsystems of arbitrary size are specified using a UML-based graph transformation system approach. Differential equations are employed to define the resulting tightly coupled behavior. Together, both form hybrid graph transformation systems where the graph transformation rules define the discrete steps where the topology or modes may change, while the differential equations capture the continuous behavior in between such discrete changes. In addition, we demonstrate that automated analysis techniques known for timed graph transformation systems for inductive invariants can be extended to also cover the hybrid case for an expressive case of hybrid models where the formed tightly coupled subsystems are restricted to smaller local networks.

Data cleansing and integration operators for a parallel data analytics platform (2014)

Heise, Arvid

The data quality of real-world datasets need to be constantly monitored and maintained to allow organizations and individuals to reliably use their data. Especially, data integration projects suffer from poor initial data quality and as a consequence consume more effort and money. Commercial products and research prototypes for data cleansing and integration help users to improve the quality of individual and combined datasets. They can be divided into either standalone systems or database management system (DBMS) extensions. On the one hand, standalone systems do not interact well with DBMS and require time-consuming data imports and exports. On the other hand, DBMS extensions are often limited by the underlying system and do not cover the full set of data cleansing and integration tasks. We overcome both limitations by implementing a concise set of five data cleansing and integration operators on the parallel data analytics platform Stratosphere. We define the semantics of the operators, present their parallel implementation, and devise optimization techniques for individual operators and combinations thereof. Users specify declarative queries in our query language METEOR with our new operators to improve the data quality of individual datasets or integrate them to larger datasets. By integrating the data cleansing operators into the higher level language layer of Stratosphere, users can easily combine cleansing operators with operators from other domains, such as information extraction, to complex data flows. Through a generic description of the operators, the Stratosphere optimizer reorders operators even from different domains to find better query plans. As a case study, we reimplemented a part of the large Open Government Data integration project GovWILD with our new operators and show that our queries run significantly faster than the original GovWILD queries, which rely on relational operators. Evaluation reveals that our operators exhibit good scalability on up to 100 cores, so that even larger inputs can be efficiently processed by scaling out to more machines. Finally, our scripts are considerably shorter than the original GovWILD scripts, which results in better maintainability of the scripts.

Data in business processes (2011)

Meyer, Andreas ; Smirnov, Sergey ; Weske, Mathias

Prozesse und Daten sind gleichermaßen wichtig für das Geschäftsprozessmanagement. Prozessdaten sind dabei insbesondere im Kontext der Automatisierung von Geschäftsprozessen, dem Prozesscontrolling und der Repräsentation der Vermögensgegenstände von Organisationen relevant. Es existieren viele Prozessmodellierungssprachen, von denen jede die Darstellung von Daten durch eine fest spezifizierte Menge an Modellierungskonstrukten ermöglicht. Allerdings unterscheiden sich diese Darstellungenund damit der Grad der Datenmodellierung stark untereinander. Dieser Report evaluiert verschiedene Prozessmodellierungssprachen bezüglich der Unterstützung von Datenmodellierung. Als einheitliche Grundlage entwickeln wir ein Framework, welches prozess- und datenrelevante Aspekte systematisch organisiert. Die Kriterien legen dabei das Hauptaugenmerk auf die datenrelevanten Aspekte. Nach Einführung des Frameworks vergleichen wir zwölf Prozessmodellierungssprachen gegen dieses. Wir generalisieren die Erkenntnisse aus den Vergleichen und identifizieren Cluster bezüglich des Grades der Datenmodellierung, in welche die einzelnen Sprachen eingeordnet werden.

Data perspective in business process management (2015)

Meyer, Andreas

Business process management (BPM) is a systematic and structured approach to model, analyze, control, and execute business operations also referred to as business processes that get carried out to achieve business goals. Central to BPM are conceptual models. Most prominently, process models describe which tasks are to be executed by whom utilizing which information to reach a business goal. Process models generally cover the perspectives of control flow, resource, data flow, and information systems. Execution of business processes leads to the work actually being carried out. Automating them increases the efficiency and is usually supported by process engines. This, though, requires the coverage of control flow, resource assignments, and process data. While the first two perspectives are well supported in current process engines, data handling needs to be implemented and maintained manually. However, model-driven data handling promises to ease implementation, reduces the error-proneness through graphical visualization, and reduces development efforts through code generation. This thesis addresses the modeling, analysis, and execution of data in business processes and presents a novel approach to execute data-annotated process models entirely model-driven. As a first step and formal grounding for the process execution, a conceptual framework for the integration of processes and data is introduced. This framework is complemented by operational semantics through a Petri net mapping extended with data considerations. Model-driven data execution comprises the handling of complex data dependencies, process data, and data exchange in case of communication between multiple process participants. This thesis introduces concepts from the database domain into BPM to enable the distinction of data operations, to specify relations between data objects of the same as well as of different types, to correlate modeled data nodes as well as received messages to the correct run-time process instances, and to generate messages for inter-process communication. The underlying approach, which is not limited to a particular process description language, has been implemented as proof-of-concept. Automation of data handling in business processes requires data-annotated and correct process models. Targeting the former, algorithms are introduced to extract information about data nodes, their states, and data dependencies from control information and to annotate the process model accordingly. Usually, not all required information can be extracted from control flow information, since some data manipulations are not specified. This requires further refinement of the process model. Given a set of object life cycles specifying allowed data manipulations, automated refinement of the process model towards containment of all data manipulations is enabled. Process models are an abstraction focusing on specific aspects in detail, e.g., the control flow and the data flow views are often represented through activity-centric and object-centric process models. This thesis introduces algorithms for roundtrip transformations enabling the stakeholder to add information to the process model in the view being most appropriate. Targeting process model correctness, this thesis introduces the notion of weak conformance that checks for consistency between given object life cycles and the process model such that the process model may only utilize data manipulations specified directly or indirectly in an object life cycle. The notion is computed via soundness checking of a hybrid representation integrating control flow and data flow correctness checking. Making a process model executable, identified violations must be corrected. Therefore, an approach is proposed that identifies for each violation multiple, alternative changes to the process model or the object life cycles. Utilizing the results of this thesis, business processes can be executed entirely model-driven from the data perspective in addition to the control flow and resource perspectives already supported before. Thereby, the model creation is supported by algorithms partly automating the creation process while model consistency is ensured by data correctness checks.

Dependency discovery for data integration (2013)

Bauckmann, Jana

Data integration aims to combine data of different sources and to provide users with a unified view on these data. This task is as challenging as valuable. In this thesis we propose algorithms for dependency discovery to provide necessary information for data integration. We focus on inclusion dependencies (INDs) in general and a special form named conditional inclusion dependencies (CINDs): (i) INDs enable the discovery of structure in a given schema. (ii) INDs and CINDs support the discovery of cross-references or links between schemas. An IND “A in B” simply states that all values of attribute A are included in the set of values of attribute B. We propose an algorithm that discovers all inclusion dependencies in a relational data source. The challenge of this task is the complexity of testing all attribute pairs and further of comparing all of each attribute pair's values. The complexity of existing approaches depends on the number of attribute pairs, while ours depends only on the number of attributes. Thus, our algorithm enables to profile entirely unknown data sources with large schemas by discovering all INDs. Further, we provide an approach to extract foreign keys from the identified INDs. We extend our IND discovery algorithm to also find three special types of INDs: (i) Composite INDs, such as “AB in CD”, (ii) approximate INDs that allow a certain amount of values of A to be not included in B, and (iii) prefix and suffix INDs that represent special cross-references between schemas. Conditional inclusion dependencies are inclusion dependencies with a limited scope defined by conditions over several attributes. Only the matching part of the instance must adhere the dependency. We generalize the definition of CINDs distinguishing covering and completeness conditions and define quality measures for conditions. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. The challenge for this task is twofold: (i) Which (and how many) attributes should be used for the conditions? (ii) Which attribute values should be chosen for the conditions? Previous approaches rely on pre-selected condition attributes or can only discover conditions applying to quality thresholds of 100%. Our approaches were motivated by two application domains: data integration in the life sciences and link discovery for linked open data. We show the efficiency and the benefits of our approaches for use cases in these domains.

Design and implementation of non-photorealistic rendering techniques for 3D geospatial data (2016)

Semmo, Amir

Geospatial data has become a natural part of a growing number of information systems and services in the economy, society, and people's personal lives. In particular, virtual 3D city and landscape models constitute valuable information sources within a wide variety of applications such as urban planning, navigation, tourist information, and disaster management. Today, these models are often visualized in detail to provide realistic imagery. However, a photorealistic rendering does not automatically lead to high image quality, with respect to an effective information transfer, which requires important or prioritized information to be interactively highlighted in a context-dependent manner. Approaches in non-photorealistic renderings particularly consider a user's task and camera perspective when attempting optimal expression, recognition, and communication of important or prioritized information. However, the design and implementation of non-photorealistic rendering techniques for 3D geospatial data pose a number of challenges, especially when inherently complex geometry, appearance, and thematic data must be processed interactively. Hence, a promising technical foundation is established by the programmable and parallel computing architecture of graphics processing units. This thesis proposes non-photorealistic rendering techniques that enable both the computation and selection of the abstraction level of 3D geospatial model contents according to user interaction and dynamically changing thematic information. To achieve this goal, the techniques integrate with hardware-accelerated rendering pipelines using shader technologies of graphics processing units for real-time image synthesis. The techniques employ principles of artistic rendering, cartographic generalization, and 3D semiotics—unlike photorealistic rendering—to synthesize illustrative renditions of geospatial feature type entities such as water surfaces, buildings, and infrastructure networks. In addition, this thesis contributes a generic system that enables to integrate different graphic styles—photorealistic and non-photorealistic—and provide their seamless transition according to user tasks, camera view, and image resolution. Evaluations of the proposed techniques have demonstrated their significance to the field of geospatial information visualization including topics such as spatial perception, cognition, and mapping. In addition, the applications in illustrative and focus+context visualization have reflected their potential impact on optimizing the information transfer regarding factors such as cognitive load, integration of non-realistic information, visualization of uncertainty, and visualization on small displays.

Design-Thinking-Diskurse : Bestimmung, Themen, Entwicklungen (2013)

Lindberg, Tilmann Sören

Der Untersuchungsgegenstand der vorliegenden Arbeit ist, die mit dem Begriff „Design Thinking“ verbundenen Diskurse zu bestimmen und deren Themen, Konzepte und Bezüge herauszuarbeiten. Diese Zielstellung ergibt sich aus den mehrfachen Widersprüchen und Vieldeutigkeiten, die die gegenwärtigen Verwendungen des Design-Thinking-Begriffs charakterisieren und den kohärenten Gebrauch in Wissenschaft und Wirtschaft erschweren. Diese Arbeit soll einen Beitrag dazu leisten, „Design Thinking“ in den unterschiedlichen Diskurszusammenhängen grundlegend zu verstehen und für zukünftige Verwendungen des Design-Thinking-Begriffs eine solide Argumentationsbasis zu schaffen.

Development of AUTOSAR standard documents at Carmeq GmbH (2015)

Hebig, Regina ; Giese, Holger ; Batoulis, Kimon ; Langer, Philipp ; Zamani Farahani, Armin ; Yao, Gary ; Wolowyk, Mychajlo

This report documents the captured MDE history of Carmeq GmbH, in context of the project Evolution of MDE Settings in Practice. The goal of the project is the elicitation of MDE approaches and their evolution.

Die Cloud für Schulen in Deutschland (2017)

Meinel, Christoph ; Renz, Jan ; Grella, Catrina ; Karn, Nils ; Hagedorn, Christiane

Die digitale Entwicklung durchdringt unser Bildungssystem, doch Schulen sind auf die Veränderungen kaum vorbereitet: Überforderte Lehrer/innen, infrastrukturell schwach ausgestattete Unterrichtsräume und unzureichend gewartete Computernetzwerke sind keine Seltenheit. Veraltete Hard- und Software erschweren digitale Bildung in Schulen eher, als dass sie diese ermöglichen: Ein zukunftssicherer Ansatz ist es, die Rechner weitgehend aus den Schulen zu entfernen und Bildungsinhalte in eine Cloud zu überführen. Zeitgemäßer Unterricht benötigt moderne Technologie und eine zukunftsorientierte Infrastruktur. Eine Schul-Cloud (https://hpi.de/schul-cloud) kann dabei helfen, die digitale Transformation in Schulen zu meistern und den fächerübergreifenden Unterricht mit digitalen Inhalten zu bereichern. Den Schüler/innen und Lehrkräften kann sie viele Möglichkeiten eröffnen: einen einfachen Zugang zu neuesten, professionell gewarteten Anwendungen, die Vernetzung verschiedener Lernorte, Erleichterung von Unterrichtsvorbereitung und Differenzierung. Die Schul-Cloud bietet Flexibilität, fördert die schul- und fächerübergreifende Anwendbarkeit und schafft eine wichtige Voraussetzung für die gesellschaftliche Teilhabe und Mitgestaltung der digitalen Welt. Neben den technischen Komponenten werden im vorliegenden Bericht ausgewählte Dienste der Schul-Cloud exemplarisch beschrieben und weiterführende Schritte aufgezeigt. Das in Zusammenarbeit mit zahlreichen Expertinnen und Experten am Hasso-Plattner-Institut (HPI) entwickelte und durch das Bundesministerium für Bildung und Forschung (BMBF) geförderte Konzept einer Schul-Cloud stellt eine wichtige Grundlage für die Einführung Cloud-basierter Strukturen und -Dienste im Bildungsbereich dar. Gemeinsam mit dem nationalen Excellence-Schulnetzwerk MINT-EC als Kooperationspartner startet ab sofort die Pilotphase. Aufgrund des modularen, skalierbaren Ansatzes der Schul-Cloud kommt dem infrastrukturellen Prototypen langfristig das Potential zu, auch über die begrenzte Anzahl an Pilotschulen hinaus bundesweit effizient eingesetzt zu werden.

Dritter Deutscher IPv6 Gipfel 2010 (2010)

Am 24. und 25. Juni 2010 fand am Hasso-Plattner-Institut für Softwaresystemtechnik GmbH in Potsdam der 3. Deutsche IPv6 Gipfel 2010 statt, dessen Dokumentation der vorliegende technische Report dient. Als nationaler Arm des weltweiten IPv6-Forums fördert der Deutsche IPv6-Rat den Übergangsprozess zur neuen Internetgeneration und brachte in diesem Rahmen nationale und internationale Experten aus Wirtschaft, Wissenschaft und öffentlicher Verwaltung zusammen, um Awareness für das Zukunftsthema IPv6 zu schaffen und um ein Resumé über die bislang erzielten Fortschritte zu ziehen. Die Grenzen des alten Internetprotokolls IPv4 sind in den vergangenen zwei Jahren deutlicher denn je zutage getreten. Waren im vergangenen Jahr anlässlich des 2. IPv6 Gipfels noch 11% aller zu vergebenden IPv4 Adressen verfügbar, ist diese Zahl mittlerweile auf nur noch 6% geschrumpft. Ehrengast war in diesem Jahr der „europäische Vater“ des Internets, Prof. Peter T. Kirstein vom University College London, dessen Hauptvortrag von weiteren Beiträgen hochrangiger Vertretern aus Politik, Wissenschaft und Wirtschaft ergänzt wurde.

ecoControl (2015)

Herbst, Eva‐Maria ; Maschler, Fabian ; Niephaus, Fabio ; Reimann, Max ; Steier, Julia ; Felgentreff, Tim ; Lincke, Jens ; Taeumel, Marcel ; Hirschfeld, Robert ; Witt, Carsten

Eine dezentrale Energieversorgung ist ein erster Schritt in Richtung Energiewende. Dabei werden auch in Mehrfamilienhäusern vermehrt verschiedene Strom- und Wärmeerzeuger eingesetzt. Besonders in Deutschland kommen in diesem Zusammenhang Blockheizkraftwerke immer häufiger zum Einsatz, weil sie Gas sehr effizient in Strom und Wärme umwandeln können. Außerdem ermöglichen sie, im Zusammenspiel mit anderen Energiesystemen wie beispielsweise Photovoltaik-Anlagen, eine kontinuierliche und dezentrale Energieversorgung. Bei dem Betrieb von unterschiedlichen Energiesystemen ist es wünschenswert, dass die Systeme aufeinander abgestimmt arbeiten. Allerdings ist es bisher schwierig, heterogene Energiesysteme effizient miteinander zu betreiben. Dadurch bleiben Einsparungspotentiale ungenutzt. Eine zentrale Steuerung kann deshalb die Effizienz des Gesamtsystems verbessern. Mit ecoControl stellen wir einen erweiterbaren Prototypen vor, der die Kooperation von Energiesystemen optimiert und Umweltfaktoren miteinbezieht. Dazu stellt die Software eine einheitliche Bedienungsoberfläche zur Konfiguration aller Systeme zur Verfügung. Außerdem bietet sie die Möglichkeit, Optimierungsalgorithmen mit Hilfe einer Programmierschnittstelle zu entwickeln, zu testen und auszuführen. Innerhalb solcher Algorithmen können von ecoControl bereitgestellte Vorhersagen genutzt werden. Diese Vorhersagen basieren auf dem individuellen Verhalten von jedem Energiesystem, Wettervorhersagen und auf Prognosen des Energieverbrauchs. Mithilfe einer Simulation können Techniker unterschiedliche Konfigurationen und Optimierungen sofort ausprobieren, ohne diese über einen langen Zeitraum an realen Geräten testen zu müssen. ecoControl hilft darüber hinaus auch Hausverwaltungen und Vermietern bei der Verwaltung und Analyse der Energiekosten. Wir haben anhand von Fallbeispielen gezeigt, dass Optimierungsalgorithmen, welche die Nutzung von Wärmespeichern verbessern, die Effizienz des Gesamtsystems erheblich verbessern können. Schließlich kommen wir zu dem Schluss, dass ecoControl in einem nächsten Schritt unter echten Bedingungen getestet werden muss, sobald eine geeignete Hardwarekomponente verfügbar ist. Über diese Schnittstelle werden die Messwerte an ecoControl gesendet und Steuersignale an die Geräte weitergeleitet.

Effective and efficient similarity search in databases (2013)

Lange, Dustin

Given a large set of records in a database and a query record, similarity search aims to find all records sufficiently similar to the query record. To solve this problem, two main aspects need to be considered: First, to perform effective search, the set of relevant records is defined using a similarity measure. Second, an efficient access method is to be found that performs only few database accesses and comparisons using the similarity measure. This thesis solves both aspects with an emphasis on the latter. In the first part of this thesis, a frequency-aware similarity measure is introduced. Compared record pairs are partitioned according to frequencies of attribute values. For each partition, a different similarity measure is created: machine learning techniques combine a set of base similarity measures into an overall similarity measure. After that, a similarity index for string attributes is proposed, the State Set Index (SSI), which is based on a trie (prefix tree) that is interpreted as a nondeterministic finite automaton. For processing range queries, the notion of query plans is introduced in this thesis to describe which similarity indexes to access and which thresholds to apply. The query result should be as complete as possible under some cost threshold. Two query planning variants are introduced: (1) Static planning selects a plan at compile time that is used for all queries. (2) Query-specific planning selects a different plan for each query. For answering top-k queries, the Bulk Sorted Access Algorithm (BSA) is introduced, which retrieves large chunks of records from the similarity indexes using fixed thresholds, and which focuses its efforts on records that are ranked high in more than one attribute and thus promising candidates. The described components form a complete similarity search system. Based on prototypical implementations, this thesis shows comparative evaluation results for all proposed approaches on different real-world data sets, one of which is a large person data set from a German credit rating agency.

Efficient and exact computation of inclusion dependencies for data integration (2010)

Bauckmann, Jana ; Leser, Ulf ; Naumann, Felix

Data obtained from foreign data sources often come with only superficial structural information, such as relation names and attribute names. Other types of metadata that are important for effective integration and meaningful querying of such data sets are missing. In particular, relationships among attributes, such as foreign keys, are crucial metadata for understanding the structure of an unknown database. The discovery of such relationships is difficult, because in principle for each pair of attributes in the database each pair of data values must be compared. A precondition for a foreign key is an inclusion dependency (IND) between the key and the foreign key attributes. We present with Spider an algorithm that efficiently finds all INDs in a given relational database. It leverages the sorting facilities of DBMS but performs the actual comparisons outside of the database to save computation. Spider analyzes very large databases up to an order of magnitude faster than previous approaches. We also evaluate in detail the effectiveness of several heuristics to reduce the number of necessary comparisons. Furthermore, we generalize Spider to find composite INDs covering multiple attributes, and partial INDs, which are true INDs for all but a certain number of values. This last type is particularly relevant when integrating dirty data as is often the case in the life sciences domain - our driving motivation.

Efficient and scalable graph view maintenance for deductive graph databases based on generalized discrimination networks (2015)

Beyhl, Thomas ; Giese, Holger

Graph databases provide a natural way of storing and querying graph data. In contrast to relational databases, queries over graph databases enable to refer directly to the graph structure of such graph data. For example, graph pattern matching can be employed to formulate queries over graph data. However, as for relational databases running complex queries can be very time-consuming and ruin the interactivity with the database. One possible approach to deal with this performance issue is to employ database views that consist of pre-computed answers to common and often stated queries. But to ensure that database views yield consistent query results in comparison with the data from which they are derived, these database views must be updated before queries make use of these database views. Such a maintenance of database views must be performed efficiently, otherwise the effort to create and maintain views may not pay off in comparison to processing the queries directly on the data from which the database views are derived. At the time of writing, graph databases do not support database views and are limited to graph indexes that index nodes and edges of the graph data for fast query evaluation, but do not enable to maintain pre-computed answers of complex queries over graph data. Moreover, the maintenance of database views in graph databases becomes even more challenging when negation and recursion have to be supported as in deductive relational databases. In this technical report, we present an approach for the efficient and scalable incremental graph view maintenance for deductive graph databases. The main concept of our approach is a generalized discrimination network that enables to model nested graph conditions including negative application conditions and recursion, which specify the content of graph views derived from graph data stored by graph databases. The discrimination network enables to automatically derive generic maintenance rules using graph transformations for maintaining graph views in case the graph data from which the graph views are derived change. We evaluate our approach in terms of a case study using multiple data sets derived from open source projects.

Efficient model synchronization of large-scale models (2009)

Giese, Holger ; Hildebrandt, Stephan

Model-driven software development requires techniques to consistently propagate modiﬁcations between different related models to realize its full potential. For large-scale models, efﬁciency is essential in this respect. In this paper, we present an improved model synchronization algorithm based on triple graph grammars that is highly efﬁcient and, therefore, can also synchronize large-scale models sufﬁciently fast. We can show, that the overall algorithm has optimal complexity if it is dominating the rule matching and further present extensive measurements that show the efﬁciency of the presented model transformation and synchronization technique.

Einführung von IPv6 in Unternehmensnetzen : ein Leitfaden (2011)

Boeddinghaus, Wilhelm ; Meinel, Christoph ; Sack, Harald

Embedded operating system projects (2014)

Grapentin, Andreas ; Heidler, Kirstin ; Korsch, Dimitri ; Kumar Sah, Rakesh ; Kunzmann, Nicco ; Henning, Johannes ; Mattis, Toni ; Rein, Patrick ; Seckler, Eric ; Groneberg, Björn ; Zimmermann, Florian

In today’s life, embedded systems are ubiquitous. But they differ from traditional desktop systems in many aspects – these include predictable timing behavior (real-time), the management of scarce resources (memory, network), reliable communication protocols, energy management, special purpose user-interfaces (headless operation), system configuration, programming languages (to support software/hardware co-design), and modeling techniques. Within this technical report, authors present results from the lecture “Operating Systems for Embedded Computing” that has been offered by the “Operating Systems and Middleware” group at HPI in Winter term 2013/14. Focus of the lecture and accompanying projects was on principles of real-time computing. Students had the chance to gather practical experience with a number of different OSes and applications and present experiences with near-hardware programming. Projects address the entire spectrum, from bare-metal programming to harnessing a real-time OS to exercising the full software/hardware co-design cycle. Three outstanding projects are at the heart of this technical report. Project 1 focuses on the development of a bare-metal operating system for LEGO Mindstorms EV3. While still a toy, it comes with a powerful ARM processor, 64 MB of main memory, standard interfaces, such as Bluetooth and network protocol stacks. EV3 runs a version of 1 1 Introduction Linux. Sources are available from Lego’s web site. However, many devices and their driver software are proprietary and not well documented. Developing a new, bare-metal OS for the EV3 requires an understanding of the EV3 boot process. Since no standard input/output devices are available, initial debugging steps are tedious. After managing these initial steps, the project was able to adapt device drivers for a few Lego devices to an extent that a demonstrator (the Segway application) could be successfully run on the new OS. Project 2 looks at the EV3 from a different angle. The EV3 is running a pretty decent version of Linux- in principle, the RT_PREEMPT patch can turn any Linux system into a real-time OS by modifying the behavior of a number of synchronization constructs at the heart of the OS. Priority inversion is a problem that is solved by protocols such as priority inheritance or priority ceiling. Real-time OSes implement at least one of the protocols. The central idea of the project was the comparison of non-real-time and real-time variants of Linux on the EV3 hardware. A task set that showed effects of priority inversion on standard EV3 Linux would operate flawlessly on the Linux version with the RT_PREEMPT-patch applied. If only patching Lego’s version of Linux was that easy... Project 3 takes the notion of real-time computing more seriously. The application scenario was centered around our Carrera Digital 132 racetrack. Obtaining position information from the track, controlling individual cars, detecting and modifying the Carrera Digital protocol required design and implementation of custom controller hardware. What to implement in hardware, firmware, and what to implement in application software – this was the central question addressed by the project.

Energy-efficient and performance-aware virtual machine management for cloud data centers (2014)

Takouna, Ibrahim

Virtualized cloud data centers provide on-demand resources, enable agile resource provisioning, and host heterogeneous applications with different resource requirements. These data centers consume enormous amounts of energy, increasing operational expenses, inducing high thermal inside data centers, and raising carbon dioxide emissions. The increase in energy consumption can result from ineffective resource management that causes inefficient resource utilization. This dissertation presents detailed models and novel techniques and algorithms for virtual resource management in cloud data centers. The proposed techniques take into account Service Level Agreements (SLAs) and workload heterogeneity in terms of memory access demand and communication patterns of web applications and High Performance Computing (HPC) applications. To evaluate our proposed techniques, we use simulation and real workload traces of web applications and HPC applications and compare our techniques against the other recently proposed techniques using several performance metrics. The major contributions of this dissertation are the following: proactive resource provisioning technique based on robust optimization to increase the hosts' availability for hosting new VMs while minimizing the idle energy consumption. Additionally, this technique mitigates undesirable changes in the power state of the hosts by which the hosts' reliability can be enhanced in avoiding failure during a power state change. The proposed technique exploits the range-based prediction algorithm for implementing robust optimization, taking into consideration the uncertainty of demand. An adaptive range-based prediction for predicting workload with high fluctuations in the short-term. The range prediction is implemented in two ways: standard deviation and median absolute deviation. The range is changed based on an adaptive confidence window to cope with the workload fluctuations. A robust VM consolidation for efficient energy and performance management to achieve equilibrium between energy and performance trade-offs. Our technique reduces the number of VM migrations compared to recently proposed techniques. This also contributes to a reduction in energy consumption by the network infrastructure. Additionally, our technique reduces SLA violations and the number of power state changes. A generic model for the network of a data center to simulate the communication delay and its impact on VM performance, as well as network energy consumption. In addition, a generic model for a memory-bus of a server, including latency and energy consumption models for different memory frequencies. This allows simulating the memory delay and its influence on VM performance, as well as memory energy consumption. Communication-aware and energy-efficient consolidation for parallel applications to enable the dynamic discovery of communication patterns and reschedule VMs using migration based on the determined communication patterns. A novel dynamic pattern discovery technique is implemented, based on signal processing of network utilization of VMs instead of using the information from the hosts' virtual switches or initiation from VMs. The result shows that our proposed approach reduces the network's average utilization, achieves energy savings due to reducing the number of active switches, and provides better VM performance compared to CPU-based placement. Memory-aware VM consolidation for independent VMs, which exploits the diversity of VMs' memory access to balance memory-bus utilization of hosts. The proposed technique, Memory-bus Load Balancing (MLB), reactively redistributes VMs according to their utilization of a memory-bus using VM migration to improve the performance of the overall system. Furthermore, Dynamic Voltage and Frequency Scaling (DVFS) of the memory and the proposed MLB technique are combined to achieve better energy savings.

Enriching raw events to enable process intelligence : research challenges (2013)

Herzberg, Nico ; Weske, Mathias

Die wertschöpfenden Tätigkeiten in Unternehmen folgen definierten Geschäftsprozessen und werden entsprechend ausgeführt. Dabei werden wertvolle Daten über die Prozessausführung erzeugt. Die Menge und Qualität dieser Daten ist sehr stark von der Prozessausführungsumgebung abhängig, welche überwiegend manuell als auch vollautomatisiert sein kann. Die stetige Verbesserung von Prozessen ist einer der Hauptpfeiler des Business Process Managements, mit der Aufgabe die Wettbewerbsfähigkeit von Unternehmen zu sichern und zu steigern. Um Prozesse zu verbessern muss man diese analysieren und ist auf Daten der Prozessausführung angewiesen. Speziell bei manueller Prozessausführung sind die Daten nur selten direkt zur konkreten Prozessausführung verknüpft. In dieser Arbeit präsentieren wir einen Ansatz zur Verwendung und Anreicherung von Prozessausführungsdaten mit Kontextdaten – Daten die unabhängig zu den Prozessdaten existieren – und Wissen aus den dazugehörigen Prozessmodellen, um ein hochwertige Event- Datenbasis für Process Intelligence Anwendungen, wie zum Beispiel Prozessmonitoring, Prozessanalyse und Process Mining, sicherstellen zu können. Des Weiteren zeigen wir offene Fragestellungen und Herausforderungen auf, welche in Zukunft Gegenstand unserer Forschung sein werden.

Enriching the Web of Data with topics and links (2013)

Böhm, Christoph

This thesis presents novel ideas and research findings for the Web of Data – a global data space spanning many so-called Linked Open Data sources. Linked Open Data adheres to a set of simple principles to allow easy access and reuse for data published on the Web. Linked Open Data is by now an established concept and many (mostly academic) publishers adopted the principles building a powerful web of structured knowledge available to everybody. However, so far, Linked Open Data does not yet play a significant role among common web technologies that currently facilitate a high-standard Web experience. In this work, we thoroughly discuss the state-of-the-art for Linked Open Data and highlight several shortcomings – some of them we tackle in the main part of this work. First, we propose a novel type of data source meta-information, namely the topics of a dataset. This information could be published with dataset descriptions and support a variety of use cases, such as data source exploration and selection. For the topic retrieval, we present an approach coined Annotated Pattern Percolation (APP), which we evaluate with respect to topics extracted from Wikipedia portals. Second, we contribute to entity linking research by presenting an optimization model for joint entity linking, showing its hardness, and proposing three heuristics implemented in the LINked Data Alignment (LINDA) system. Our first solution can exploit multi-core machines, whereas the second and third approach are designed to run in a distributed shared-nothing environment. We discuss and evaluate the properties of our approaches leading to recommendations which algorithm to use in a specific scenario. The distributed algorithms are among the first of their kind, i.e., approaches for joint entity linking in a distributed fashion. Also, we illustrate that we can tackle the entity linking problem on the very large scale with data comprising more than 100 millions of entity representations from very many sources. Finally, we approach a sub-problem of entity linking, namely the alignment of concepts. We again target a method that looks at the data in its entirety and does not neglect existing relations. Also, this concept alignment method shall execute very fast to serve as a preprocessing for further computations. Our approach, called Holistic Concept Matching (HCM), achieves the required speed through grouping the input by comparing so-called knowledge representations. Within the groups, we perform complex similarity computations, relation conclusions, and detect semantic contradictions. The quality of our result is again evaluated on a large and heterogeneous dataset from the real Web. In summary, this work contributes a set of techniques for enhancing the current state of the Web of Data. All approaches have been tested on large and heterogeneous real-world input.

Erster Deutscher IPv6 Gipfel (2008)

Meinel, Christoph ; Sack, Harald ; Bross, Justus

Inhalt: KOMMUNIQUÉ GRUßWORT PROGRAMM HINTERGRÜNDE UND FAKTEN REFERENTEN: BIOGRAFIE & VOTRAGSZUSAMMENFASSUNG 1.) DER ERSTE DEUTSCHE IPV6 GIPFEL AM HASSO PLATTNER INSTITUT IN POTSDAM - PROF. DR. CHRISTOPH MEINEL - VIVIANE REDING 2.) IPV6, ITS TIME HAS COME - VINTON CERF 3.) DIE BEDEUTUNG VON IPV6 FÜR DIE ÖFFENTLICHE VERWALTUNG IN DEUTSCHLAND - MARTIN SCHALLBRUCH 4.) TOWARDS THE FUTURE OF THE INTERNET - PROF. DR. LUTZ HEUSER 5.) IPV6 STRATEGY & DEPLOYMENT STATUS IN JAPAN - HIROSHI MIYATA 6.) IPV6 STRATEGY & DEPLOYMENT STATUS IN CHINA - PROF. WU HEQUAN 7.) IPV6 STRATEGY AND DEPLOYMENT STATUS IN KOREA - DR. EUNSOOK KIM 8.) IPV6 DEPLOYMENT EXPERIENCES IN GREEK SCHOOL NETWORK - ATHANASSIOS LIAKOPOULOS 9.) IPV6 NETWORK MOBILITY AND IST USAGE - JEAN-MARIE BONNIN 10.) IPV6 - RÜSTZEUG FÜR OPERATOR & ISP IPV6 DEPLOYMENT UND STRATEGIEN DER DEUTSCHEN TELEKOM - HENNING GROTE 11.) VIEW FROM THE IPV6 DEPLOYMENT FRONTLINE - YVES POPPE 12.) DEPLOYING IPV6 IN MOBILE ENVIRONMENTS - WOLFGANG FRITSCHE 13.) PRODUCTION READY IPV6 FROM CUSTOMER LAN TO THE INTERNET - LUTZ DONNERHACKE 14.) IPV6 - DIE BASIS FÜR NETZWERKZENTRIERTE OPERATIONSFÜHRUNG (NETOPFÜ) IN DER BUNDESWEHR HERAUSFORDERUNGEN - ANWENDUNGSFALLBETRACHTUNGEN - AKTIVITÄTEN - CARSTEN HATZIG 15.) WINDOWS VISTA & IPV6 - BERND OURGHANLIAN 16.) IPV6 & HOME NETWORKING TECHINCAL AND BUSINESS CHALLENGES - DR. TAYEB BEN MERIEM 17.) DNS AND DHCP FOR DUAL STACK NETWORKS - LAWRENCE HUGHES 18.) CAR INDUSTRY: GERMAN EXPERIENCE WITH IPV6 - AMARDEO SARMA 19.) IPV6 & AUTONOMIC NETWORKING - RANGANAI CHAPARADZA 20.) P2P & GRID USING IPV6 AND MOBILE IPV6 - DR. LATIF LADID

Evolution of model-driven engineering settings in practice (2014)

Hebig, Regina

Nowadays, software systems are getting more and more complex. To tackle this challenge most diverse techniques, such as design patterns, service oriented architectures (SOA), software development processes, and model-driven engineering (MDE), are used to improve productivity, while time to market and quality of the products stay stable. Multiple of these techniques are used in parallel to profit from their benefits. While the use of sophisticated software development processes is standard, today, MDE is just adopted in practice. However, research has shown that the application of MDE is not always successful. It is not fully understood when advantages of MDE can be used and to what degree MDE can also be disadvantageous for productivity. Further, when combining different techniques that aim to affect the same factor (e.g. productivity) the question arises whether these techniques really complement each other or, in contrast, compensate their effects. Due to that, there is the concrete question how MDE and other techniques, such as software development process, are interrelated. Both aspects (advantages and disadvantages for productivity as well as the interrelation to other techniques) need to be understood to identify risks relating to the productivity impact of MDE. Before studying MDE's impact on productivity, it is necessary to investigate the range of validity that can be reached for the results. This includes two questions. First, there is the question whether MDE's impact on productivity is similar for all approaches of adopting MDE in practice. Second, there is the question whether MDE's impact on productivity for an approach of using MDE in practice remains stable over time. The answers for both questions are crucial for handling risks of MDE, but also for the design of future studies on MDE success. This thesis addresses these questions with the goal to support adoption of MDE in future. To enable a differentiated discussion about MDE, the term MDE setting'' is introduced. MDE setting refers to the applied technical setting, i.e. the employed manual and automated activities, artifacts, languages, and tools. An MDE setting's possible impact on productivity is studied with a focus on changeability and the interrelation to software development processes. This is done by introducing a taxonomy of changeability concerns that might be affected by an MDE setting. Further, three MDE traits are identified and it is studied for which manifestations of these MDE traits software development processes are impacted. To enable the assessment and evaluation of an MDE setting's impacts, the Software Manufacture Model language is introduced. This is a process modeling language that allows to reason about how relations between (modeling) artifacts (e.g. models or code files) change during application of manual or automated development activities. On that basis, risk analysis techniques are provided. These techniques allow identifying changeability risks and assessing the manifestations of the MDE traits (and with it an MDE setting's impact on software development processes). To address the range of validity, MDE settings from practice and their evolution histories were capture in context of this thesis. First, this data is used to show that MDE settings cover the whole spectrum concerning their impact on changeability or interrelation to software development processes. Neither it is seldom that MDE settings are neutral for processes nor is it seldom that MDE settings have impact on processes. Similarly, the impact on changeability differs relevantly. Second, a taxonomy of evolution of MDE settings is introduced. In that context it is discussed to what extent different types of changes on an MDE setting can influence this MDE setting's impact on changeability and the interrelation to processes. The category of structural evolution, which can change these characteristics of an MDE setting, is identified. The captured MDE settings from practice are used to show that structural evolution exists and is common. In addition, some examples of structural evolution steps are collected that actually led to a change in the characteristics of the respective MDE settings. Two implications are: First, the assessed diversity of MDE settings evaluates the need for the analysis techniques that shall be presented in this thesis. Second, evolution is one explanation for the diversity of MDE settings in practice. To summarize, this thesis studies the nature and evolution of MDE settings in practice. As a result support for the adoption of MDE settings is provided in form of techniques for the identification of risks relating to productivity impacts.

Explorative authoring of Active Web content in a mobile environment (2013)

Calmez, Conrad ; Hesse, Hubert ; Siegmund, Benjamin ; Stamm, Sebastian ; Thomschke, Astrid ; Hirschfeld, Robert ; Ingalls, Dan ; Lincke, Jens

Developing rich Web applications can be a complex job - especially when it comes to mobile device support. Web-based environments such as Lively Webwerkstatt can help developers implement such applications by making the development process more direct and interactive. Further the process of developing software is collaborative which creates the need that the development environment offers collaboration facilities. This report describes extensions of the webbased development environment Lively Webwerkstatt such that it can be used in a mobile environment. The extensions are collaboration mechanisms, user interface adaptations but as well event processing and performance measuring on mobile devices.

Exploratives Erstellen von interaktiven Inhalten in einer dynamischen Umgebung (2015)

Otto, Philipp ; Pollak, Jaqueline ; Werner, Daniel ; Wolff, Felix ; Steinert, Bastian ; Thamsen, Lauritz ; Taeumel, Marcel ; Lincke, Jens ; Krahn, Robert ; Ingalls, Daniel H. H. ; Hirschfeld, Robert

Bei der Erstellung von Visualisierungen gibt es im Wesentlichen zwei Ansätze. Zum einen können mit geringem Aufwand schnell Standarddiagramme erstellt werden. Zum anderen gibt es die Möglichkeit, individuelle und interaktive Visualisierungen zu programmieren. Dies ist jedoch mit einem deutlich höheren Aufwand verbunden. Flower ermöglicht eine schnelle Erstellung individueller und interaktiver Visualisierungen, indem es den Entwicklungssprozess stark vereinfacht und die Nutzer bei den einzelnen Aktivitäten wie dem Import und der Aufbereitung von Daten, deren Abbildung auf visuelle Elemente sowie der Integration von Interaktivität direkt unterstützt.

Extending a dynamic programming language and runtime environment with access control (2016)

Tessenow, Philipp ; Felgentreff, Tim ; Bracha, Gilad ; Hirschfeld, Robert

Complexity in software systems is a major factor driving development and maintenance costs. To master this complexity, software is divided into modules that can be developed and tested separately. In order to support this separation of modules, each module should provide a clean and concise public interface. Therefore, the ability to selectively hide functionality using access control is an important feature in a programming language intended for complex software systems. Software systems are increasingly distributed, adding not only to their inherent complexity, but also presenting security challenges. The object-capability approach addresses these challenges by defining language properties providing only minimal capabilities to objects. One programming language that is based on the object-capability approach is Newspeak, a dynamic programming language designed for modularity and security. The Newspeak specification describes access control as one of Newspeak’s properties, because it is a requirement for the object-capability approach. However, access control, as defined in the Newspeak specification, is currently not enforced in its implementation. This work introduces an access control implementation for Newspeak, enabling the security of object-capabilities and enhancing modularity. We describe our implementation of access control for Newspeak. We adapted the runtime environment, the reflective system, the compiler toolchain, and the virtual machine. Finally, we describe a migration strategy for the existing Newspeak code base, so that our access control implementation can be integrated with minimal effort.

Extending a Java Virtual Machine to Dynamic Object-oriented Languages (2013)

Pape, Tobias ; Treffer, Arian ; Hirschfeld, Robert ; Haupt, Michael

There are two common approaches to implement a virtual machine (VM) for a dynamic object-oriented language. On the one hand, it can be implemented in a C-like language for best performance and maximum control over the resulting executable. On the other hand, it can be implemented in a language such as Java that allows for higher-level abstractions. These abstractions, such as proper object-oriented modularization, automatic memory management, or interfaces, are missing in C-like languages but they can simplify the implementation of prevalent but complex concepts in VMs, such as garbage collectors (GCs) or just-in-time compilers (JITs). Yet, the implementation of a dynamic object-oriented language in Java eventually results in two VMs on top of each other (double stack), which impedes performance. For statically typed languages, the Maxine VM solves this problem; it is written in Java but can be executed without a Java virtual machine (JVM). However, it is currently not possible to execute dynamic object-oriented languages in Maxine. This work presents an approach to bringing object models and execution models of dynamic object-oriented languages to the Maxine VM and the application of this approach to Squeak/Smalltalk. The representation of objects in and the execution of dynamic object-oriented languages pose certain challenges to the Maxine VM that lacks certain variation points necessary to enable an effortless and straightforward implementation of dynamic object-oriented languages' execution models. The implementation of Squeak/Smalltalk in Maxine as a feasibility study is to unveil such missing variation points.

Extracting structured information from Wikipedia articles to populate infoboxes (2010)

Lange, Dustin ; Böhm, Christoph ; Naumann, Felix

Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values for independently extracting value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.

Extraktion und Identifikation von Entitäten in Textdaten im Umfeld der Enterprise Search (2010)

Brauer, Falk

Die automatische Informationsextraktion (IE) aus unstrukturierten Texten ermöglicht völlig neue Wege, auf relevante Informationen zuzugreifen und deren Inhalte zu analysieren, die weit über bisherige Verfahren zur Stichwort-basierten Dokumentsuche hinausgehen. Die Entwicklung von Programmen zur Extraktion von maschinenlesbaren Daten aus Texten erfordert jedoch nach wie vor die Entwicklung von domänenspezifischen Extraktionsprogrammen. Insbesondere im Bereich der Enterprise Search (der Informationssuche im Unternehmensumfeld), in dem eine große Menge von heterogenen Dokumenttypen existiert, ist es oft notwendig ad-hoc Programm-module zur Extraktion von geschäftsrelevanten Entitäten zu entwickeln, die mit generischen Modulen in monolithischen IE-Systemen kombiniert werden. Dieser Umstand ist insbesondere kritisch, da potentiell für jeden einzelnen Anwendungsfall ein von Grund auf neues IE-System entwickelt werden muss. Die vorliegende Dissertation untersucht die effiziente Entwicklung und Ausführung von IE-Systemen im Kontext der Enterprise Search und effektive Methoden zur Ausnutzung bekannter strukturierter Daten im Unternehmenskontext für die Extraktion und Identifikation von geschäftsrelevanten Entitäten in Doku-menten. Grundlage der Arbeit ist eine neuartige Plattform zur Komposition von IE-Systemen auf Basis der Beschreibung des Datenflusses zwischen generischen und anwendungsspezifischen IE-Modulen. Die Plattform unterstützt insbesondere die Entwicklung und Wiederverwendung von generischen IE-Modulen und zeichnet sich durch eine höhere Flexibilität und Ausdrucksmächtigkeit im Vergleich zu vorherigen Methoden aus. Ein in der Dissertation entwickeltes Verfahren zur Dokumentverarbeitung interpretiert den Daten-austausch zwischen IE-Modulen als Datenströme und ermöglicht damit eine weitgehende Parallelisierung von einzelnen Modulen. Die autonome Ausführung der Module führt zu einer wesentlichen Beschleu-nigung der Verarbeitung von Einzeldokumenten und verbesserten Antwortzeiten, z. B. für Extraktions-dienste. Bisherige Ansätze untersuchen lediglich die Steigerung des durchschnittlichen Dokumenten-durchsatzes durch verteilte Ausführung von Instanzen eines IE-Systems. Die Informationsextraktion im Kontext der Enterprise Search unterscheidet sich z. B. von der Extraktion aus dem World Wide Web dadurch, dass in der Regel strukturierte Referenzdaten z. B. in Form von Unternehmensdatenbanken oder Terminologien zur Verfügung stehen, die oft auch die Beziehungen von Entitäten beschreiben. Entitäten im Unternehmensumfeld haben weiterhin bestimmte Charakteristiken: Eine Klasse von relevanten Entitäten folgt bestimmten Bildungsvorschriften, die nicht immer bekannt sind, auf die aber mit Hilfe von bekannten Beispielentitäten geschlossen werden kann, so dass unbekannte Entitäten extrahiert werden können. Die Bezeichner der anderen Klasse von Entitäten haben eher umschreibenden Charakter. Die korrespondierenden Umschreibungen in Texten können variieren, wodurch eine Identifikation derartiger Entitäten oft erschwert wird. Zur effizienteren Entwicklung von IE-Systemen wird in der Dissertation ein Verfahren untersucht, das alleine anhand von Beispielentitäten effektive Reguläre Ausdrücke zur Extraktion von unbekannten Entitäten erlernt und damit den manuellen Aufwand in derartigen Anwendungsfällen minimiert. Verschiedene Generalisierungs- und Spezialisierungsheuristiken erkennen Muster auf verschiedenen Abstraktionsebenen und schaffen dadurch einen Ausgleich zwischen Genauigkeit und Vollständigkeit bei der Extraktion. Bekannte Regellernverfahren im Bereich der Informationsextraktion unterstützen die beschriebenen Problemstellungen nicht, sondern benötigen einen (annotierten) Dokumentenkorpus. Eine Methode zur Identifikation von Entitäten, die durch Graph-strukturierte Referenzdaten vordefiniert sind, wird als dritter Schwerpunkt untersucht. Es werden Verfahren konzipiert, welche über einen exakten Zeichenkettenvergleich zwischen Text und Referenzdatensatz hinausgehen und Teilübereinstimmungen und Beziehungen zwischen Entitäten zur Identifikation und Disambiguierung heranziehen. Das in der Arbeit vorgestellte Verfahren ist bisherigen Ansätzen hinsichtlich der Genauigkeit und Vollständigkeit bei der Identifikation überlegen.

Fundamentals of Service-Oriented Engineering (2006)

Breest, Martin ; Bouché, Paul ; Grund, Martin ; Haubrock, Sören ; Hüttenrauch, Stefan ; Kylau, Uwe ; Ploskonos, Anna ; Queck, Tobias ; Schreiter, Torben

Since 2002, keywords like service-oriented engineering, service-oriented computing, and service-oriented architecture have been widely used in research, education, and enterprises. These and related terms are often misunderstood or used incorrectly. To correct these misunderstandings, a deeper knowledge of the concepts, the historical backgrounds, and an overview of service-oriented architectures is demanded and given in this paper.

Fünfter Deutscher IPv6 Gipfel 2012 (2013)

Am 29. und 30. November 2012 fand am Hasso-Plattner-Institut für Softwaresystemtechnik GmbH in Potsdam der 5. Deutsche IPv6 Gipfel 2012 statt, als dessen Dokumentation der vorliegende technische Report dient. Wie mit den vorhergegangenen nationalen IPv6 Gipfeln verfolgte der Deutsche IPv6-Rat auch mit dem 5. Gipfel, der unter dem Motto „IPv6- der Wachstumstreiber für die Deutsche Wirtschaft” stand, das Ziel, Einblicke in aktuelle Entwicklungen rund um den Einsatz von IPv6 zu geben. Unter anderem wurden die Vorzüge des neue Internetstandards IPv6 vorgestellt und über die Anwendung von IPv6 auf dem Massenmarkt, sowie den Einsatz von IPv6 in Unternehmen und in der öffentlichen Verwaltung referiert. Weitere Themen des Gipfels bezogen sich auf Aktionen und Bedingungen in Unternehmen und Privathaushalten, die für den Umstieg auf IPv6 notwendig sind und welche Erfahrungen dabei bereits gesammelt werden konnten. Neben Vorträgen des Bundesbeauftragten für Datenschutz Peter Schaar und des Geschäftsführers der Technik Telekom Deutschland GmbH, Bruno Jacobfeuerborn, wurden weiteren Beiträge hochrangiger Vertretern aus Politik, Wissenschaft und Wirtschaft präsentiert, die in diesem technischen Bericht zusammengestellt sind.

Grid-Computing : [Seminar im Sommersemester 2003] (2005)

Polze, Andreas ; Schnor, Bettina

1. Applikationen für weitverteiltes Rechnen Dennis Klemann, Lars Schmidt-Bielicke, Philipp Seuring 2. Das Globus-Toolkit Dietmar Bremser, Alexis Krepp, Tobias Rausch 3. Open Grid Services Architecture Lars Trieloff 4. Condor, Condor-G, Classad Stefan Henze, Kai Köhne 5. The Cactus Framework Thomas Hille, Martin Karlsch 6. High Performance Scheduler mit Maui/PBS Ole Weidner, Jörg Schummer, Benedikt Meuthrath 7. Bandbreiten-Monitoring mit NWS Alexander Ritter, Gregor Höfert 8. The Paradyn Parallel Performance Measurement Tool Jens Ulferts, Christian Liesegang 9. Grid-Applikationen in der Praxis Steffen Bach, Michael Blume, Helge Issel

How Often do Experts Make Mistakes? (2010)

Palix, Nicolas ; Lawall, Julia L. ; Thomas, Gaël ; Muller, Gilles

Large open-source software projects involve developers with a wide variety of backgrounds and expertise. Such software projects furthermore include many internal APIs that developers must understand and use properly. According to the intended purpose of these APIs, they are more or less frequently used, and used by developers with more or less expertise. In this paper, we study the impact of usage patterns and developer expertise on the rate of defects occurring in the use of internal APIs. For this preliminary study, we focus on memory management APIs in the Linux kernel, as the use of these has been shown to be highly error prone in previous work. We study defect rates and developer expertise, to consider e.g., whether widely used APIs are more defect prone because they are used by less experienced developers, or whether defects in widely used APIs are more likely to be fixed.

HPI Future SOC Lab (2014)

The “HPI Future SOC Lab” is a cooperation of the Hasso-Plattner-Institut (HPI) and industrial partners. Its mission is to enable and promote exchange and interaction between the research community and the industrial partners. The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard- and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies. This technical report presents results of research projects executed in 2014. Selected projects have presented their results on April 9th and September 29th 2014 at the Future SOC Lab Day events.

HPI Future SOC Lab (2014)

The “HPI Future SOC Lab” is a cooperation of the Hasso-Plattner-Institut (HPI) and industrial partners. Its mission is to enable and promote exchange and interaction between the research community and the industrial partners. The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard- and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies. This technical report presents results of research projects executed in 2013. Selected projects have presented their results on April 10th and September 24th 2013 at the Future SOC Lab Day events.

HPI Future SOC Lab (2015)

Kurbel, Karl ; Nowak, Dawid ; Azodi, Amir ; Jaeger, David ; Meinel, Christoph ; Cheng, Feng ; Sapegin, Andrey ; Gawron, Marian ; Morelli, Frank ; Stahl, Lukas ; Kerl, Stefan ; Janz, Mariska ; Hadaya, Abdulmasih ; Ivanov, Ivaylo ; Wiese, Lena ; Neves, Mariana ; Schapranow, Matthieu-Patrick ; Fähnrich, Cindy ; Feinbube, Frank ; Eberhardt, Felix ; Hagen, Wieland ; Plauth, Max ; Herscheid, Lena ; Polze, Andreas ; Barkowsky, Matthias ; Dinger, Henriette ; Faber, Lukas ; Montenegro, Felix ; Czachórski, Tadeusz ; Nycz, Monika ; Nycz, Tomasz ; Baader, Galina ; Besner, Veronika ; Hecht, Sonja ; Schermann, Michael ; Krcmar, Helmut ; Wiradarma, Timur Pratama ; Hentschel, Christian ; Sack, Harald ; Abramowicz, Witold ; Sokolowska, Wioletta ; Hossa, Tymoteusz ; Opalka, Jakub ; Fabisz, Karol ; Kubaczyk, Mateusz ; Cmil, Milena ; Meng, Tianhui ; Dadashnia, Sharam ; Niesen, Tim ; Fettke, Peter ; Loos, Peter ; Perscheid, Cindy ; Schwarz, Christian ; Schmidt, Christopher ; Scholz, Matthias ; Bock, Nikolai ; Piller, Gunther ; Böhm, Klaus ; Norkus, Oliver ; Clark, Brian ; Friedrich, Björn ; Izadpanah, Babak ; Merkel, Florian ; Schweer, Ilias ; Zimak, Alexander ; Sauer, Jürgen ; Fabian, Benjamin ; Tilch, Georg ; Müller, David ; Plöger, Sabrina ; Friedrich, Christoph M. ; Engels, Christoph ; Amirkhanyan, Aragats ; van der Walt, Estée ; Eloff, J. H. P. ; Scheuermann, Bernd ; Weinknecht, Elisa

Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie. Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien. In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2015 vorgestellt. Ausgewählte Projekte stellten ihre Ergebnisse am 15. April 2015 und 4. November 2015 im Rahmen der Future SOC Lab Tag Veranstaltungen vor.

HPI Future SOC Lab (2013)

The “HPI Future SOC Lab” is a cooperation of the Hasso-Plattner-Institut (HPI) and industrial partners. Its mission is to enable and promote exchange and interaction between the research community and the industrial partners. The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard- and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies. This technical report presents results of research projects executed in 2012. Selected projects have presented their results on June 18th and November 26th 2012 at the Future SOC Lab Day events.

HPI Future SOC Lab (2013)

Together with industrial partners Hasso-Plattner-Institut (HPI) is currently establishing a “HPI Future SOC Lab,” which will provide a complete infrastructure for research on on-demand systems. The lab utilizes the latest, multi/many-core hardware and its practical implementation and testing as well as further development. The necessary components for such a highly ambitious project are provided by renowned companies: Fujitsu and Hewlett Packard provide their latest 4 and 8-way servers with 1-2 TB RAM, SAP will make available its latest Business byDesign (ByD) system in its most complete version. EMC² provides high performance storage systems and VMware offers virtualization solutions. The lab will operate on the basis of real data from large enterprises. The HPI Future SOC Lab, which will be open for use by interested researchers also from other universities, will provide an opportunity to study real-life complex systems and follow new ideas all the way to their practical implementation and testing. This technical report presents results of research projects executed in 2011. Selected projects have presented their results on June 15th and October 26th 2011 at the Future SOC Lab Day events.

Imaginary Interfaces (2013)

Gustafson, Sean

The size of a mobile device is primarily determined by the size of the touchscreen. As such, researchers have found that the way to achieve ultimate mobility is to abandon the screen altogether. These wearable devices are operated using hand gestures, voice commands or a small number of physical buttons. By abandoning the screen these devices also abandon the currently dominant spatial interaction style (such as tapping on buttons), because, seemingly, there is nothing to tap on. Unfortunately this design prevents users from transferring their learned interaction knowledge gained from traditional touchscreen-based devices. In this dissertation, I present Imaginary Interfaces, which return spatial interaction to screenless mobile devices. With these interfaces, users point and draw in the empty space in front of them or on the palm of their hands. While they cannot see the results of their interaction, they obtain some visual and tactile feedback by watching and feeling their hands interact. After introducing the concept of Imaginary Interfaces, I present two hardware prototypes that showcase two different forms of interaction with an imaginary interface, each with its own advantages: mid-air imaginary interfaces can be large and expressive, while palm-based imaginary interfaces offer an abundance of tactile features that encourage learning. Given that imaginary interfaces offer no visual output, one of the key challenges is to enable users to discover the interface's layout. This dissertation offers three main solutions: offline learning with coordinates, browsing with audio feedback and learning by transfer. The latter I demonstrate with the Imaginary Phone, a palm-based imaginary interface that mimics the layout of a physical mobile phone that users are already familiar with. Although these designs enable interaction with Imaginary Interfaces, they tell us little about why this interaction is possible. In the final part of this dissertation, I present an exploration into which human perceptual abilities are used when interacting with a palm-based imaginary interface and how much each accounts for performance with the interface. These findings deepen our understanding of Imaginary Interfaces and suggest that palm-based Imaginary Interfaces can enable stand-alone eyes-free use for many applications, including interfaces for visually impaired users.

Improving hosted continuous integration services (2017)

Weyand, Christopher ; Chromik, Jonas ; Wolf, Lennard ; Kötte, Steffen ; Haase, Konstantin ; Felgentreff, Tim ; Lincke, Jens ; Hirschfeld, Robert

Developing large software projects is a complicated task and can be demanding for developers. Continuous integration is common practice for reducing complexity. By integrating and testing changes often, changesets are kept small and therefore easily comprehensible. Travis CI is a service that offers continuous integration and continuous deployment in the cloud. Software projects are build, tested, and deployed using the Travis CI infrastructure without interrupting the development process. This report describes how Travis CI works, presents how time-driven, periodic building is implemented as well as how CI data visualization can be done, and proposes a way of dealing with dependency problems.

Improving RDF data with data mining (2014)

Abedjan, Ziawasch

Linked Open Data (LOD) comprises very many and often large public data sets and knowledge bases. Those datasets are mostly presented in the RDF triple structure of subject, predicate, and object, where each triple represents a statement or fact. Unfortunately, the heterogeneity of available open data requires significant integration steps before it can be used in applications. Meta information, such as ontological definitions and exact range definitions of predicates, are desirable and ideally provided by an ontology. However in the context of LOD, ontologies are often incomplete or simply not available. Thus, it is useful to automatically generate meta information, such as ontological dependencies, range definitions, and topical classifications. Association rule mining, which was originally applied for sales analysis on transactional databases, is a promising and novel technique to explore such data. We designed an adaptation of this technique for min-ing Rdf data and introduce the concept of “mining configurations”, which allows us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that in combination result in interesting use cases. To this end, we present rule-based approaches for auto-completion, data enrichment, ontology improvement, and query relaxation. Auto-completion remedies the problem of inconsistent ontology usage, providing an editing user with a sorted list of commonly used predicates. A combination of different configurations step extends this approach to create completely new facts for a knowledge base. We present two approaches for fact generation, a user-based approach where a user selects the entity to be amended with new facts and a data-driven approach where an algorithm discovers entities that have to be amended with missing facts. As knowledge bases constantly grow and evolve, another approach to improve the usage of RDF data is to improve existing ontologies. Here, we present an association rule based approach to reconcile ontology and data. Interlacing different mining configurations, we infer an algorithm to discover synonymously used predicates. Those predicates can be used to expand query results and to support users during query formulation. We provide a wide range of experiments on real world datasets for each use case. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the appropriateness of our mining configuration methodology.

Inductive invariant checking with partial negative application conditions (2015)

Dyck, Johannes ; Giese, Holger

Graph transformation systems are a powerful formal model to capture model transformations or systems with infinite state space, among others. However, this expressive power comes at the cost of rather limited automated analysis capabilities. The general case of unbounded many initial graphs or infinite state spaces is only supported by approaches with rather limited scalability or expressiveness. In this report we improve an existing approach for the automated verification of inductive invariants for graph transformation systems. By employing partial negative application conditions to represent and check many alternative conditions in a more compact manner, we can check examples with rules and constraints of substantially higher complexity. We also substantially extend the expressive power by supporting more complex negative application conditions and provide higher accuracy by employing advanced implication checks. The improvements are evaluated and compared with another applicable tool by considering three case studies.

Industrial case study on the integration of SysML and AUTOSAR with triple graph grammars (2012)

Giese, Holger ; Hildebrandt, Stephan ; Neumann, Stefan ; Wätzoldt, Sebastian

During the overall development of complex engineering systems different modeling notations are employed. For example, in the domain of automotive systems system engineering models are employed quite early to capture the requirements and basic structuring of the entire system, while software engineering models are used later on to describe the concrete software architecture. Each model helps in addressing the specific design issue with appropriate notations and at a suitable level of abstraction. However, when we step forward from system design to the software design, the engineers have to ensure that all decisions captured in the system design model are correctly transferred to the software engineering model. Even worse, when changes occur later on in either model, today the consistency has to be reestablished in a cumbersome manual step. In this report, we present in an extended version of [Holger Giese, Stefan Neumann, and Stephan Hildebrandt. Model Synchronization at Work: Keeping SysML and AUTOSAR Models Consistent. In Gregor Engels, Claus Lewerentz, Wilhelm Schäfer, Andy Schürr, and B. Westfechtel, editors, Graph Transformations and Model Driven Enginering - Essays Dedicated to Manfred Nagl on the Occasion of his 65th Birthday, volume 5765 of Lecture Notes in Computer Science, pages 555–579. Springer Berlin / Heidelberg, 2010.] how model synchronization and consistency rules can be applied to automate this task and ensure that the different models are kept consistent. We also introduce a general approach for model synchronization. Besides synchronization, the approach consists of tool adapters as well as consistency rules covering the overlap between the synchronized parts of a model and the rest. We present the model synchronization algorithm based on triple graph grammars in detail and further exemplify the general approach by means of a model synchronization solution between system engineering models in SysML and software engineering models in AUTOSAR which has been developed for an industrial partner. In the appendix as extension to [19] the meta-models and all TGG rules for the SysML to AUTOSAR model synchronization are documented.

Interacting with personal fabrication devices (2016)

Mueller, Stefanie

Personal fabrication tools, such as 3D printers, are on the way of enabling a future in which non-technical users will be able to create custom objects. However, while the hardware is there, the current interaction model behind existing design tools is not suitable for non-technical users. Today, 3D printers are operated by fabricating the object in one go, which tends to take overnight due to the slow 3D printing technology. Consequently, the current interaction model requires users to think carefully before printing as every mistake may imply another overnight print. Planning every step ahead, however, is not feasible for non-technical users as they lack the experience to reason about the consequences of their design decisions. In this dissertation, we propose changing the interaction model around personal fabrication tools to better serve this user group. We draw inspiration from personal computing and argue that the evolution of personal fabrication may resemble the evolution of personal computing: Computing started with machines that executed a program in one go before returning the result to the user. By decreasing the interaction unit to single requests, turn-taking systems such as the command line evolved, which provided users with feedback after every input. Finally, with the introduction of direct-manipulation interfaces, users continuously interacted with a program receiving feedback about every action in real-time. In this dissertation, we explore whether these interaction concepts can be applied to personal fabrication as well. We start with fabricating an object in one go and investigate how to tighten the feedback-cycle on an object-level: We contribute a method called low-fidelity fabrication, which saves up to 90% fabrication time by creating objects as fast low-fidelity previews, which are sufficient to evaluate key design aspects. Depending on what is currently being tested, we propose different conversions that enable users to focus on different parts: faBrickator allows for a modular design in the early stages of prototyping; when users move on WirePrint allows quickly testing an object's shape, while Platener allows testing an object's technical function. We present an interactive editor for each technique and explain the underlying conversion algorithms. By interacting on smaller units, such as a single element of an object, we explore what it means to transition from systems that fabricate objects in one go to turn-taking systems. We start with a 2D system called constructable: Users draw with a laser pointer onto the workpiece inside a laser cutter. The drawing is captured with an overhead camera. As soon as the the user finishes drawing an element, such as a line, the constructable system beautifies the path and cuts it--resulting in physical output after every editing step. We extend constructable towards 3D editing by developing a novel laser-cutting technique for 3D objects called LaserOrigami that works by heating up the workpiece with the defocused laser until the material becomes compliant and bends down under gravity. While constructable and LaserOrigami allow for fast physical feedback, the interaction is still best described as turn-taking since it consists of two discrete steps: users first create an input and afterwards the system provides physical output. By decreasing the interaction unit even further to a single feature, we can achieve real-time physical feedback: Input by the user and output by the fabrication device are so tightly coupled that no visible lag exists. This allows us to explore what it means to transition from turn-taking interfaces, which only allow exploring one option at a time, to direct manipulation interfaces with real-time physical feedback, which allow users to explore the entire space of options continuously with a single interaction. We present a system called FormFab, which allows for such direct control. FormFab is based on the same principle as LaserOrigami: It uses a workpiece that when warmed up becomes compliant and can be reshaped. However, FormFab achieves the reshaping not based on gravity, but through a pneumatic system that users can control interactively. As users interact, they see the shape change in real-time. We conclude this dissertation by extrapolating the current evolution into a future in which large numbers of people use the new technology to create objects. We see two additional challenges on the horizon: sustainability and intellectual property. We investigate sustainability by demonstrating how to print less and instead patch physical objects. We explore questions around intellectual property with a system called Scotty that transfers objects without creating duplicates, thereby preserving the designer's copyright.

Interactive rendering techniques for focus+context visualization of 3D geovirtual environments (2013)

Trapp, Matthias

This thesis introduces a collection of new real-time rendering techniques and applications for focus+context visualization of interactive 3D geovirtual environments such as virtual 3D city and landscape models. These environments are generally characterized by a large number of objects and are of high complexity with respect to geometry and textures. For these reasons, their interactive 3D rendering represents a major challenge. Their 3D depiction implies a number of weaknesses such as occlusions, cluttered image contents, and partial screen-space usage. To overcome these limitations and, thus, to facilitate the effective communication of geo-information, principles of focus+context visualization can be used for the design of real-time 3D rendering techniques for 3D geovirtual environments (see Figure). In general, detailed views of a 3D geovirtual environment are combined seamlessly with abstracted views of the context within a single image. To perform the real-time image synthesis required for interactive visualization, dedicated parallel processors (GPUs) for rasterization of computer graphics primitives are used. For this purpose, the design and implementation of appropriate data structures and rendering pipelines are necessary. The contribution of this work comprises the following five real-time rendering methods: • The rendering technique for 3D generalization lenses enables the combination of different 3D city geometries (e.g., generalized versions of a 3D city model) in a single image in real time. The method is based on a generalized and fragment-precise clipping approach, which uses a compressible, raster-based data structure. It enables the combination of detailed views in the focus area with the representation of abstracted variants in the context area. • The rendering technique for the interactive visualization of dynamic raster data in 3D geovirtual environments facilitates the rendering of 2D surface lenses. It enables a flexible combination of different raster layers (e.g., aerial images or videos) using projective texturing for decoupling image and geometry data. Thus, various overlapping and nested 2D surface lenses of different contents can be visualized interactively. • The interactive rendering technique for image-based deformation of 3D geovirtual environments enables the real-time image synthesis of non-planar projections, such as cylindrical and spherical projections, as well as multi-focal 3D fisheye-lenses and the combination of planar and non-planar projections. • The rendering technique for view-dependent multi-perspective views of 3D geovirtual environments, based on the application of global deformations to the 3D scene geometry, can be used for synthesizing interactive panorama maps to combine detailed views close to the camera (focus) with abstract views in the background (context). This approach reduces occlusions, increases the usage the available screen space, and reduces the overload of image contents. • The object-based and image-based rendering techniques for highlighting objects and focus areas inside and outside the view frustum facilitate preattentive perception. The concepts and implementations of interactive image synthesis for focus+context visualization and their selected applications enable a more effective communication of spatial information, and provide building blocks for design and development of new applications and systems in the field of 3D geovirtual environments.

Java language conversion assistant : an analysis (2004)

This document is an analysis of the 'Java Language Conversion Assistant'. Itr will also cover a language analysis of the Java Programming Language as well as a survey of related work concerning Java and C# interoperability on the one hand and language conversion in general on the other. Part I deals with language analysis. Part II covers the JLCA tool and tests used to analyse the tool. Additionally, it gives an overview of the above mentioned related work. Part III presents a complete project that has been translated using the JLCA.

k-Inductive invariant checking for graph transformation systems (2017)

Dyck, Johannes ; Giese, Holger

While offering significant expressive power, graph transformation systems often come with rather limited capabilities for automated analysis, particularly if systems with many possible initial graphs and large or infinite state spaces are concerned. One approach that tries to overcome these limitations is inductive invariant checking. However, the verification of inductive invariants often requires extensive knowledge about the system in question and faces the approach-inherent challenges of locality and lack of context. To address that, this report discusses k-inductive invariant checking for graph transformation systems as a generalization of inductive invariants. The additional context acquired by taking multiple (k) steps into account is the key difference to inductive invariant checking and is often enough to establish the desired invariants without requiring the iterative development of additional properties. To analyze possibly infinite systems in a finite fashion, we introduce a symbolic encoding for transformation traces using a restricted form of nested application conditions. As its central contribution, this report then presents a formal approach and algorithm to verify graph constraints as k-inductive invariants. We prove the approach's correctness and demonstrate its applicability by means of several examples evaluated with a prototypical implementation of our algorithm.

Konzepte der Softwarevisualisierung für komplexe, objektorientierte Softwaresysteme (2005)

1. Grundlagen der Softwarevisualisierung Johannes Bohnet und Jürgen Döllner 2. Visualisierung und Exploration von Softwaresystemen mit dem Werkzeug SHriMP/Creole Alexander Gierak 3. Annex: SHriMP/Creole in der Anwendung Nebojsa Lazic 4. Metrikbasierte Softwarevisualisierung mit dem Reverse-Engineering-Werkzeug CodeCrawler Daniel Brinkmann 5. Annex: CodeCrawler in der Anwendung Benjamin Hagedorn 6. Quellcodezeilenbasierte Softwarevisualisierung Nebojsa Lazic 7. Landschafts- und Stadtmetaphern zur Softwarevisualisierung Benjamin Hagedorn 8. Visualisierung von Softwareevolution Michael Schöbel 9. Ergebnisse und Ausblick Johannes Bohnet Literaturverzeichnis Autorenverzeichnis

Malleability, obliviousness and aspects for broadcast service attachment (2010)

Harrison, William

An important characteristic of Service-Oriented Architectures is that clients do not depend on the service implementation's internal assignment of methods to objects. It is perhaps the most important technical characteristic that differentiates them from more common object-oriented solutions. This characteristic makes clients and services malleable, allowing them to be rearranged at run-time as circumstances change. That improvement in malleability is impaired by requiring clients to direct service requests to particular services. Ideally, the clients are totally oblivious to the service structure, as they are to aspect structure in aspect-oriented software. Removing knowledge of a method implementation's location, whether in object or service, requires re-defining the boundary line between programming language and middleware, making clearer specification of dependence on protocols, and bringing the transaction-like concept of failure scopes into language semantics as well. This paper explores consequences and advantages of a transition from object-request brokering to service-request brokering, including the potential to improve our ability to write more parallel software.

Management Digitaler Identitäten (2017)

Tietz, Christian ; Pelchen, Chris ; Meinel, Christoph ; Schnjakin, Maxim

Um den zunehmenden Diebstahl digitaler Identitäten zu bekämpfen, gibt es bereits mehr als ein Dutzend Technologien. Sie sind, vor allem bei der Authentifizierung per Passwort, mit spezifischen Nachteilen behaftet, haben andererseits aber auch jeweils besondere Vorteile. Wie solche Kommunikationsstandards und -Protokolle wirkungsvoll miteinander kombiniert werden können, um dadurch mehr Sicherheit zu erreichen, haben die Autoren dieser Studie analysiert. Sie sprechen sich für neuartige Identitätsmanagement-Systeme aus, die sich flexibel auf verschiedene Rollen eines einzelnen Nutzers einstellen können und bequemer zu nutzen sind als bisherige Verfahren. Als ersten Schritt auf dem Weg hin zu einer solchen Identitätsmanagement-Plattform beschreiben sie die Möglichkeiten einer Analyse, die sich auf das individuelle Verhalten eines Nutzers oder einer Sache stützt. Ausgewertet werden dabei Sensordaten mobiler Geräte, welche die Nutzer häufig bei sich tragen und umfassend einsetzen, also z.B. internetfähige Mobiltelefone, Fitness-Tracker und Smart Watches. Die Wissenschaftler beschreiben, wie solche Kleincomputer allein z.B. anhand der Analyse von Bewegungsmustern, Positionsund Netzverbindungsdaten kontinuierlich ein „Vertrauens-Niveau“ errechnen können. Mit diesem ermittelten „Trust Level“ kann jedes Gerät ständig die Wahrscheinlichkeit angeben, mit der sein aktueller Benutzer auch der tatsächliche Besitzer ist, dessen typische Verhaltensmuster es genauestens „kennt“. Wenn der aktuelle Wert des Vertrauens-Niveaus (nicht aber die biometrischen Einzeldaten) an eine externe Instanz wie einen Identitätsprovider übermittelt wird, kann dieser das Trust Level allen Diensten bereitstellen, welche der Anwender nutzt und darüber informieren will. Jeder Dienst ist in der Lage, selbst festzulegen, von welchem Vertrauens-Niveau an er einen Nutzer als authentifiziert ansieht. Erfährt er von einem unter das Limit gesunkenen Trust Level, kann der Identitätsprovider seine Nutzung und die anderer Services verweigern. Die besonderen Vorteile dieses Identitätsmanagement-Ansatzes liegen darin, dass er keine spezifische und teure Hardware benötigt, um spezifische Daten auszuwerten, sondern lediglich Smartphones und so genannte Wearables. Selbst Dinge wie Maschinen, die Daten über ihr eigenes Verhalten per Sensor-Chip ins Internet funken, können einbezogen werden. Die Daten werden kontinuierlich im Hintergrund erhoben, ohne dass sich jemand darum kümmern muss. Sie sind nur für die Berechnung eines Wahrscheinlichkeits-Messwerts von Belang und verlassen niemals das Gerät. Meldet sich ein Internetnutzer bei einem Dienst an, muss er sich nicht zunächst an ein vorher festgelegtes Geheimnis – z.B. ein Passwort – erinnern, sondern braucht nur die Weitergabe seines aktuellen Vertrauens-Wertes mit einem „OK“ freizugeben. Ändert sich das Nutzungsverhalten – etwa durch andere Bewegungen oder andere Orte des Einloggens ins Internet als die üblichen – wird dies schnell erkannt. Unbefugten kann dann sofort der Zugang zum Smartphone oder zu Internetdiensten gesperrt werden. Künftig kann die Auswertung von Verhaltens-Faktoren noch erweitert werden, indem z.B. Routinen an Werktagen, an Wochenenden oder im Urlaub erfasst werden. Der Vergleich mit den live erhobenen Daten zeigt dann an, ob das Verhalten in das übliche Muster passt, der Benutzer also mit höchster Wahrscheinlichkeit auch der ausgewiesene Besitzer des Geräts ist. Über die Techniken des Managements digitaler Identitäten und die damit verbundenen Herausforderungen gibt diese Studie einen umfassenden Überblick. Sie beschreibt zunächst, welche Arten von Angriffen es gibt, durch die digitale Identitäten gestohlen werden können. Sodann werden die unterschiedlichen Verfahren von Identitätsnachweisen vorgestellt. Schließlich liefert die Studie noch eine zusammenfassende Übersicht über die 15 wichtigsten Protokolle und technischen Standards für die Kommunikation zwischen den drei beteiligten Akteuren: Service Provider/Dienstanbieter, Identitätsprovider und Nutzer. Abschließend wird aktuelle Forschung des Hasso-Plattner-Instituts zum Identitätsmanagement vorgestellt.

Matching events and activities (2015)

Baier, Thomas

Nowadays, business processes are increasingly supported by IT services that produce massive amounts of event data during process execution. Aiming at a better process understanding and improvement, this event data can be used to analyze processes using process mining techniques. Process models can be automatically discovered and the execution can be checked for conformance to specified behavior. Moreover, existing process models can be enhanced and annotated with valuable information, for example for performance analysis. While the maturity of process mining algorithms is increasing and more tools are entering the market, process mining projects still face the problem of different levels of abstraction when comparing events with modeled business activities. Mapping the recorded events to activities of a given process model is essential for conformance checking, annotation and understanding of process discovery results. Current approaches try to abstract from events in an automated way that does not capture the required domain knowledge to fit business activities. Such techniques can be a good way to quickly reduce complexity in process discovery. Yet, they fail to enable techniques like conformance checking or model annotation, and potentially create misleading process discovery results by not using the known business terminology. In this thesis, we develop approaches that abstract an event log to the same level that is needed by the business. Typically, this abstraction level is defined by a given process model. Thus, the goal of this thesis is to match events from an event log to activities in a given process model. To accomplish this goal, behavioral and linguistic aspects of process models and event logs as well as domain knowledge captured in existing process documentation are taken into account to build semiautomatic matching approaches. The approaches establish a pre--processing for every available process mining technique that produces or annotates a process model, thereby reducing the manual effort for process analysts. While each of the presented approaches can be used in isolation, we also introduce a general framework for the integration of different matching approaches. The approaches have been evaluated in case studies with industry and using a large industry process model collection and simulated event logs. The evaluation demonstrates the effectiveness and efficiency of the approaches and their robustness towards nonconforming execution logs.

MDE settings in SAP : a descriptive field study (2012)

Hebig, Regina ; Giese, Holger

MDE techniques are more and more used in praxis. However, there is currently a lack of detailed reports about how different MDE techniques are integrated into the development and combined with each other. To learn more about such MDE settings, we performed a descriptive and exploratory field study with SAP, which is a worldwide operating company with around 50.000 employees and builds enterprise software applications. This technical report describes insights we got during this study. For example, we identified that MDE settings are subject to evolution. Finally, this report outlines directions for future research to provide practical advises for the application of MDE settings.

Model-driven engineering of adaptation engines for self-adaptive software : executable runtime megamodels (2013)

Vogel, Thomas ; Giese, Holger

The development of self-adaptive software requires the engineering of an adaptation engine that controls and adapts the underlying adaptable software by means of feedback loops. The adaptation engine often describes the adaptation by using runtime models representing relevant aspects of the adaptable software and particular activities such as analysis and planning that operate on these runtime models. To systematically address the interplay between runtime models and adaptation activities in adaptation engines, runtime megamodels have been proposed for self-adaptive software. A runtime megamodel is a specific runtime model whose elements are runtime models and adaptation activities. Thus, a megamodel captures the interplay between multiple models and between models and activities as well as the activation of the activities. In this article, we go one step further and present a modeling language for ExecUtable RuntimE MegAmodels (EUREMA) that considerably eases the development of adaptation engines by following a model-driven engineering approach. We provide a domain-specific modeling language and a runtime interpreter for adaptation engines, in particular for feedback loops. Megamodels are kept explicit and alive at runtime and by interpreting them, they are directly executed to run feedback loops. Additionally, they can be dynamically adjusted to adapt feedback loops. Thus, EUREMA supports development by making feedback loops, their runtime models, and adaptation activities explicit at a higher level of abstraction. Moreover, it enables complex solutions where multiple feedback loops interact or even operate on top of each other. Finally, it leverages the co-existence of self-adaptation and off-line adaptation for evolution.

Model-driven security in service-oriented architectures : leveraging security patterns to transform high-level security requirements to technical policies (2011)

Menzel, Michael

Service-oriented Architectures (SOA) facilitate the provision and orchestration of business services to enable a faster adoption to changing business demands. Web Services provide a technical foundation to implement this paradigm on the basis of XML-messaging. However, the enhanced flexibility of message-based systems comes along with new threats and risks. To face these issues, a variety of security mechanisms and approaches is supported by the Web Service specifications. The usage of these security mechanisms and protocols is configured by stating security requirements in security policies. However, security policy languages for SOA are complex and difficult to create due to the expressiveness of these languages. To facilitate and simplify the creation of security policies, this thesis presents a model-driven approach that enables the generation of complex security policies on the basis of simple security intentions. SOA architects can specify these intentions in system design models and are not required to deal with complex technical security concepts. The approach introduced in this thesis enables the enhancement of any system design modelling languages – for example FMC or BPMN – with security modelling elements. The syntax, semantics, and notion of these elements is defined by our security modelling language SecureSOA. The metamodel of this language provides extension points to enable the integration into system design modelling languages. In particular, this thesis demonstrates the enhancement of FMC block diagrams with SecureSOA. To enable the model-driven generation of security policies, a domain-independent policy model is introduced in this thesis. This model provides an abstraction layer for security policies. Mappings are used to perform the transformation from our model to security policy languages. However, expert knowledge is required to generate instances of this model on the basis of simple security intentions. Appropriate security mechanisms, protocols and options must be chosen and combined to fulfil these security intentions. In this thesis, a formalised system of security patterns is used to represent this knowledge and to enable an automated transformation process. Moreover, a domain-specific language is introduced to state security patterns in an accessible way. On the basis of this language, a system of security configuration patterns is provided to transform security intentions related to data protection and identity management. The formal semantics of the security pattern language enable the verification of the transformation process introduced in this thesis and prove the correctness of the pattern application. Finally, our SOA Security LAB is presented that demonstrates the application of our model-driven approach to facilitate a dynamic creation, configuration, and execution of secure Web Service-based composed applications.

Modeling and enacting complex data dependencies in business processes (2013)

Meyer, Andreas ; Pufahl, Luise ; Fahland, Dirk ; Weske, Mathias

Enacting business processes in process engines requires the coverage of control flow, resource assignments, and process data. While the first two aspects are well supported in current process engines, data dependencies need to be added and maintained manually by a process engineer. Thus, this task is error-prone and time-consuming. In this report, we address the problem of modeling processes with complex data dependencies, e.g., m:n relationships, and their automatic enactment from process models. First, we extend BPMN data objects with few annotations to allow data dependency handling as well as data instance differentiation. Second, we introduce a pattern-based approach to derive SQL queries from process models utilizing the above mentioned extensions. Therewith, we allow automatic enactment of data-aware BPMN process models. We implemented our approach for the Activiti process engine to show applicability.

Modeling and verifying dynamic evolving service-oriented architectures (2013)

Giese, Holger ; Becker, Basil

The service-oriented architecture supports the dynamic assembly and runtime reconfiguration of complex open IT landscapes by means of runtime binding of service contracts, launching of new components and termination of outdated ones. Furthermore, the evolution of these IT landscapes is not restricted to exchanging components with other ones using the same service contracts, as new services contracts can be added as well. However, current approaches for modeling and verification of service-oriented architectures do not support these important capabilities to their full extend.In this report we present an extension of the current OMG proposal for service modeling with UML - SoaML - which overcomes these limitations. It permits modeling services and their service contracts at different levels of abstraction, provides a formal semantics for all modeling concepts, and enables verifying critical properties. Our compositional and incremental verification approach allows for complex properties including communication parameters and time and covers besides the dynamic binding of service contracts and the replacement of components also the evolution of the systems by means of new service contracts. The modeling as well as verification capabilities of the presented approach are demonstrated by means of a supply chain example and the verification results of a first prototype are shown.

Modeling collaborations in adaptive systems of systems (2016)

Wätzoldt, Sebastian

Recently, due to an increasing demand on functionality and flexibility, beforehand isolated systems have become interconnected to gain powerful adaptive Systems of Systems (SoS) solutions with an overall robust, flexible and emergent behavior. The adaptive SoS comprises a variety of different system types ranging from small embedded to adaptive cyber-physical systems. On the one hand, each system is independent, follows a local strategy and optimizes its behavior to reach its goals. On the other hand, systems must cooperate with each other to enrich the overall functionality to jointly perform on the SoS level reaching global goals, which cannot be satisfied by one system alone. Due to difficulties of local and global behavior optimizations conflicts may arise between systems that have to be solved by the adaptive SoS. This thesis proposes a modeling language that facilitates the description of an adaptive SoS by considering the adaptation capabilities in form of feedback loops as first class entities. Moreover, this thesis adopts the Models@runtime approach to integrate the available knowledge in the systems as runtime models into the modeled adaptation logic. Furthermore, the modeling language focuses on the description of system interactions within the adaptive SoS to reason about individual system functionality and how it emerges via collaborations to an overall joint SoS behavior. Therefore, the modeling language approach enables the specification of local adaptive system behavior, the integration of knowledge in form of runtime models and the joint interactions via collaboration to place the available adaptive behavior in an overall layered, adaptive SoS architecture. Beside the modeling language, this thesis proposes analysis rules to investigate the modeled adaptive SoS, which enables the detection of architectural patterns as well as design flaws and pinpoints to possible system threats. Moreover, a simulation framework is presented, which allows the direct execution of the modeled SoS architecture. Therefore, the analysis rules and the simulation framework can be used to verify the interplay between systems as well as the modeled adaptation effects within the SoS. This thesis realizes the proposed concepts of the modeling language by mapping them to a state of the art standard from the automotive domain and thus, showing their applicability to actual systems. Finally, the modeling language approach is evaluated by remodeling up to date research scenarios from different domains, which demonstrates that the modeling language concepts are powerful enough to cope with a broad range of existing research problems.

Modeling collaborations in self-adaptive systems of systems (2015)

Wätzoldt, Sebastian ; Giese, Holger

An increasing demand on functionality and flexibility leads to an integration of beforehand isolated system solutions building a so-called System of Systems (SoS). Furthermore, the overall SoS should be adaptive to react on changing requirements and environmental conditions. Due SoS are composed of different independent systems that may join or leave the overall SoS at arbitrary point in times, the SoS structure varies during the systems lifetime and the overall SoS behavior emerges from the capabilities of the contained subsystems. In such complex system ensembles new demands of understanding the interaction among subsystems, the coupling of shared system knowledge and the influence of local adaptation strategies to the overall resulting system behavior arise. In this report, we formulate research questions with the focus of modeling interactions between system parts inside a SoS. Furthermore, we define our notion of important system types and terms by retrieving the current state of the art from literature. Having a common understanding of SoS, we discuss a set of typical SoS characteristics and derive general requirements for a collaboration modeling language. Additionally, we retrieve a broad spectrum of real scenarios and frameworks from literature and discuss how these scenarios cope with different characteristics of SoS. Finally, we discuss the state of the art for existing modeling languages that cope with collaborations for different system types such as SoS.

Multi-scale representations of virtual 3D city models (2012)

Glander, Tassilo

Virtual 3D city and landscape models are the main subject investigated in this thesis. They digitally represent urban space and have many applications in different domains, e.g., simulation, cadastral management, and city planning. Visualization is an elementary component of these applications. Photo-realistic visualization with an increasingly high degree of detail leads to fundamental problems for comprehensible visualization. A large number of highly detailed and textured objects within a virtual 3D city model may create visual noise and overload the users with information. Objects are subject to perspective foreshortening and may be occluded or not displayed in a meaningful way, as they are too small. In this thesis we present abstraction techniques that automatically process virtual 3D city and landscape models to derive abstracted representations. These have a reduced degree of detail, while essential characteristics are preserved. After introducing definitions for model, scale, and multi-scale representations, we discuss the fundamentals of map generalization as well as techniques for 3D generalization. The first presented technique is a cell-based generalization of virtual 3D city models. It creates abstract representations that have a highly reduced level of detail while maintaining essential structures, e.g., the infrastructure network, landmark buildings, and free spaces. The technique automatically partitions the input virtual 3D city model into cells based on the infrastructure network. The single building models contained in each cell are aggregated to abstracted cell blocks. Using weighted infrastructure elements, cell blocks can be computed on different hierarchical levels, storing the hierarchy relation between the cell blocks. Furthermore, we identify initial landmark buildings within a cell by comparing the properties of individual buildings with the aggregated properties of the cell. For each block, the identified landmark building models are subtracted using Boolean operations and integrated in a photo-realistic way. Finally, for the interactive 3D visualization we discuss the creation of the virtual 3D geometry and their appearance styling through colors, labeling, and transparency. We demonstrate the technique with example data sets. Additionally, we discuss applications of generalization lenses and transitions between abstract representations. The second technique is a real-time-rendering technique for geometric enhancement of landmark objects within a virtual 3D city model. Depending on the virtual camera distance, landmark objects are scaled to ensure their visibility within a specific distance interval while deforming their environment. First, in a preprocessing step a landmark hierarchy is computed, this is then used to derive distance intervals for the interactive rendering. At runtime, using the virtual camera distance, a scaling factor is computed and applied to each landmark. The scaling factor is interpolated smoothly at the interval boundaries using cubic Bézier splines. Non-landmark geometry that is near landmark objects is deformed with respect to a limited number of landmarks. We demonstrate the technique by applying it to a highly detailed virtual 3D city model and a generalized 3D city model. In addition we discuss an adaptation of the technique for non-linear projections and mobile devices. The third technique is a real-time rendering technique to create abstract 3D isocontour visualization of virtual 3D terrain models. The virtual 3D terrain model is visualized as a layered or stepped relief. The technique works without preprocessing and, as it is implemented using programmable graphics hardware, can be integrated with minimal changes into common terrain rendering techniques. Consequently, the computation is done in the rendering pipeline for each vertex, primitive, i.e., triangle, and fragment. For each vertex, the height is quantized to the nearest isovalue. For each triangle, the vertex configuration with respect to their isovalues is determined first. Using the configuration, the triangle is then subdivided. The subdivision forms a partial step geometry aligned with the triangle. For each fragment, the surface appearance is determined, e.g., depending on the surface texture, shading, and height-color-mapping. Flexible usage of the technique is demonstrated with applications from focus+context visualization, out-of-core terrain rendering, and information visualization. This thesis presents components for the creation of abstract representations of virtual 3D city and landscape models. Re-using visual language from cartography, the techniques enable users to build on their experience with maps when interpreting these representations. Simultaneously, characteristics of 3D geovirtual environments are taken into account by addressing and discussing, e.g., continuous scale, interaction, and perspective.

On discovering and incrementally updating inclusion dependencies (2020)

Shaabani, Nuhad

In today's world, many applications produce large amounts of data at an enormous rate. Analyzing such datasets for metadata is indispensable for effectively understanding, storing, querying, manipulating, and mining them. Metadata summarizes technical properties of a dataset which rang from basic statistics to complex structures describing data dependencies. One type of dependencies is inclusion dependency (IND), which expresses subset-relationships between attributes of datasets. Therefore, inclusion dependencies are important for many data management applications in terms of data integration, query optimization, schema redesign, or integrity checking. So, the discovery of inclusion dependencies in unknown or legacy datasets is at the core of any data profiling effort. For exhaustively detecting all INDs in large datasets, we developed S-indd++, a new algorithm that eliminates the shortcomings of existing IND-detection algorithms and significantly outperforms them. S-indd++ is based on a novel concept for the attribute clustering for efficiently deriving INDs. Inferring INDs from our attribute clustering eliminates all redundant operations caused by other algorithms. S-indd++ is also based on a novel partitioning strategy that enables discording a large number of candidates in early phases of the discovering process. Moreover, S-indd++ does not require to fit a partition into the main memory--this is a highly appreciable property in the face of ever-growing datasets. S-indd++ reduces up to 50% of the runtime of the state-of-the-art approach. None of the approach for discovering INDs is appropriate for the application on dynamic datasets; they can not update the INDs after an update of the dataset without reprocessing it entirely. To this end, we developed the first approach for incrementally updating INDs in frequently changing datasets. We achieved that by reducing the problem of incrementally updating INDs to the incrementally updating the attribute clustering from which all INDs are efficiently derivable. We realized the update of the clusters by designing new operations to be applied to the clusters after every data update. The incremental update of INDs reduces the time of the complete rediscovery by up to 99.999%. All existing algorithms for discovering n-ary INDs are based on the principle of candidate generation--they generate candidates and test their validity in the given data instance. The major disadvantage of this technique is the exponentially growing number of database accesses in terms of SQL queries required for validation. We devised Mind2, the first approach for discovering n-ary INDs without candidate generation. Mind2 is based on a new mathematical framework developed in this thesis for computing the maximum INDs from which all other n-ary INDs are derivable. The experiments showed that Mind2 is significantly more scalable and effective than hypergraph-based algorithms.

On the operationalization of graph queries with generalized discrimination networks (2016)

Beyhl, Thomas ; Blouin, Dominique ; Giese, Holger ; Lambers, Leen

Graph queries have lately gained increased interest due to application areas such as social networks, biological networks, or model queries. For the relational database case the relational algebra and generalized discrimination networks have been studied to find appropriate decompositions into subqueries and ordering of these subqueries for query evaluation or incremental updates of query results. For graph database queries however there is no formal underpinning yet that allows us to find such suitable operationalizations. Consequently, we suggest a simple operational concept for the decomposition of arbitrary complex queries into simpler subqueries and the ordering of these subqueries in form of generalized discrimination networks for graph queries inspired by the relational case. The approach employs graph transformation rules for the nodes of the network and thus we can employ the underlying theory. We further show that the proposed generalized discrimination networks have the same expressive power as nested graph conditions.

openHPI : das MOOC-Angebot des Hasso-Plattner-Instituts (2013)

Meinel, Christoph ; Willems, Christian

Die neue interaktive Online-Bildungsplattform openHPI (https://openHPI.de) des Hasso-Plattner-Instituts (HPI) bietet frei zugängliche und kostenlose Onlinekurse für interessierte Teilnehmer an, die sich mit Inhalten aus dem Bereich der Informationstechnologien und Informatik beschäftige¬n. Wie die seit 2011 zunächst von der Stanford University, später aber auch von anderen Elite-Universitäten der USA angeboten „Massive Open Online Courses“, kurz MOOCs genannt, bietet openHPI im Internet Lernvideos und weiterführenden Lesestoff in einer Kombination mit lernunterstützenden Selbsttests, Hausaufgaben und einem sozialen Diskussionsforum an und stimuliert die Ausbildung einer das Lernen fördernden virtuellen Lerngemeinschaft. Im Unterschied zu „traditionellen“ Vorlesungsportalen, wie z.B. dem tele-TASK Portal (http://www.tele-task.de), bei dem multimedial aufgezeichnete Vorlesungen zum Abruf bereit gestellt werden, bietet openHPI didaktisch aufbereitete Onlinekurse an. Diese haben einen festen Starttermin und bieten dann in einem austarierten Zeitplan von sechs aufeinanderfolgenden Kurswochen multimedial aufbereitete und wann immer möglich interaktive Lehrmaterialien. In jeder Woche wird ein Kapitel des Kursthemas behandelt. Dazu werden zu Wochenbeginn eine Reihe von Lehrvideos, Texten, Selbsttests und ein Hausaufgabenblatt bereitgestellt, mit denen sich die Kursteilnehmer in dieser Woche beschäftigen. Kombiniert sind die Angebote mit einer sozialen Diskussionsplattform, auf der sich die Teilnehmer mit den Kursbetreuern und anderen Teilnehmern austauschen, Fragen klären und weiterführende Themen diskutieren können. Natürlich entscheiden die Teilnehmer selbst über Art und Umfang ihrer Lernaktivitäten. Sie können in den Kurs eigene Beiträge einbringen, zum Beispiel durch Blogposts oder Tweets, auf die sie im Forum verweisen. Andere Lernende können diese dann kommentieren, diskutieren oder ihrerseits erweitern. Auf diese Weise werden die Lernenden, die Lehrenden und die angebotenen Lerninhalte in einer virtuellen Gemeinschaft, einem sozialen Lernnetzwerk miteinander verknüpft.

openHPI : the MOOC offer at Hasso Plattner Institute (2013)

Meinel, Christoph ; Willems, Christian

The new interactive online educational platform openHPI, (https://openHPI.de) from Hasso Plattner Institute (HPI), offers freely accessible courses at no charge for all who are interested in subjects in the field of information technology and computer science. Since 2011, “Massive Open Online Courses,” called MOOCs for short, have been offered, first at Stanford University and then later at other U.S. elite universities. Following suit, openHPI provides instructional videos on the Internet and further reading material, combined with learning-supportive self-tests, homework and a social discussion forum. Education is further stimulated by the support of a virtual learning community. In contrast to “traditional” lecture platforms, such as the tele-TASK portal (http://www.tele-task.de) where multimedia recorded lectures are available on demand, openHPI offers didactic online courses. The courses have a fixed start date and offer a balanced schedule of six consecutive weeks presented in multimedia and, whenever possible, interactive learning material. Each week, one chapter of the course subject is treated. In addition, a series of learning videos, texts, self-tests and homework exercises are provided to course participants at the beginning of the week. The course offering is combined with a social discussion platform where participants have the opportunity to enter into an exchange with course instructors and fellow participants. Here, for example, they can get answers to questions and discuss the topics in depth. The participants naturally decide themselves about the type and range of their learning activities. They can make personal contributions to the course, for example, in blog posts or tweets, which they can refer to in the forum. In turn, other participants have the chance to comment on, discuss or expand on what has been said. In this way, the learners become the teachers and the subject matter offered to a virtual community is linked to a social learning network.

1 to 100

Refine

Has Fulltext

Author

Year of publication

Document Type

Language

Is part of the Bibliography

Keywords

Institute

173 search hits