Refine
Has Fulltext
- yes (140) (remove)
Year of publication
Document Type
- Monograph/Edited Volume (87)
- Doctoral Thesis (41)
- Conference Proceeding (9)
- Postprint (2)
- Preprint (1)
Language
- English (140) (remove)
Keywords
- Hasso-Plattner-Institut (7)
- Cloud Computing (6)
- Datenintegration (6)
- Forschungskolleg (6)
- Hasso Plattner Institute (6)
- Klausurtagung (6)
- Modellierung (6)
- Service-oriented Systems Engineering (6)
- cloud computing (6)
- Forschungsprojekte (5)
Institute
- Hasso-Plattner-Institut für Digital Engineering gGmbH (140) (remove)
An increasing demand on functionality and flexibility leads to an integration of beforehand isolated system solutions building a so-called System of Systems (SoS). Furthermore, the overall SoS should be adaptive to react on changing requirements and environmental conditions. Due SoS are composed of different independent systems that may join or leave the overall SoS at arbitrary point in times, the SoS structure varies during the systems lifetime and the overall SoS behavior emerges from the capabilities of the contained subsystems. In such complex system ensembles new demands of understanding the interaction among subsystems, the coupling of shared system knowledge and the influence of local adaptation strategies to the overall resulting system behavior arise. In this report, we formulate research questions with the focus of modeling interactions between system parts inside a SoS. Furthermore, we define our notion of important system types and terms by retrieving the current state of the art from literature. Having a common understanding of SoS, we discuss a set of typical SoS characteristics and derive general requirements for a collaboration modeling language. Additionally, we retrieve a broad spectrum of real scenarios and frameworks from literature and discuss how these scenarios cope with different characteristics of SoS. Finally, we discuss the state of the art for existing modeling languages that cope with collaborations for different system types such as SoS.
Recently, due to an increasing demand on functionality and flexibility, beforehand isolated systems have become interconnected to gain powerful adaptive Systems of Systems (SoS) solutions with an overall robust, flexible and emergent behavior. The adaptive SoS comprises a variety of different system types ranging from small embedded to adaptive cyber-physical systems. On the one hand, each system is independent, follows a local strategy and optimizes its behavior to reach its goals. On the other hand, systems must cooperate with each other to enrich the overall functionality to jointly perform on the SoS level reaching global goals, which cannot be satisfied by one system alone. Due to difficulties of local and global behavior optimizations conflicts may arise between systems that have to be solved by the adaptive SoS.
This thesis proposes a modeling language that facilitates the description of an adaptive SoS by considering the adaptation capabilities in form of feedback loops as first class entities. Moreover, this thesis adopts the Models@runtime approach to integrate the available knowledge in the systems as runtime models into the modeled adaptation logic. Furthermore, the modeling language focuses on the description of system interactions within the adaptive SoS to reason about individual system functionality and how it emerges via collaborations to an overall joint SoS behavior. Therefore, the modeling language approach enables the specification of local adaptive system behavior, the integration of knowledge in form of runtime models and the joint interactions via collaboration to place the available adaptive behavior in an overall layered, adaptive SoS architecture.
Beside the modeling language, this thesis proposes analysis rules to investigate the modeled adaptive SoS, which enables the detection of architectural patterns as well as design flaws and pinpoints to possible system threats. Moreover, a simulation framework is presented, which allows the direct execution of the modeled SoS architecture. Therefore, the analysis rules and the simulation framework can be used to verify the interplay between systems as well as the modeled adaptation effects within the SoS. This thesis realizes the proposed concepts of the modeling language by mapping them to a state of the art standard from the automotive domain and thus, showing their applicability to actual systems. Finally, the modeling language approach is evaluated by remodeling up to date research scenarios from different domains, which demonstrates that the modeling language concepts are powerful enough to cope with a broad range of existing research problems.
STG decomposition is a promising approach to tackle the complexity problems arising in logic synthesis of speed independent circuits, a robust asynchronous (i.e. clockless) circuit type. Unfortunately, STG decomposition can result in components that in isolation have irreducible CSC conflicts. Generalising earlier work, it is shown how to resolve such conflicts by introducing internal communication between the components via structural techniques only.
Most of the microelectronic circuits fabricated today are synchronous, i.e. they are driven by one or several clock signals. Synchronous circuit design faces several fundamental challenges such as high-speed clock distribution, integration of multiple cores operating at different clock rates, reduction of power consumption and dealing with voltage, temperature, manufacturing and runtime variations. Asynchronous or clockless design plays a key role in alleviating these challenges, however the design and test of asynchronous circuits is much more difficult in comparison to their synchronous counterparts. A driving force for a widespread use of asynchronous technology is the availability of mature EDA (Electronic Design Automation) tools which provide an entire automated design flow starting from an HDL (Hardware Description Language) specification yielding the final circuit layout. Even though there was much progress in developing such EDA tools for asynchronous circuit design during the last two decades, the maturity level as well as the acceptance of them is still not comparable with tools for synchronous circuit design. In particular, logic synthesis (which implies the application of Boolean minimisation techniques) for the entire system's control path can significantly improve the efficiency of the resulting asynchronous implementation, e.g. in terms of chip area and performance. However, logic synthesis, in particular for asynchronous circuits, suffers from complexity problems. Signal Transitions Graphs (STGs) are labelled Petri nets which are a widely used to specify the interface behaviour of speed independent (SI) circuits - a robust subclass of asynchronous circuits. STG decomposition is a promising approach to tackle complexity problems like state space explosion in logic synthesis of SI circuits. The (structural) decomposition of STGs is guided by a partition of the output signals and generates a usually much smaller component STG for each partition member, i.e. a component STG with a much smaller state space than the initial specification. However, decomposition can result in component STGs that in isolation have so-called irreducible CSC conflicts (i.e. these components are not SI synthesisable anymore) even if the specification has none of them. A new approach is presented to avoid such conflicts by introducing internal communication between the components. So far, STG decompositions are guided by the finest output partitions, i.e. one output per component. However, this might not yield optimal circuit implementations. Efficient heuristics are presented to determine coarser partitions leading to improved circuits in terms of chip area. For the new algorithms correctness proofs are given and their implementations are incorporated into the decomposition tool DESIJ. The presented techniques are successfully applied to some benchmarks - including 'real-life' specifications arising in the context of control resynthesis - which delivered promising results.
Developing large software projects is a complicated task and can be demanding for developers. Continuous integration is common practice for reducing complexity. By integrating and testing changes often, changesets are kept small and therefore easily comprehensible. Travis CI is a service that offers continuous integration and continuous deployment in the cloud. Software projects are build, tested, and deployed using the Travis CI infrastructure without interrupting the development process. This report describes how Travis CI works, presents how time-driven, periodic building is implemented as well as how CI data visualization can be done, and proposes a way of dealing with dependency problems.
Business Process Management (BPM) emerged as a means to control, analyse, and optimise business operations. Conceptual models are of central importance for BPM. Most prominently, process models define the behaviour that is performed to achieve a business value. In essence, a process model is a mapping of properties of the original business process to the model, created for a purpose. Different modelling purposes, therefore, result in different models of a business process. Against this background, the misalignment of process models often observed in the field of BPM is no surprise. Even if the same business scenario is considered, models created for strategic decision making differ in content significantly from models created for process automation. Despite their differences, process models that refer to the same business process should be consistent, i.e., free of contradictions. Apparently, there is a trade-off between strictness of a notion of consistency and appropriateness of process models serving different purposes. Existing work on consistency analysis builds upon behaviour equivalences and hierarchical refinements between process models. Hence, these approaches are computationally hard and do not offer the flexibility to gradually relax consistency requirements towards a certain setting. This thesis presents a framework for the analysis of behaviour consistency that takes a fundamentally different approach. As a first step, an alignment between corresponding elements of related process models is constructed. Then, this thesis conducts behavioural analysis grounded on a relational abstraction of the behaviour of a process model, its behavioural profile. Different variants of these profiles are proposed, along with efficient computation techniques for a broad class of process models. Using behavioural profiles, consistency of an alignment between process models is judged by different notions and measures. The consistency measures are also adjusted to assess conformance of process logs that capture the observed execution of a process. Further, this thesis proposes various complementary techniques to support consistency management. It elaborates on how to implement consistent change propagation between process models, addresses the exploration of behavioural commonalities and differences, and proposes a model synthesis for behavioural profiles.
When realizing a programming language as VM, implementing behavior as part of the VM, as primitive, usually results in reduced execution times. But supporting and developing primitive functions requires more effort than maintaining and using code in the hosted language since debugging is harder, and the turn-around times for VM parts are higher. Furthermore, source artifacts of primitive functions are seldom reused in new implementations of the same language. And if they are reused, the existing API usually is emulated, reducing the performance gains. Because of recent results in tracing dynamic compilation, the trade-off between performance and ease of implementation, reuse, and changeability might now be decided adversely.
In this work, we investigate the trade-offs when creating primitives, and in particular how large a difference remains between primitive and hosted function run times in VMs with tracing just-in-time compiler. To that end, we implemented the algorithmic primitive BitBlt three times for RSqueak/VM. RSqueak/VM is a Smalltalk VM utilizing the PyPy RPython toolchain. We compare primitive implementations in C, RPython, and Smalltalk, showing that due to the tracing just-in-time compiler, the performance gap has lessened by one magnitude to one magnitude.
The exponential expanding of the numbers of web sites and Internet users makes WWW the most important global information resource. From information publishing and electronic commerce to entertainment and social networking, the Web allows an inexpensive and efficient access to the services provided by individuals and institutions. The basic units for distributing these services are the web sites scattered throughout the world. However, the extreme fragility of web services and content, the high competence between similar services supplied by different sites, and the wide geographic distributions of the web users drive the urgent requirement from the web managers to track and understand the usage interest of their web customers. This thesis, "X-tracking the Usage Interest on Web Sites", aims to fulfill this requirement. "X" stands two meanings: one is that the usage interest differs from various web sites, and the other is that usage interest is depicted from multi aspects: internal and external, structural and conceptual, objective and subjective. "Tracking" shows that our concentration is on locating and measuring the differences and changes among usage patterns. This thesis presents the methodologies on discovering usage interest on three kinds of web sites: the public information portal site, e-learning site that provides kinds of streaming lectures and social site that supplies the public discussions on IT issues. On different sites, we concentrate on different issues related with mining usage interest. The educational information portal sites were the first implementation scenarios on discovering usage patterns and optimizing the organization of web services. In such cases, the usage patterns are modeled as frequent page sets, navigation paths, navigation structures or graphs. However, a necessary requirement is to rebuild the individual behaviors from usage history. We give a systematic study on how to rebuild individual behaviors. Besides, this thesis shows a new strategy on building content clusters based on pair browsing retrieved from usage logs. The difference between such clusters and the original web structure displays the distance between the destinations from usage side and the expectations from design side. Moreover, we study the problem on tracking the changes of usage patterns in their life cycles. The changes are described from internal side integrating conceptual and structure features, and from external side for the physical features; and described from local side measuring the difference between two time spans, and global side showing the change tendency along the life cycle. A platform, Web-Cares, is developed to discover the usage interest, to measure the difference between usage interest and site expectation and to track the changes of usage patterns. E-learning site provides the teaching materials such as slides, recorded lecture videos and exercise sheets. We focus on discovering the learning interest on streaming lectures, such as real medias, mp4 and flash clips. Compared to the information portal site, the usage on streaming lectures encapsulates the variables such as viewing time and actions during learning processes. The learning interest is discovered in the form of answering 6 questions, which covers finding the relations between pieces of lectures and the preference among different forms of lectures. We prefer on detecting the changes of learning interest on the same course from different semesters. The differences on the content and structure between two courses leverage the changes on the learning interest. We give an algorithm on measuring the difference on learning interest integrated with similarity comparison between courses. A search engine, TASK-Moniminer, is created to help the teacher query the learning interest on their streaming lectures on tele-TASK site. Social site acts as an online community attracting web users to discuss the common topics and share their interesting information. Compared to the public information portal site and e-learning web site, the rich interactions among users and web content bring the wider range of content quality, on the other hand, provide more possibilities to express and model usage interest. We propose a framework on finding and recommending high reputation articles in a social site. We observed that the reputation is classified into global and local categories; the quality of the articles having high reputation is related with the content features. Based on these observations, our framework is implemented firstly by finding the articles having global or local reputation, and secondly clustering articles based on their content relations, and then the articles are selected and recommended from each cluster based on their reputation ranks.
The development of self-adaptive software requires the engineering of an adaptation engine that controls and adapts the underlying adaptable software by means of feedback loops. The adaptation engine often describes the adaptation by using runtime models representing relevant aspects of the adaptable software and particular activities such as analysis and planning that operate on these runtime models. To systematically address the interplay between runtime models and adaptation activities in adaptation engines, runtime megamodels have been proposed for self-adaptive software. A runtime megamodel is a specific runtime model whose elements are runtime models and adaptation activities. Thus, a megamodel captures the interplay between multiple models and between models and activities as well as the activation of the activities. In this article, we go one step further and present a modeling language for ExecUtable RuntimE MegAmodels (EUREMA) that considerably eases the development of adaptation engines by following a model-driven engineering approach. We provide a domain-specific modeling language and a runtime interpreter for adaptation engines, in particular for feedback loops. Megamodels are kept explicit and alive at runtime and by interpreting them, they are directly executed to run feedback loops. Additionally, they can be dynamically adjusted to adapt feedback loops. Thus, EUREMA supports development by making feedback loops, their runtime models, and adaptation activities explicit at a higher level of abstraction. Moreover, it enables complex solutions where multiple feedback loops interact or even operate on top of each other. Finally, it leverages the co-existence of self-adaptation and off-line adaptation for evolution.
Software maintenance encompasses any changes made to a software system after its initial deployment and is thereby one of the key phases in the typical software-engineering lifecycle. In software maintenance, we primarily need to understand structural and behavioral aspects, which are difficult to obtain, e.g., by code reading. Software analysis is therefore a vital tool for maintaining these systems: It provides - the preferably automated - means to extract and evaluate information from their artifacts such as software structure, runtime behavior, and related processes. However, such analysis typically results in massive raw data, so that even experienced engineers face difficulties directly examining, assessing, and understanding these data. Among other things, they require tools with which to explore the data if no clear question can be formulated beforehand. For this, software analysis and visualization provide its users with powerful interactive means. These enable the automation of tasks and, particularly, the acquisition of valuable and actionable insights into the raw data. For instance, one means for exploring runtime behavior is trace visualization. This thesis aims at extending and improving the tool set for visual software analysis by concentrating on several open challenges in the fields of dynamic and static analysis of software systems. This work develops a series of concepts and tools for the exploratory visualization of the respective data to support users in finding and retrieving information on the system artifacts concerned. This is a difficult task, due to the lack of appropriate visualization metaphors; in particular, the visualization of complex runtime behavior poses various questions and challenges of both a technical and conceptual nature. This work focuses on a set of visualization techniques for visually representing control-flow related aspects of software traces from shared-memory software systems: A trace-visualization concept based on icicle plots aids in understanding both single-threaded as well as multi-threaded runtime behavior on the function level. The concept’s extensibility further allows the visualization and analysis of specific aspects of multi-threading such as synchronization, the correlation of such traces with data from static software analysis, and a comparison between traces. Moreover, complementary techniques for simultaneously analyzing system structures and the evolution of related attributes are proposed. These aim at facilitating long-term planning of software architecture and supporting management decisions in software projects by extensions to the circular-bundle-view technique: An extension to 3-dimensional space allows for the use of additional variables simultaneously; interaction techniques allow for the modification of structures in a visual manner. The concepts and techniques presented here are generic and, as such, can be applied beyond software analysis for the visualization of similarly structured data. The techniques' practicability is demonstrated by several qualitative studies using subject data from industry-scale software systems. The studies provide initial evidence that the techniques' application yields useful insights into the subject data and its interrelationships in several scenarios.
Interactive rendering techniques for focus+context visualization of 3D geovirtual environments
(2013)
This thesis introduces a collection of new real-time rendering techniques and applications for focus+context visualization of interactive 3D geovirtual environments such as virtual 3D city and landscape models. These environments are generally characterized by a large number of objects and are of high complexity with respect to geometry and textures. For these reasons, their interactive 3D rendering represents a major challenge. Their 3D depiction implies a number of weaknesses such as occlusions, cluttered image contents, and partial screen-space usage. To overcome these limitations and, thus, to facilitate the effective communication of geo-information, principles of focus+context visualization can be used for the design of real-time 3D rendering techniques for 3D geovirtual environments (see Figure). In general, detailed views of a 3D geovirtual environment are combined seamlessly with abstracted views of the context within a single image. To perform the real-time image synthesis required for interactive visualization, dedicated parallel processors (GPUs) for rasterization of computer graphics primitives are used. For this purpose, the design and implementation of appropriate data structures and rendering pipelines are necessary. The contribution of this work comprises the following five real-time rendering methods: • The rendering technique for 3D generalization lenses enables the combination of different 3D city geometries (e.g., generalized versions of a 3D city model) in a single image in real time. The method is based on a generalized and fragment-precise clipping approach, which uses a compressible, raster-based data structure. It enables the combination of detailed views in the focus area with the representation of abstracted variants in the context area. • The rendering technique for the interactive visualization of dynamic raster data in 3D geovirtual environments facilitates the rendering of 2D surface lenses. It enables a flexible combination of different raster layers (e.g., aerial images or videos) using projective texturing for decoupling image and geometry data. Thus, various overlapping and nested 2D surface lenses of different contents can be visualized interactively. • The interactive rendering technique for image-based deformation of 3D geovirtual environments enables the real-time image synthesis of non-planar projections, such as cylindrical and spherical projections, as well as multi-focal 3D fisheye-lenses and the combination of planar and non-planar projections. • The rendering technique for view-dependent multi-perspective views of 3D geovirtual environments, based on the application of global deformations to the 3D scene geometry, can be used for synthesizing interactive panorama maps to combine detailed views close to the camera (focus) with abstract views in the background (context). This approach reduces occlusions, increases the usage the available screen space, and reduces the overload of image contents. • The object-based and image-based rendering techniques for highlighting objects and focus areas inside and outside the view frustum facilitate preattentive perception. The concepts and implementations of interactive image synthesis for focus+context visualization and their selected applications enable a more effective communication of spatial information, and provide building blocks for design and development of new applications and systems in the field of 3D geovirtual environments.
In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this unilateral development is going to cease due to the combination of the following three trends: a) Nowadays network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory inside a server and of a remote server to and even below a single order of magnitude. b) Modern storage systems scale gracefully, are elastic, and provide high-availability. c) A modern storage system such as Stanford's RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main-memory parallel database management system is desirable. The advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible.
This thesis describes building a columnar database on shared main memory-based storage. The thesis discusses the resulting architecture (Part I), the implications on query processing (Part II), and presents an evaluation of the resulting solution in terms of performance, high-availability, and elasticity (Part III).
In our architecture, we use Stanford's RAMCloud as shared-storage, and the self-designed and developed in-memory AnalyticsDB as relational query processor on top. AnalyticsDB encapsulates data access and operator execution via an interface which allows seamless switching between local and remote main memory, while RAMCloud provides not only storage capacity, but also processing power. Combining both aspects allows pushing-down the execution of database operators into the storage system. We describe how the columnar data processed by AnalyticsDB is mapped to RAMCloud's key-value data model and how the performance advantages of columnar data storage can be preserved.
The combination of fast network technology and the possibility to execute database operators in the storage system opens the discussion for site selection. We construct a system model that allows the estimation of operator execution costs in terms of network transfer, data processed in memory, and wall time. This can be used for database operators that work on one relation at a time - such as a scan or materialize operation - to discuss the site selection problem (data pull vs. operator push). Since a database query translates to the execution of several database operators, it is possible that the optimal site selection varies per operator. For the execution of a database operator that works on two (or more) relations at a time, such as a join, the system model is enriched by additional factors such as the chosen algorithm (e.g. Grace- vs. Distributed Block Nested Loop Join vs. Cyclo-Join), the data partitioning of the respective relations, and their overlapping as well as the allowed resource allocation.
We present an evaluation on a cluster with 60 nodes where all nodes are connected via RDMA-enabled network equipment. We show that query processing performance is about 2.4x slower if everything is done via the data pull operator execution strategy (i.e. RAMCloud is being used only for data access) and about 27% slower if operator execution is also supported inside RAMCloud (in comparison to operating only on main memory inside a server without any network communication at all). The fast-crash recovery feature of RAMCloud can be leveraged to provide high-availability, e.g. a server crash during query execution only delays the query response for about one second. Our solution is elastic in a way that it can adapt to changing workloads a) within seconds, b) without interruption of the ongoing query processing, and c) without manual intervention.
Complexity in software systems is a major factor driving development and maintenance costs. To master this complexity, software is divided into modules that can be developed and tested separately. In order to support this separation of modules, each module should provide a clean and concise public interface. Therefore, the ability to selectively hide functionality using access control is an important feature in a programming language intended for complex software systems.
Software systems are increasingly distributed, adding not only to their inherent complexity, but also presenting security challenges. The object-capability approach addresses these challenges by defining language properties providing only minimal capabilities to objects. One programming language that is based on the object-capability approach is Newspeak, a dynamic programming language designed for modularity and security. The Newspeak specification describes access control as one of Newspeak’s properties, because it is a requirement for the object-capability approach. However, access control, as defined in the Newspeak specification, is currently not enforced in its implementation.
This work introduces an access control implementation for Newspeak, enabling the security of object-capabilities and enhancing modularity. We describe our implementation of access control for Newspeak. We adapted the runtime environment, the reflective system, the compiler toolchain, and the virtual machine. Finally, we describe a migration strategy for the existing Newspeak code base, so that our access control implementation can be integrated with minimal effort.
Virtualized cloud data centers provide on-demand resources, enable agile resource provisioning, and host heterogeneous applications with different resource requirements. These data centers consume enormous amounts of energy, increasing operational expenses, inducing high thermal inside data centers, and raising carbon dioxide emissions. The increase in energy consumption can result from ineffective resource management that causes inefficient resource utilization. This dissertation presents detailed models and novel techniques and algorithms for virtual resource management in cloud data centers. The proposed techniques take into account Service Level Agreements (SLAs) and workload heterogeneity in terms of memory access demand and communication patterns of web applications and High Performance Computing (HPC) applications. To evaluate our proposed techniques, we use simulation and real workload traces of web applications and HPC applications and compare our techniques against the other recently proposed techniques using several performance metrics. The major contributions of this dissertation are the following: proactive resource provisioning technique based on robust optimization to increase the hosts' availability for hosting new VMs while minimizing the idle energy consumption. Additionally, this technique mitigates undesirable changes in the power state of the hosts by which the hosts' reliability can be enhanced in avoiding failure during a power state change. The proposed technique exploits the range-based prediction algorithm for implementing robust optimization, taking into consideration the uncertainty of demand. An adaptive range-based prediction for predicting workload with high fluctuations in the short-term. The range prediction is implemented in two ways: standard deviation and median absolute deviation. The range is changed based on an adaptive confidence window to cope with the workload fluctuations. A robust VM consolidation for efficient energy and performance management to achieve equilibrium between energy and performance trade-offs. Our technique reduces the number of VM migrations compared to recently proposed techniques. This also contributes to a reduction in energy consumption by the network infrastructure. Additionally, our technique reduces SLA violations and the number of power state changes. A generic model for the network of a data center to simulate the communication delay and its impact on VM performance, as well as network energy consumption. In addition, a generic model for a memory-bus of a server, including latency and energy consumption models for different memory frequencies. This allows simulating the memory delay and its influence on VM performance, as well as memory energy consumption. Communication-aware and energy-efficient consolidation for parallel applications to enable the dynamic discovery of communication patterns and reschedule VMs using migration based on the determined communication patterns. A novel dynamic pattern discovery technique is implemented, based on signal processing of network utilization of VMs instead of using the information from the hosts' virtual switches or initiation from VMs. The result shows that our proposed approach reduces the network's average utilization, achieves energy savings due to reducing the number of active switches, and provides better VM performance compared to CPU-based placement. Memory-aware VM consolidation for independent VMs, which exploits the diversity of VMs' memory access to balance memory-bus utilization of hosts. The proposed technique, Memory-bus Load Balancing (MLB), reactively redistributes VMs according to their utilization of a memory-bus using VM migration to improve the performance of the overall system. Furthermore, Dynamic Voltage and Frequency Scaling (DVFS) of the memory and the proposed MLB technique are combined to achieve better energy savings.
Aspect-oriented middleware is a promising technology for the realisation of dynamic reconfiguration in heterogeneous distributed systems. However, like other dynamic reconfiguration approaches, AO-middleware-based reconfiguration requires that the consistency of the system is maintained across reconfigurations. AO-middleware-based reconfiguration is an ongoing research topic and several consistency approaches have been proposed. However, most of these approaches tend to be targeted at specific contexts, whereas for distributed systems it is crucial to cover a wide range of operating conditions. In this paper we propose an approach that offers distributed, dynamic reconfiguration in a consistent manner, and features a flexible framework-based consistency management approach to cover a wide range of operating conditions. We evaluate our approach by investigating the configurability and transparency of our approach and also quantify the performance overheads of the associated consistency mechanisms.
The Semantic Web provides information contained in the World Wide Web as machine-readable facts. In comparison to a keyword-based inquiry, semantic search enables a more sophisticated exploration of web documents. By clarifying the meaning behind entities, search results are more precise and the semantics simultaneously enable an exploration of semantic relationships. However, unlike keyword searches, a semantic entity-focused search requires that web documents are annotated with semantic representations of common words and named entities. Manual semantic annotation of (web) documents is time-consuming; in response, automatic annotation services have emerged in recent years. These annotation services take continuous text as input, detect important key terms and named entities and annotate them with semantic entities contained in widely used semantic knowledge bases, such as Freebase or DBpedia. Metadata of video documents require special attention. Semantic analysis approaches for continuous text cannot be applied, because information of a context in video documents originates from multiple sources possessing different reliabilities and characteristics. This thesis presents a semantic analysis approach consisting of a context model and a disambiguation algorithm for video metadata. The context model takes into account the characteristics of video metadata and derives a confidence value for each metadata item. The confidence value represents the level of correctness and ambiguity of the textual information of the metadata item. The lower the ambiguity and the higher the prospective correctness, the higher the confidence value. The metadata items derived from the video metadata are analyzed in a specific order from high to low confidence level. Previously analyzed metadata are used as reference points in the context for subsequent disambiguation. The contextually most relevant entity is identified by means of descriptive texts and semantic relationships to the context. The context is created dynamically for each metadata item, taking into account the confidence value and other characteristics. The proposed semantic analysis follows two hypotheses: metadata items of a context should be processed in descendent order of their confidence value, and the metadata that pertains to a context should be limited by content-based segmentation boundaries. The evaluation results support the proposed hypotheses and show increased recall and precision for annotated entities, especially for metadata that originates from sources with low reliability. The algorithms have been evaluated against several state-of-the-art annotation approaches. The presented semantic analysis process is integrated into a video analysis framework and has been successfully applied in several projects for the purpose of semantic video exploration of videos.
This work introduces concepts and corresponding tool support to enable a complementary approach in dealing with recovery. Programmers need to recover a development state, or a part thereof, when previously made changes reveal undesired implications. However, when the need arises suddenly and unexpectedly, recovery often involves expensive and tedious work. To avoid tedious work, literature recommends keeping away from unexpected recovery demands by following a structured and disciplined approach, which consists of the application of various best practices including working only on one thing at a time, performing small steps, as well as making proper use of versioning and testing tools. However, the attempt to avoid unexpected recovery is both time-consuming and error-prone. On the one hand, it requires disproportionate effort to minimize the risk of unexpected situations. On the other hand, applying recommended practices selectively, which saves time, can hardly avoid recovery. In addition, the constant need for foresight and self-control has unfavorable implications. It is exhaustive and impedes creative problem solving. This work proposes to make recovery fast and easy and introduces corresponding support called CoExist. Such dedicated support turns situations of unanticipated recovery from tedious experiences into pleasant ones. It makes recovery fast and easy to accomplish, even if explicit commits are unavailable or tests have been ignored for some time. When mistakes and unexpected insights are no longer associated with tedious corrective actions, programmers are encouraged to change source code as a means to reason about it, as opposed to making changes only after structuring and evaluating them mentally. This work further reports on an implementation of the proposed tool support in the Squeak/Smalltalk development environment. The development of the tools has been accompanied by regular performance and usability tests. In addition, this work investigates whether the proposed tools affect programmers’ performance. In a controlled lab study, 22 participants improved the design of two different applications. Using a repeated measurement setup, the study examined the effect of providing CoExist on programming performance. The result of analyzing 88 hours of programming suggests that built-in recovery support as provided with CoExist positively has a positive effect on programming performance in explorative programming tasks.
Business process models are abstractions of concrete operational procedures that occur in the daily business of organizations. To cope with the complexity of these models, business process model abstraction has been introduced recently. Its goal is to derive from a detailed process model several abstract models that provide a high-level understanding of the process. While techniques for constructing abstract models are reported in the literature, little is known about the relationships between process instances and abstract models. In this paper we show how the state of an abstract activity can be calculated from the states of related, detailed process activities as they happen. The approach uses activity state propagation. With state uniqueness and state transition correctness we introduce formal properties that improve the understanding of state propagation. Algorithms to check these properties are devised. Finally, we use behavioral profiles to identify and classify behavioral inconsistencies in abstract process models that might occur, once activity state propagation is used.
Business process management experiences a large uptake by the industry, and process models play an important role in the analysis and improvement of processes. While an increasing number of staff becomes involved in actual modeling practice, it is crucial to assure model quality and homogeneity along with providing suitable aids for creating models. In this paper we consider the problem of offering recommendations to the user during the act of modeling. Our key contribution is a concept for defining and identifying so-called action patterns - chunks of actions often appearing together in business processes. In particular, we specify action patterns and demonstrate how they can be identified from existing process model repositories using association rule mining techniques. Action patterns can then be used to suggest additional actions for a process model. Our approach is challenged by applying it to the collection of process models from the SAP Reference Model.
Business process management aims at capturing, understanding, and improving work in organizations. The central artifacts are process models, which serve different purposes. Detailed process models are used to analyze concrete working procedures, while high-level models show, for instance, handovers between departments. To provide different views on process models, business process model abstraction has emerged. While several approaches have been proposed, a number of abstraction use case that are both relevant for industry and scientifically challenging are yet to be addressed. In this paper we systematically develop, classify, and consolidate different use cases for business process model abstraction. The reported work is based on a study with BPM users in the health insurance sector and validated with a BPM consultancy company and a large BPM vendor. The identified fifteen abstraction use cases reflect the industry demand. The related work on business process model abstraction is evaluated against the use cases, which leads to a research agenda.
Business process models are used within a range of organizational initiatives, where every stakeholder has a unique perspective on a process and demands the respective model. As a consequence, multiple process models capturing the very same business process coexist. Keeping such models in sync is a challenge within an ever changing business environment: once a process is changed, all its models have to be updated. Due to a large number of models and their complex relations, model maintenance becomes error-prone and expensive. Against this background, business process model abstraction emerged as an operation reducing the number of stored process models and facilitating model management. Business process model abstraction is an operation preserving essential process properties and leaving out insignificant details in order to retain information relevant for a particular purpose. Process model abstraction has been addressed by several researchers. The focus of their studies has been on particular use cases and model transformations supporting these use cases. This thesis systematically approaches the problem of business process model abstraction shaping the outcome into a framework. We investigate the current industry demand in abstraction summarizing it in a catalog of business process model abstraction use cases. The thesis focuses on one prominent use case where the user demands a model with coarse-grained activities and overall process ordering constraints. We develop model transformations that support this use case starting with the transformations based on process model structure analysis. Further, abstraction methods considering the semantics of process model elements are investigated. First, we suggest how semantically related activities can be discovered in process models-a barely researched challenge. The thesis validates the designed abstraction methods against sets of industrial process models and discusses the method implementation aspects. Second, we develop a novel model transformation, which combined with the related activity discovery allows flexible non-hierarchical abstraction. In this way this thesis advocates novel model transformations that facilitate business process model management and provides the foundations for innovative tool support.
In today's world, many applications produce large amounts of data at an enormous rate. Analyzing such datasets for metadata is indispensable for effectively understanding, storing, querying, manipulating, and mining them. Metadata summarizes technical properties of a dataset which rang from basic statistics to complex structures describing data dependencies. One type of dependencies is inclusion dependency (IND), which expresses subset-relationships between attributes of datasets. Therefore, inclusion dependencies are important for many data management applications in terms of data integration, query optimization, schema redesign, or integrity checking. So, the discovery of inclusion dependencies in unknown or legacy datasets is at the core of any data profiling effort.
For exhaustively detecting all INDs in large datasets, we developed S-indd++, a new algorithm that eliminates the shortcomings of existing IND-detection algorithms and significantly outperforms them. S-indd++ is based on a novel concept for the attribute clustering for efficiently deriving INDs. Inferring INDs from our attribute clustering eliminates all redundant operations caused by other algorithms. S-indd++ is also based on a novel partitioning strategy that enables discording a large number of candidates in early phases of the discovering process. Moreover, S-indd++ does not require to fit a partition into the main memory--this is a highly appreciable property in the face of ever-growing datasets. S-indd++ reduces up to 50% of the runtime of the state-of-the-art approach.
None of the approach for discovering INDs is appropriate for the application on dynamic datasets; they can not update the INDs after an update of the dataset without reprocessing it entirely. To this end, we developed the first approach for incrementally updating INDs in frequently changing datasets. We achieved that by reducing the problem of incrementally updating INDs to the incrementally updating the attribute clustering from which all INDs are efficiently derivable. We realized the update of the clusters by designing new operations to be applied to the clusters after every data update. The incremental update of INDs reduces the time of the complete rediscovery by up to 99.999%.
All existing algorithms for discovering n-ary INDs are based on the principle of candidate generation--they generate candidates and test their validity in the given data instance. The major disadvantage of this technique is the exponentially growing number of database accesses in terms of SQL queries required for validation. We devised Mind2, the first approach for discovering n-ary INDs without candidate generation. Mind2 is based on a new mathematical framework developed in this thesis for computing the maximum INDs from which all other n-ary INDs are derivable. The experiments showed that Mind2 is significantly more scalable and effective than hypergraph-based algorithms.
Geospatial data has become a natural part of a growing number of information systems and services in the economy, society, and people's personal lives. In particular, virtual 3D city and landscape models constitute valuable information sources within a wide variety of applications such as urban planning, navigation, tourist information, and disaster management. Today, these models are often visualized in detail to provide realistic imagery. However, a photorealistic rendering does not automatically lead to high image quality, with respect to an effective information transfer, which requires important or prioritized information to be interactively highlighted in a context-dependent manner.
Approaches in non-photorealistic renderings particularly consider a user's task and camera perspective when attempting optimal expression, recognition, and communication of important or prioritized information. However, the design and implementation of non-photorealistic rendering techniques for 3D geospatial data pose a number of challenges, especially when inherently complex geometry, appearance, and thematic data must be processed interactively. Hence, a promising technical foundation is established by the programmable and parallel computing architecture of graphics processing units.
This thesis proposes non-photorealistic rendering techniques that enable both the computation and selection of the abstraction level of 3D geospatial model contents according to user interaction and dynamically changing thematic information. To achieve this goal, the techniques integrate with hardware-accelerated rendering pipelines using shader technologies of graphics processing units for real-time image synthesis. The techniques employ principles of artistic rendering, cartographic generalization, and 3D semiotics—unlike photorealistic rendering—to synthesize illustrative renditions of geospatial feature type entities such as water surfaces, buildings, and infrastructure networks. In addition, this thesis contributes a generic system that enables to integrate different graphic styles—photorealistic and non-photorealistic—and provide their seamless transition according to user tasks, camera view, and image resolution.
Evaluations of the proposed techniques have demonstrated their significance to the field of geospatial information visualization including topics such as spatial perception, cognition, and mapping. In addition, the applications in illustrative and focus+context visualization have reflected their potential impact on optimizing the information transfer regarding factors such as cognitive load, integration of non-realistic information, visualization of uncertainty, and visualization on small displays.
Nowadays, model-driven engineering (MDE) promises to ease software development by decreasing the inherent complexity of classical software development. In order to deliver on this promise, MDE increases the level of abstraction and automation, through a consideration of domain-specific models (DSMs) and model operations (e.g. model transformations or code generations). DSMs conform to domain-specific modeling languages (DSMLs), which increase the level of abstraction, and model operations are first-class entities of software development because they increase the level of automation. Nevertheless, MDE has to deal with at least two new dimensions of complexity, which are basically caused by the increased linguistic and technological heterogeneity. The first dimension of complexity is setting up an MDE environment, an activity comprised of the implementation or selection of DSMLs and model operations. Setting up an MDE environment is both time-consuming and error-prone because of the implementation or adaptation of model operations. The second dimension of complexity is concerned with applying MDE for actual software development. Applying MDE is challenging because a collection of DSMs, which conform to potentially heterogeneous DSMLs, are required to completely specify a complex software system. A single DSML can only be used to describe a specific aspect of a software system at a certain level of abstraction and from a certain perspective. Additionally, DSMs are usually not independent but instead have inherent interdependencies, reflecting (partial) similar aspects of a software system at different levels of abstraction or from different perspectives. A subset of these dependencies are applications of various model operations, which are necessary to keep the degree of automation high. This becomes even worse when addressing the first dimension of complexity. Due to continuous changes, all kinds of dependencies, including the applications of model operations, must also be managed continuously. This comprises maintaining the existence of these dependencies and the appropriate (re-)application of model operations. The contribution of this thesis is an approach that combines traceability and model management to address the aforementioned challenges of configuring and applying MDE for software development. The approach is considered as a traceability approach because it supports capturing and automatically maintaining dependencies between DSMs. The approach is considered as a model management approach because it supports managing the automated (re-)application of heterogeneous model operations. In addition, the approach is considered as a comprehensive model management. Since the decomposition of model operations is encouraged to alleviate the first dimension of complexity, the subsequent composition of model operations is required to counteract their fragmentation. A significant portion of this thesis concerns itself with providing a method for the specification of decoupled yet still highly cohesive complex compositions of heterogeneous model operations. The approach supports two different kinds of compositions - data-flow compositions and context compositions. Data-flow composition is used to define a network of heterogeneous model operations coupled by sharing input and output DSMs alone. Context composition is related to a concept used in declarative model transformation approaches to compose individual model transformation rules (units) at any level of detail. In this thesis, context composition provides the ability to use a collection of dependencies as context for the composition of other dependencies, including model operations. In addition, the actual implementation of model operations, which are going to be composed, do not need to implement any composition concerns. The approach is realized by means of a formalism called an executable and dynamic hierarchical megamodel, based on the original idea of megamodels. This formalism supports specifying compositions of dependencies (traceability and model operations). On top of this formalism, traceability is realized by means of a localization concept, and model management by means of an execution concept.
Cost models are an essential part of database systems, as they are the basis of query performance optimization. Based on predictions made by cost models, the fastest query execution plan can be chosen and executed or algorithms can be tuned and optimised. In-memory databases shifts the focus from disk to main memory accesses and CPU costs, compared to disk based systems where input and output costs dominate the overall costs and other processing costs are often neglected. However, modelling memory accesses is fundamentally different and common models do not apply anymore. This work presents a detailed parameter evaluation for the plan operators scan with equality selection, scan with range selection, positional lookup and insert in in-memory column stores. Based on this evaluation, a cost model based on cache misses for estimating the runtime of the considered plan operators using different data structures is developed. Considered are uncompressed columns, bit compressed and dictionary encoded columns with sorted and unsorted dictionaries. Furthermore, tree indices on the columns and dictionaries are discussed. Finally, partitioned columns consisting of one partition with a sorted and one with an unsorted dictionary are investigated. New values are inserted in the unsorted dictionary partition and moved periodically by a merge process to the sorted partition. An efficient attribute merge algorithm is described, supporting the update performance required to run enterprise applications on read-optimised databases. Further, a memory traffic based cost model for the merge process is provided.
Transmorphic
(2016)
Defining Graphical User Interfaces (GUIs) through functional abstractions can reduce the complexity that arises from mutable abstractions. Recent examples, such as Facebook's React GUI framework have shown, how modelling the view as a functional projection from the application state to a visual representation can reduce the number of interacting objects and thus help to improve the reliabiliy of the system. This however comes at the price of a more rigid, functional framework where programmers are forced to express visual entities with functional abstractions, detached from the way one intuitively thinks about the physical world.
In contrast to that, the GUI Framework Morphic allows interactions in the graphical domain, such as grabbing, dragging or resizing of elements to evolve an application at runtime, providing liveness and directness in the development workflow. Modelling each visual entity through mutable abstractions however makes it difficult to ensure correctness when GUIs start to grow more complex. Furthermore, by evolving morphs at runtime through direct manipulation we diverge more and more from the symbolic description that corresponds to the morph. Given that both of these approaches have their merits and problems, is there a way to combine them in a meaningful way that preserves their respective benefits?
As a solution for this problem, we propose to lift Morphic's concept of direct manipulation from the mutation of state to the transformation of source code. In particular, we will explore the design, implementation and integration of a bidirectional mapping between the graphical representation and a functional and declarative symbolic description of a graphical user interface within a self hosted development environment. We will present Transmorphic, a functional take on the Morphic GUI Framework, where the visual and structural properties of morphs are defined in a purely functional, declarative fashion. In Transmorphic, the developer is able to assemble different morphs at runtime through direct manipulation which is automatically translated into changes in the code of the application. In this way, the comprehensiveness and predictability of direct manipulation can be used in the context of a purely functional GUI, while the effects of the manipulation are reflected in a medium that is always in reach for the programmer and can even be used to incorporate the source transformations into the source files of the application.
Graphs are ubiquitous in Computer Science. For this reason, in many areas, it is very important to have the means to express and reason about graph properties. In particular, we want to be able to check automatically if a given graph property is satisfiable. Actually, in most application scenarios it is desirable to be able to explore graphs satisfying the graph property if they exist or even to get a complete and compact overview of the graphs satisfying the graph property.
We show that the tableau-based reasoning method for graph properties as introduced by Lambers and Orejas paves the way for a symbolic model generation algorithm for graph properties. Graph properties are formulated in a dedicated logic making use of graphs and graph morphisms, which is equivalent to firstorder logic on graphs as introduced by Courcelle. Our parallelizable algorithm gradually generates a finite set of so-called symbolic models, where each symbolic model describes a set of finite graphs (i.e., finite models) satisfying the graph property. The set of symbolic models jointly describes all finite models for the graph property (complete) and does not describe any finite graph violating the graph property (sound). Moreover, no symbolic model is already covered by another one (compact). Finally, the algorithm is able to generate from each symbolic model a minimal finite model immediately and allows for an exploration of further finite models. The algorithm is implemented in the new tool AutoGraph.
Parts without a whole?
(2015)
This explorative study gives a descriptive overview of what organizations do and experience when they say they practice design thinking. It looks at how the concept has been appropriated in organizations and also describes patterns of design thinking adoption. The authors use a mixed-method research design fed by two sources: questionnaire data and semi-structured personal expert interviews. The study proceeds in six parts: (1) design thinking¹s entry points into organizations; (2) understandings of the descriptor; (3) its fields of application and organizational localization; (4) its perceived impact; (5) reasons for its discontinuation or failure; and (6) attempts to measure its success. In conclusion the report challenges managers to be more conscious of their current design thinking practice. The authors suggest a co-evolution of the concept¹s introduction with innovation capability building and the respective changes in leadership approaches. It is argued that this might help in unfolding design thinking¹s hidden potentials as well as preventing unintended side-effects such as discontented teams or the dwindling authority of managers.
Companies strive to improve their business processes in order to remain competitive. Process mining aims to infer meaningful insights from process-related data and attracted the attention of practitioners, tool-vendors, and researchers in recent years. Traditionally, event logs are assumed to describe the as-is situation. But this is not necessarily the case in environments where logging may be compromised due to manual logging. For example, hospital staff may need to manually enter information regarding the patient’s treatment. As a result, events or timestamps may be missing or incorrect. In this paper, we make use of process knowledge captured in process models, and provide a method to repair missing events in the logs. This way, we facilitate analysis of incomplete logs. We realize the repair by combining stochastic Petri nets, alignments, and Bayesian networks. We evaluate the results using both synthetic data and real event data from a Dutch hospital.
Organizations try to gain competitive advantages, and to increase customer satisfaction. To ensure the quality and efficiency of their business processes, they perform business process management. An important part of process management that happens on the daily operational level is process controlling. A prerequisite of controlling is process monitoring, i.e., keeping track of the performed activities in running process instances. Only by process monitoring can business analysts detect delays and react to deviations from the expected or guaranteed performance of a process instance. To enable monitoring, process events need to be collected from the process environment. When a business process is orchestrated by a process execution engine, monitoring is available for all orchestrated process activities. Many business processes, however, do not lend themselves to automatic orchestration, e.g., because of required freedom of action. This situation is often encountered in hospitals, where most business processes are manually enacted. Hence, in practice it is often inefficient or infeasible to document and monitor every process activity. Additionally, manual process execution and documentation is prone to errors, e.g., documentation of activities can be forgotten. Thus, organizations face the challenge of process events that occur, but are not observed by the monitoring environment. These unobserved process events can serve as basis for operational process decisions, even without exact knowledge of when they happened or when they will happen. An exemplary decision is whether to invest more resources to manage timely completion of a case, anticipating that the process end event will occur too late. This thesis offers means to reason about unobserved process events in a probabilistic way. We address decisive questions of process managers (e.g., "when will the case be finished?", or "when did we perform the activity that we forgot to document?") in this thesis. As main contribution, we introduce an advanced probabilistic model to business process management that is based on a stochastic variant of Petri nets. We present a holistic approach to use the model effectively along the business process lifecycle. Therefore, we provide techniques to discover such models from historical observations, to predict the termination time of processes, and to ensure quality by missing data management. We propose mechanisms to optimize configuration for monitoring and prediction, i.e., to offer guidance in selecting important activities to monitor. An implementation is provided as a proof of concept. For evaluation, we compare the accuracy of the approach with that of state-of-the-art approaches using real process data of a hospital. Additionally, we show its more general applicability in other domains by applying the approach on process data from logistics and finance.
Contents: Artem Polyvanny, Sergey Smirnow, and Mathias Weske The Triconnected Abstraction of Process Models 1 Introduction 2 Business Process Model Abstraction 3 Preliminaries 4 Triconnected Decomposition 4.1 Basic Approach for Process Component Discovery 4.2 SPQR-Tree Decomposition 4.3 SPQR-Tree Fragments in the Context of Process Models 5 Triconnected Abstraction 5.1 Abstraction Rules 5.2 Abstraction Algorithm 6 Related Work and Conclusions
This contribution presents a quantitative evaluation procedure for Information Retrieval models and the results of this procedure applied on the enhanced Topic-based Vector Space Model (eTVSM). Since the eTVSM is an ontology-based model, its effectiveness heavily depends on the quality of the underlaying ontology. Therefore the model has been tested with different ontologies to evaluate the impact of those ontologies on the effectiveness of the eTVSM. On the highest level of abstraction, the following results have been observed during our evaluation: First, the theoretically deduced statement that the eTVSM has a similar effecitivity like the classic Vector Space Model if a trivial ontology (every term is a concept and it is independet of any other concepts) is used has been approved. Second, we were able to show that the effectiveness of the eTVSM raises if an ontology is used which is only able to resolve synonyms. We were able to derive such kind of ontology automatically from the WordNet ontology. Third, we observed that more powerful ontologies automatically derived from the WordNet, dramatically dropped the effectiveness of the eTVSM model even clearly below the effectiveness level of the Vector Space Model. Fourth, we were able to show that a manually created and optimized ontology is able to raise the effectiveness of the eTVSM to a level which is clearly above the best effectiveness levels we have found in the literature for the Latent Semantic Index model with compareable document sets.
Structuring process models
(2012)
One can fairly adopt the ideas of Donald E. Knuth to conclude that process modeling is both a science and an art. Process modeling does have an aesthetic sense. Similar to composing an opera or writing a novel, process modeling is carried out by humans who undergo creative practices when engineering a process model. Therefore, the very same process can be modeled in a myriad number of ways. Once modeled, processes can be analyzed by employing scientific methods. Usually, process models are formalized as directed graphs, with nodes representing tasks and decisions, and directed arcs describing temporal constraints between the nodes. Common process definition languages, such as Business Process Model and Notation (BPMN) and Event-driven Process Chain (EPC) allow process analysts to define models with arbitrary complex topologies. The absence of structural constraints supports creativity and productivity, as there is no need to force ideas into a limited amount of available structural patterns. Nevertheless, it is often preferable that models follow certain structural rules. A well-known structural property of process models is (well-)structuredness. A process model is (well-)structured if and only if every node with multiple outgoing arcs (a split) has a corresponding node with multiple incoming arcs (a join), and vice versa, such that the set of nodes between the split and the join induces a single-entry-single-exit (SESE) region; otherwise the process model is unstructured. The motivations for well-structured process models are manifold: (i) Well-structured process models are easier to layout for visual representation as their formalizations are planar graphs. (ii) Well-structured process models are easier to comprehend by humans. (iii) Well-structured process models tend to have fewer errors than unstructured ones and it is less probable to introduce new errors when modifying a well-structured process model. (iv) Well-structured process models are better suited for analysis with many existing formal techniques applicable only for well-structured process models. (v) Well-structured process models are better suited for efficient execution and optimization, e.g., when discovering independent regions of a process model that can be executed concurrently. Consequently, there are process modeling languages that encourage well-structured modeling, e.g., Business Process Execution Language (BPEL) and ADEPT. However, the well-structured process modeling implies some limitations: (i) There exist processes that cannot be formalized as well-structured process models. (ii) There exist processes that when formalized as well-structured process models require a considerable duplication of modeling constructs. Rather than expecting well-structured modeling from start, we advocate for the absence of structural constraints when modeling. Afterwards, automated methods can suggest, upon request and whenever possible, alternative formalizations that are "better" structured, preferably well-structured. In this thesis, we study the problem of automatically transforming process models into equivalent well-structured models. The developed transformations are performed under a strong notion of behavioral equivalence which preserves concurrency. The findings are implemented in a tool, which is publicly available.
The correction of software failures tends to be very cost-intensive because their debugging is an often time-consuming development activity. During this activity, developers largely attempt to understand what causes failures: Starting with a test case that reproduces the observable failure they have to follow failure causes on the infection chain back to the root cause (defect). This idealized procedure requires deep knowledge of the system and its behavior because failures and defects can be far apart from each other. Unfortunately, common debugging tools are inadequate for systematically investigating such infection chains in detail. Thus, developers have to rely primarily on their intuition and the localization of failure causes is not time-efficient. To prevent debugging by disorganized trial and error, experienced developers apply the scientific method and its systematic hypothesis-testing. However, even when using the scientific method, the search for failure causes can still be a laborious task. First, lacking expertise about the system makes it hard to understand incorrect behavior and to create reasonable hypotheses. Second, contemporary debugging approaches provide no or only partial support for the scientific method. In this dissertation, we present test-driven fault navigation as a debugging guide for localizing reproducible failures with the scientific method. Based on the analysis of passing and failing test cases, we reveal anomalies and integrate them into a breadth-first search that leads developers to defects. This systematic search consists of four specific navigation techniques that together support the creation, evaluation, and refinement of failure cause hypotheses for the scientific method. First, structure navigation localizes suspicious system parts and restricts the initial search space. Second, team navigation recommends experienced developers for helping with failures. Third, behavior navigation allows developers to follow emphasized infection chains back to root causes. Fourth, state navigation identifies corrupted state and reveals parts of the infection chain automatically. We implement test-driven fault navigation in our Path Tools framework for the Squeak/Smalltalk development environment and limit its computation cost with the help of our incremental dynamic analysis. This lightweight dynamic analysis ensures an immediate debugging experience with our tools by splitting the run-time overhead over multiple test runs depending on developers’ needs. Hence, our test-driven fault navigation in combination with our incremental dynamic analysis answers important questions in a short time: where to start debugging, who understands failure causes best, what happened before failures, and which state properties are infected.
There are two common approaches to implement a virtual machine (VM) for a dynamic object-oriented language. On the one hand, it can be implemented in a C-like language for best performance and maximum control over the resulting executable. On the other hand, it can be implemented in a language such as Java that allows for higher-level abstractions. These abstractions, such as proper object-oriented modularization, automatic memory management, or interfaces, are missing in C-like languages but they can simplify the implementation of prevalent but complex concepts in VMs, such as garbage collectors (GCs) or just-in-time compilers (JITs). Yet, the implementation of a dynamic object-oriented language in Java eventually results in two VMs on top of each other (double stack), which impedes performance. For statically typed languages, the Maxine VM solves this problem; it is written in Java but can be executed without a Java virtual machine (JVM). However, it is currently not possible to execute dynamic object-oriented languages in Maxine. This work presents an approach to bringing object models and execution models of dynamic object-oriented languages to the Maxine VM and the application of this approach to Squeak/Smalltalk. The representation of objects in and the execution of dynamic object-oriented languages pose certain challenges to the Maxine VM that lacks certain variation points necessary to enable an effortless and straightforward implementation of dynamic object-oriented languages' execution models. The implementation of Squeak/Smalltalk in Maxine as a feasibility study is to unveil such missing variation points.
Large open-source software projects involve developers with a wide variety of backgrounds and expertise. Such software projects furthermore include many internal APIs that developers must understand and use properly. According to the intended purpose of these APIs, they are more or less frequently used, and used by developers with more or less expertise. In this paper, we study the impact of usage patterns and developer expertise on the rate of defects occurring in the use of internal APIs. For this preliminary study, we focus on memory management APIs in the Linux kernel, as the use of these has been shown to be highly error prone in previous work. We study defect rates and developer expertise, to consider e.g., whether widely used APIs are more defect prone because they are used by less experienced developers, or whether defects in widely used APIs are more likely to be fixed.
Dynamics in urban environments encompasses complex processes and phenomena such as related to movement (e.g.,traffic, people) and development (e.g., construction, settlement). This paper presents novel methods for creating human-centric illustrative maps for visualizing the movement dynamics in virtual 3D environments. The methods allow a viewer to gain rapid insight into traffic density and flow. The illustrative maps represent vehicle behavior as light threads. Light threads are a familiar visual metaphor caused by moving light sources producing streaks in a long-exposure photograph. A vehicle’s front and rear lights produce light threads that convey its direction of motion as well as its velocity and acceleration. The accumulation of light threads allows a viewer to quickly perceive traffic flow and density. The light-thread technique is a key element to effective visualization systems for analytic reasoning, exploration, and monitoring of geospatial processes.
IT systems for healthcare are a complex and exciting field. One the one hand, there is a vast number of improvements and work alleviations that computers can bring to everyday healthcare. Some ways of treatment, diagnoses and organisational tasks were even made possible by computer usage in the first place. On the other hand, there are many factors that encumber computer usage and make development of IT systems for healthcare a challenging, sometimes even frustrating task. These factors are not solely technology-related, but just as well social or economical conditions. This report describes some of the idiosyncrasies of IT systems in the healthcare domain, with a special focus on legal regulations, standards and security.
Personal fabrication tools, such as 3D printers, are on the way of enabling a future in which non-technical users will be able to create custom objects. However, while the hardware is there, the current interaction model behind existing design tools is not suitable for non-technical users. Today, 3D printers are operated by fabricating the object in one go, which tends to take overnight due to the slow 3D printing technology. Consequently, the current interaction model requires users to think carefully before printing as every mistake may imply another overnight print. Planning every step ahead, however, is not feasible for non-technical users as they lack the experience to reason about the consequences of their design decisions.
In this dissertation, we propose changing the interaction model around personal fabrication tools to better serve this user group. We draw inspiration from personal computing and argue that the evolution of personal fabrication may resemble the evolution of personal computing: Computing started with machines that executed a program in one go before returning the result to the user. By decreasing the interaction unit to single requests, turn-taking systems such as the command line evolved, which provided users with feedback after every input. Finally, with the introduction of direct-manipulation interfaces, users continuously interacted with a program receiving feedback about every action in real-time. In this dissertation, we explore whether these interaction concepts can be applied to personal fabrication as well.
We start with fabricating an object in one go and investigate how to tighten the feedback-cycle on an object-level: We contribute a method called low-fidelity fabrication, which saves up to 90% fabrication time by creating objects as fast low-fidelity previews, which are sufficient to evaluate key design aspects. Depending on what is currently being tested, we propose different conversions that enable users to focus on different parts: faBrickator allows for a modular design in the early stages of prototyping; when users move on WirePrint allows quickly testing an object's shape, while Platener allows testing an object's technical function. We present an interactive editor for each technique and explain the underlying conversion algorithms.
By interacting on smaller units, such as a single element of an object, we explore what it means to transition from systems that fabricate objects in one go to turn-taking systems. We start with a 2D system called constructable: Users draw with a laser pointer onto the workpiece inside a laser cutter. The drawing is captured with an overhead camera. As soon as the the user finishes drawing an element, such as a line, the constructable system beautifies the path and cuts it--resulting in physical output after every editing step. We extend constructable towards 3D editing by developing a novel laser-cutting technique for 3D objects called LaserOrigami that works by heating up the workpiece with the defocused laser until the material becomes compliant and bends down under gravity. While constructable and LaserOrigami allow for fast physical feedback, the interaction is still best described as turn-taking since it consists of two discrete steps: users first create an input and afterwards the system provides physical output.
By decreasing the interaction unit even further to a single feature, we can achieve real-time physical feedback: Input by the user and output by the fabrication device are so tightly coupled that no visible lag exists. This allows us to explore what it means to transition from turn-taking interfaces, which only allow exploring one option at a time, to direct manipulation interfaces with real-time physical feedback, which allow users to explore the entire space of options continuously with a single interaction. We present a system called FormFab, which allows for such direct control. FormFab is based on the same principle as LaserOrigami: It uses a workpiece that when warmed up becomes compliant and can be reshaped. However, FormFab achieves the reshaping not based on gravity, but through a pneumatic system that users can control interactively. As users interact, they see the shape change in real-time.
We conclude this dissertation by extrapolating the current evolution into a future in which large numbers of people use the new technology to create objects. We see two additional challenges on the horizon: sustainability and intellectual property. We investigate sustainability by demonstrating how to print less and instead patch physical objects. We explore questions around intellectual property with a system called Scotty that transfers objects without creating duplicates, thereby preserving the designer's copyright.
Process models specify behavioral execution constraints between activities as well as between activities and data objects. A data object is characterized by its states and state transitions represented as object life cycle. For process execution, all behavioral execution constraints must be correct. Correctness can be verified via soundness checking which currently only considers control flow information. For data correctness, conformance between a process model and its object life cycles is checked. Current approaches abstract from dependencies between multiple data objects and require fully specified process models although, in real-world process repositories, often underspecified models are found. Coping with these issues, we introduce the concept of synchronized object life cycles and we define a mapping of data constraints of a process model to Petri nets extending an existing mapping. Further, we apply the notion of weak conformance to process models to tell whether each time an activity needs to access a data object in a particular state, it is guaranteed that the data object is in or can reach the expected state. Then, we introduce an algorithm for an integrated verification of control flow correctness and weak data conformance using soundness checking.
Enacting business processes in process engines requires the coverage of control flow, resource assignments, and process data. While the first two aspects are well supported in current process engines, data dependencies need to be added and maintained manually by a process engineer. Thus, this task is error-prone and time-consuming. In this report, we address the problem of modeling processes with complex data dependencies, e.g., m:n relationships, and their automatic enactment from process models. First, we extend BPMN data objects with few annotations to allow data dependency handling as well as data instance differentiation. Second, we introduce a pattern-based approach to derive SQL queries from process models utilizing the above mentioned extensions. Therewith, we allow automatic enactment of data-aware BPMN process models. We implemented our approach for the Activiti process engine to show applicability.
Business process management (BPM) is a systematic and structured approach to model, analyze, control, and execute business operations also referred to as business processes that get carried out to achieve business goals. Central to BPM are conceptual models. Most prominently, process models describe which tasks are to be executed by whom utilizing which information to reach a business goal. Process models generally cover the perspectives of control flow, resource, data flow, and information systems.
Execution of business processes leads to the work actually being carried out. Automating them increases the efficiency and is usually supported by process engines. This, though, requires the coverage of control flow, resource assignments, and process data. While the first two perspectives are well supported in current process engines, data handling needs to be implemented and maintained manually. However, model-driven data handling promises to ease implementation, reduces the error-proneness through graphical visualization, and reduces development efforts through code generation.
This thesis addresses the modeling, analysis, and execution of data in business processes and presents a novel approach to execute data-annotated process models entirely model-driven. As a first step and formal grounding for the process execution, a conceptual framework for the integration of processes and data is introduced. This framework is complemented by operational semantics through a Petri net mapping extended with data considerations. Model-driven data execution comprises the handling of complex data dependencies, process data, and data exchange in case of communication between multiple process participants. This thesis introduces concepts from the database domain into BPM to enable the distinction of data operations, to specify relations between data objects of the same as well as of different types, to correlate modeled data nodes as well as received messages to the correct run-time process instances, and to generate messages for inter-process communication. The underlying approach, which is not limited to a particular process description language, has been implemented as proof-of-concept.
Automation of data handling in business processes requires data-annotated and correct process models. Targeting the former, algorithms are introduced to extract information about data nodes, their states, and data dependencies from control information and to annotate the process model accordingly. Usually, not all required information can be extracted from control flow information, since some data manipulations are not specified. This requires further refinement of the process model. Given a set of object life cycles specifying allowed data manipulations, automated refinement of the process model towards containment of all data manipulations is enabled. Process models are an abstraction focusing on specific aspects in detail, e.g., the control flow and the data flow views are often represented through activity-centric and object-centric process models. This thesis introduces algorithms for roundtrip transformations enabling the stakeholder to add information to the process model in the view being most appropriate.
Targeting process model correctness, this thesis introduces the notion of weak conformance that checks for consistency between given object life cycles and the process model such that the process model may only utilize data manipulations specified directly or indirectly in an object life cycle. The notion is computed via soundness checking of a hybrid representation integrating control flow and data flow correctness checking. Making a process model executable, identified violations must be corrected. Therefore, an approach is proposed that identifies for each violation multiple, alternative changes to the process model or the object life cycles.
Utilizing the results of this thesis, business processes can be executed entirely model-driven from the data perspective in addition to the control flow and resource perspectives already supported before. Thereby, the model creation is supported by algorithms partly automating the creation process while model consistency is ensured by data correctness checks.
Service-oriented Architectures (SOA) facilitate the provision and orchestration of business services to enable a faster adoption to changing business demands. Web Services provide a technical foundation to implement this paradigm on the basis of XML-messaging. However, the enhanced flexibility of message-based systems comes along with new threats and risks. To face these issues, a variety of security mechanisms and approaches is supported by the Web Service specifications. The usage of these security mechanisms and protocols is configured by stating security requirements in security policies. However, security policy languages for SOA are complex and difficult to create due to the expressiveness of these languages. To facilitate and simplify the creation of security policies, this thesis presents a model-driven approach that enables the generation of complex security policies on the basis of simple security intentions. SOA architects can specify these intentions in system design models and are not required to deal with complex technical security concepts. The approach introduced in this thesis enables the enhancement of any system design modelling languages – for example FMC or BPMN – with security modelling elements. The syntax, semantics, and notion of these elements is defined by our security modelling language SecureSOA. The metamodel of this language provides extension points to enable the integration into system design modelling languages. In particular, this thesis demonstrates the enhancement of FMC block diagrams with SecureSOA. To enable the model-driven generation of security policies, a domain-independent policy model is introduced in this thesis. This model provides an abstraction layer for security policies. Mappings are used to perform the transformation from our model to security policy languages. However, expert knowledge is required to generate instances of this model on the basis of simple security intentions. Appropriate security mechanisms, protocols and options must be chosen and combined to fulfil these security intentions. In this thesis, a formalised system of security patterns is used to represent this knowledge and to enable an automated transformation process. Moreover, a domain-specific language is introduced to state security patterns in an accessible way. On the basis of this language, a system of security configuration patterns is provided to transform security intentions related to data protection and identity management. The formal semantics of the security pattern language enable the verification of the transformation process introduced in this thesis and prove the correctness of the pattern application. Finally, our SOA Security LAB is presented that demonstrates the application of our model-driven approach to facilitate a dynamic creation, configuration, and execution of secure Web Service-based composed applications.
The new interactive online educational platform openHPI, (https://openHPI.de) from Hasso Plattner Institute (HPI), offers freely accessible courses at no charge for all who are interested in subjects in the field of information technology and computer science. Since 2011, “Massive Open Online Courses,” called MOOCs for short, have been offered, first at Stanford University and then later at other U.S. elite universities. Following suit, openHPI provides instructional videos on the Internet and further reading material, combined with learning-supportive self-tests, homework and a social discussion forum. Education is further stimulated by the support of a virtual learning community. In contrast to “traditional” lecture platforms, such as the tele-TASK portal (http://www.tele-task.de) where multimedia recorded lectures are available on demand, openHPI offers didactic online courses. The courses have a fixed start date and offer a balanced schedule of six consecutive weeks presented in multimedia and, whenever possible, interactive learning material. Each week, one chapter of the course subject is treated. In addition, a series of learning videos, texts, self-tests and homework exercises are provided to course participants at the beginning of the week. The course offering is combined with a social discussion platform where participants have the opportunity to enter into an exchange with course instructors and fellow participants. Here, for example, they can get answers to questions and discuss the topics in depth. The participants naturally decide themselves about the type and range of their learning activities. They can make personal contributions to the course, for example, in blog posts or tweets, which they can refer to in the forum. In turn, other participants have the chance to comment on, discuss or expand on what has been said. In this way, the learners become the teachers and the subject matter offered to a virtual community is linked to a social learning network.
Erster Deutscher IPv6 Gipfel
(2008)
Inhalt: KOMMUNIQUÉ GRUßWORT PROGRAMM HINTERGRÜNDE UND FAKTEN REFERENTEN: BIOGRAFIE & VOTRAGSZUSAMMENFASSUNG 1.) DER ERSTE DEUTSCHE IPV6 GIPFEL AM HASSO PLATTNER INSTITUT IN POTSDAM - PROF. DR. CHRISTOPH MEINEL - VIVIANE REDING 2.) IPV6, ITS TIME HAS COME - VINTON CERF 3.) DIE BEDEUTUNG VON IPV6 FÜR DIE ÖFFENTLICHE VERWALTUNG IN DEUTSCHLAND - MARTIN SCHALLBRUCH 4.) TOWARDS THE FUTURE OF THE INTERNET - PROF. DR. LUTZ HEUSER 5.) IPV6 STRATEGY & DEPLOYMENT STATUS IN JAPAN - HIROSHI MIYATA 6.) IPV6 STRATEGY & DEPLOYMENT STATUS IN CHINA - PROF. WU HEQUAN 7.) IPV6 STRATEGY AND DEPLOYMENT STATUS IN KOREA - DR. EUNSOOK KIM 8.) IPV6 DEPLOYMENT EXPERIENCES IN GREEK SCHOOL NETWORK - ATHANASSIOS LIAKOPOULOS 9.) IPV6 NETWORK MOBILITY AND IST USAGE - JEAN-MARIE BONNIN 10.) IPV6 - RÜSTZEUG FÜR OPERATOR & ISP IPV6 DEPLOYMENT UND STRATEGIEN DER DEUTSCHEN TELEKOM - HENNING GROTE 11.) VIEW FROM THE IPV6 DEPLOYMENT FRONTLINE - YVES POPPE 12.) DEPLOYING IPV6 IN MOBILE ENVIRONMENTS - WOLFGANG FRITSCHE 13.) PRODUCTION READY IPV6 FROM CUSTOMER LAN TO THE INTERNET - LUTZ DONNERHACKE 14.) IPV6 - DIE BASIS FÜR NETZWERKZENTRIERTE OPERATIONSFÜHRUNG (NETOPFÜ) IN DER BUNDESWEHR HERAUSFORDERUNGEN - ANWENDUNGSFALLBETRACHTUNGEN - AKTIVITÄTEN - CARSTEN HATZIG 15.) WINDOWS VISTA & IPV6 - BERND OURGHANLIAN 16.) IPV6 & HOME NETWORKING TECHINCAL AND BUSINESS CHALLENGES - DR. TAYEB BEN MERIEM 17.) DNS AND DHCP FOR DUAL STACK NETWORKS - LAWRENCE HUGHES 18.) CAR INDUSTRY: GERMAN EXPERIENCE WITH IPV6 - AMARDEO SARMA 19.) IPV6 & AUTONOMIC NETWORKING - RANGANAI CHAPARADZA 20.) P2P & GRID USING IPV6 AND MOBILE IPV6 - DR. LATIF LADID
Design and Implementation of service-oriented architectures imposes a huge number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for design and realization of complex web-based system. Both approaches allow for dynamic application adaptation as well as integration of enterprise application. Commonly used technologies, such as J2EE and .NET, form de facto standards for the realization of complex distributed systems. Evolution of component systems has lead to web services and service-based architectures. This has been manifested in a multitude of industry standards and initiatives such as XML, WSDL UDDI, SOAP, etc. All these achievements lead to a new and promising paradigm in IT systems engineering which proposes to design complex software solutions as collaboration of contractually defined software services. Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns. The annual Ph.D. Retreat of the Research School provides each member the opportunity to present his/her current state of their research and to give an outline of a prospective Ph.D. thesis. Due to the interdisciplinary structure of the Research Scholl, this technical report covers a wide range of research topics. These include but are not limited to: Self-Adaptive Service-Oriented Systems, Operating System Support for Service-Oriented Systems, Architecture and Modeling of Service-Oriented Systems, Adaptive Process Management, Services Composition and Workflow Planning, Security Engineering of Service-Based IT Systems, Quantitative Analysis and Optimization of Service-Oriented Systems, Service-Oriented Systems in 3D Computer Graphics sowie Service-Oriented Geoinformatics.
Today, software has become an intrinsic part of complex distributed embedded real-time systems. The next generation of embedded real-time systems will interconnect the today unconnected systems via complex software parts and the service-oriented paradigm. Therefore besides timed behavior and probabilistic behaviour also structure dynamics, where the architecture can be subject to changes at run-time, e.g. when dynamic binding of service end-points is employed or complex collaborations are established dynamically, is required. However, a modeling and analysis approach that combines all these necessary aspects does not exist so far.
To fill the identified gap, we propose Probabilistic Timed Graph Transformation Systems (PTGTSs) as a high-level description language that supports all the necessary aspects of structure dynamics, timed behavior, and probabilistic behavior. We introduce the formal model of PTGTSs in this paper and present a mapping of models with finite state spaces to probabilistic timed automata (PTA) that allows to use the PRISM model checker to analyze PTGTS models with respect to PTCTL properties.
In current practice, business processes modeling is done by trained method experts. Domain experts are interviewed to elicit their process information but not involved in modeling. We created a haptic toolkit for process modeling that can be used in process elicitation sessions with domain experts. We hypothesize that this leads to more effective process elicitation. This paper brakes down "effective elicitation" to 14 operationalized hypotheses. They are assessed in a controlled experiment using questionnaires, process model feedback tests and video analysis. The experiment compares our approach to structured interviews in a repeated measurement design. We executed the experiment with 17 student clerks from a trade school. They represent potential users of the tool. Six out of fourteen hypotheses showed significant difference due to the method applied. Subjects reported more fun and more insights into process modeling with tangible media. Video analysis showed significantly more reviews and corrections applied during process elicitation. Moreover, people take more time to talk and think about their processes. We conclude that tangible media creates a different working mode for people in process elicitation with fun, new insights and instant feedback on preliminary results.
The term Linked Data refers to connected information sources comprising structured data about a wide range of topics and for a multitude of applications. In recent years, the conceptional and technical foundations of Linked Data have been formalized and refined. To this end, well-known technologies have been established, such as the Resource Description Framework (RDF) as a Linked Data model or the SPARQL Protocol and RDF Query Language (SPARQL) for retrieving this information. Whereas most research has been conducted in the area of generating and publishing Linked Data, this thesis presents novel approaches for improved management. In particular, we illustrate new methods for analyzing and processing SPARQL queries. Here, we present two algorithms suitable for identifying structural relationships between these queries. Both algorithms are applied to a large number of real-world requests to evaluate the performance of the approaches and the quality of their results. Based on this, we introduce different strategies enabling optimized access of Linked Data sources. We demonstrate how the presented approach facilitates effective utilization of SPARQL endpoints by prefetching results relevant for multiple subsequent requests. Furthermore, we contribute a set of metrics for determining technical characteristics of such knowledge bases. To this end, we devise practical heuristics and validate them through thorough analysis of real-world data sources. We discuss the findings and evaluate their impact on utilizing the endpoints. Moreover, we detail the adoption of a scalable infrastructure for improving Linked Data discovery and consumption. As we outline in an exemplary use case, this platform is eligible both for processing and provisioning the corresponding information.
1 Introduction 1.1 Project formulation 1.2 Our contribution 2 Pedagogical Aspect 4 2.1 Modern teaching 2.2 Our Contribution 2.2.1 Autonomous and exploratory learning 2.2.2 Human machine interaction 2.2.3 Short multimedia clips 3 Ontology Aspect 3.1 Ontology driven expert systems 3.2 Our contribution 3.2.1 Ontology language 3.2.2 Concept Taxonomy 3.2.3 Knowledge base annotation 3.2.4 Description Logics 4 Natural language approach 4.1 Natural language processing in computer science 4.2 Our contribution 4.2.1 Explored strategies 4.2.2 Word equivalence 4.2.3 Semantic interpretation 4.2.4 Various problems 5 Information Retrieval Aspect 5.1 Modern information retrieval 5.2 Our contribution 5.2.1 Semantic query generation 5.2.2 Semantic relatedness 6 Implementation 6.1 Prototypes 6.2 Semantic layer architecture 6.3 Development 7 Experiments 7.1 Description of the experiments 7.2 General characteristics of the three sessions, instructions and procedure 7.3 First Session 7.4 Second Session 7.5 Third Session 7.6 Discussion and conclusion 8 Conclusion and future work 8.1 Conclusion 8.2 Open questions A Description Logics B Probabilistic context-free grammars
Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values for independently extracting value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
Given a large set of records in a database and a query record, similarity search aims to find all records sufficiently similar to the query record. To solve this problem, two main aspects need to be considered: First, to perform effective search, the set of relevant records is defined using a similarity measure. Second, an efficient access method is to be found that performs only few database accesses and comparisons using the similarity measure. This thesis solves both aspects with an emphasis on the latter. In the first part of this thesis, a frequency-aware similarity measure is introduced. Compared record pairs are partitioned according to frequencies of attribute values. For each partition, a different similarity measure is created: machine learning techniques combine a set of base similarity measures into an overall similarity measure. After that, a similarity index for string attributes is proposed, the State Set Index (SSI), which is based on a trie (prefix tree) that is interpreted as a nondeterministic finite automaton. For processing range queries, the notion of query plans is introduced in this thesis to describe which similarity indexes to access and which thresholds to apply. The query result should be as complete as possible under some cost threshold. Two query planning variants are introduced: (1) Static planning selects a plan at compile time that is used for all queries. (2) Query-specific planning selects a different plan for each query. For answering top-k queries, the Bulk Sorted Access Algorithm (BSA) is introduced, which retrieves large chunks of records from the similarity indexes using fixed thresholds, and which focuses its efforts on records that are ranked high in more than one attribute and thus promising candidates. The described components form a complete similarity search system. Based on prototypical implementations, this thesis shows comparative evaluation results for all proposed approaches on different real-world data sets, one of which is a large person data set from a German credit rating agency.
In the early days of computer graphics, research was mainly driven by the goal to create realistic synthetic imagery. By contrast, non-photorealistic computer graphics, established as its own branch of computer graphics in the early 1990s, is mainly motivated by concepts and principles found in traditional art forms, such as painting, illustration, and graphic design, and it investigates concepts and techniques that abstract from reality using expressive, stylized, or illustrative rendering techniques. This thesis focuses on the artistic stylization of two-dimensional content and presents several novel automatic techniques for the creation of simplified stylistic illustrations from color images, video, and 3D renderings. Primary innovation of these novel techniques is that they utilize the smooth structure tensor as a simple and efficient way to obtain information about the local structure of an image. More specifically, this thesis contributes to knowledge in this field in the following ways. First, a comprehensive review of the structure tensor is provided. In particular, different methods for integrating the minor eigenvector field of the smoothed structure tensor are developed, and the superiority of the smoothed structure tensor over the popular edge tangent flow is demonstrated. Second, separable implementations of the popular bilateral and difference of Gaussians filters that adapt to the local structure are presented. These filters avoid artifacts while being computationally highly efficient. Taken together, both provide an effective way to create a cartoon-style effect. Third, a generalization of the Kuwahara filter is presented that avoids artifacts by adapting the shape, scale, and orientation of the filter to the local structure. This causes directional image features to be better preserved and emphasized, resulting in overall sharper edges and a more feature-abiding painterly effect. In addition to the single-scale variant, a multi-scale variant is presented, which is capable of performing a highly aggressive abstraction. Fourth, a technique that builds upon the idea of combining flow-guided smoothing with shock filtering is presented, allowing for an aggressive exaggeration and an emphasis of directional image features. All presented techniques are suitable for temporally coherent per-frame filtering of video or dynamic 3D renderings, without requiring expensive extra processing, such as optical flow. Moreover, they can be efficiently implemented to process content in real-time on a GPU.
It is predicted that Service-oriented Architectures (SOA) will have a high impact on future electronic business and markets. Services will provide an self-contained and standardised interface towards business and are considered as the future platform for business-to-business and business-toconsumer trades. Founded by the complexity of real world business scenarios a huge need for an easy, flexible and automated creation and enactment of service compositions is observed. This survey explores the relationship of service composition with workflow management—a technology/ concept already in use in many business environments. The similarities between the both and the key differences between them are elaborated. Furthermore methods for composition of services ranging from manual, semi- to full-automated composition are sketched. This survey concludes that current tools for service composition are in an immature state and that there is still much research to do before service composition can be used easily and conveniently in real world scenarios. However, since automated service composition is a key enabler for the full potential of Service-oriented Architectures, further research on this field is imperative. This survey closes with a formal sample scenario presented in appendix A to give the reader an impression on how full-automated service composition works.
HPI Future SOC Lab
(2015)
Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie.
Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien.
In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2015 vorgestellt. Ausgewählte Projekte stellten ihre Ergebnisse am 15. April 2015 und 4. November 2015 im Rahmen der Future SOC Lab Tag Veranstaltungen vor.
Business processes are fundamental to the operations of a company. Each product manufactured and every service provided is the result of a series of actions that constitute a business process. Business process management is an organizational principle that makes the processes of a company explicit and offers capabilities to implement procedures, control their execution, analyze their performance, and improve them. Therefore, business processes are documented as process models that capture these actions and their execution ordering, and make them accessible to stakeholders. As these models are an essential knowledge asset, they need to be managed effectively. In particular, the discovery and reuse of existing knowledge becomes challenging in the light of companies maintaining hundreds and thousands of process models. In practice, searching process models has been solved only superficially by means of free-text search of process names and their descriptions. Scientific contributions are limited in their scope, as they either present measures for process similarity or elaborate on query languages to search for particular aspects. However, they fall short in addressing efficient search, the presentation of search results, and the support to reuse discovered models. This thesis presents a novel search method, where a query is expressed by an exemplary business process model that describes the behavior of a possible answer. This method builds upon a formal framework that captures and compares the behavior of process models by the execution ordering of actions. The framework contributes a conceptual notion of behavioral distance that quantifies commonalities and differences of a pair of process models, and enables process model search. Based on behavioral distances, a set of measures is proposed that evaluate the quality of a particular search result to guide the user in assessing the returned matches. A projection of behavioral aspects to a process model enables highlighting relevant fragments that led to a match and facilitates its reuse. The thesis further elaborates on two search techniques that provide concrete behavioral distance functions as an instantiation of the formal framework. Querying enables search with a notion of behavioral inclusion with regard to the query. In contrast, similarity search obtains process models that are similar to a query, even if the query is not precisely matched. For both techniques, indexes are presented that enable efficient search. Methods to evaluate the quality and performance of process model search are introduced and applied to the techniques of this thesis. They show good results with regard to human assessment and scalability in a practical setting.
One of the key challenges in service-oriented systems engineering is the prediction and assurance of non-functional properties, such as the reliability and the availability of composite interorganizational services. Such systems are often characterized by a variety of inherent uncertainties, which must be addressed in the modeling and the analysis approach. The different relevant types of uncertainties can be categorized into (1) epistemic uncertainties due to incomplete knowledge and (2) randomization as explicitly used in protocols or as a result of physical processes. In this report, we study a probabilistic timed model which allows us to quantitatively reason about nonfunctional properties for a restricted class of service-oriented real-time systems using formal methods. To properly motivate the choice for the used approach, we devise a requirements catalogue for the modeling and the analysis of probabilistic real-time systems with uncertainties and provide evidence that the uncertainties of type (1) and (2) in the targeted systems have a major impact on the used models and require distinguished analysis approaches. The formal model we use in this report are Interval Probabilistic Timed Automata (IPTA). Based on the outlined requirements, we give evidence that this model provides both enough expressiveness for a realistic and modular specifiation of the targeted class of systems, and suitable formal methods for analyzing properties, such as safety and reliability properties in a quantitative manner. As technical means for the quantitative analysis, we build on probabilistic model checking, specifically on probabilistic time-bounded reachability analysis and computation of expected reachability rewards and costs. To carry out the quantitative analysis using probabilistic model checking, we developed an extension of the Prism tool for modeling and analyzing IPTA. Our extension of Prism introduces a means for modeling probabilistic uncertainty in the form of probability intervals, as required for IPTA. For analyzing IPTA, our Prism extension moreover adds support for probabilistic reachability checking and computation of expected rewards and costs. We discuss the performance of our extended version of Prism and compare the interval-based IPTA approach to models with fixed probabilities.
The modeling and evaluation calculus FMC-QE, the Fundamental Modeling Concepts for Quanti-tative Evaluation [1], extends the Fundamental Modeling Concepts (FMC) for performance modeling and prediction. In this new methodology, the hierarchical service requests are in the main focus, because they are the origin of every service provisioning process. Similar to physics, these service requests are a tuple of value and unit, which enables hierarchical service request transformations at the hierarchical borders and therefore the hierarchical modeling. Through reducing the model complexity of the models by decomposing the system in different hierarchical views, the distinction between operational and control states and the calculation of the performance values on the assumption of the steady state, FMC-QE has a scalable applica-bility on complex systems. According to FMC, the system is modeled in a 3-dimensional hierarchical representation space, where system performance parameters are described in three arbitrarily fine-grained hierarchi-cal bipartite diagrams. The hierarchical service request structures are modeled in Entity Relationship Diagrams. The static server structures, divided into logical and real servers, are de-scribed as Block Diagrams. The dynamic behavior and the control structures are specified as Petri Nets, more precisely Colored Time Augmented Petri Nets. From the structures and pa-rameters of the performance model, a hierarchical set of equations is derived. The calculation of the performance values is done on the assumption of stationary processes and is based on fundamental laws of the performance analysis: Little's Law and the Forced Traffic Flow Law. Little's Law is used within the different hierarchical levels (horizontal) and the Forced Traffic Flow Law is the key to the dependencies among the hierarchical levels (vertical). This calculation is suitable for complex models and allows a fast (re-)calculation of different performance scenarios in order to support development and configuration decisions. Within the Research Group Zorn at the Hasso Plattner Institute, the work is embedded in a broader research in the development of FMC-QE. While this work is concentrated on the theoretical background, description and definition of the methodology as well as the extension and validation of the applicability, other topics are in the development of an FMC-QE modeling and evaluation tool and the usage of FMC-QE in the design of an adaptive transport layer in order to fulfill Quality of Service and Service Level Agreements in volatile service based environments. This thesis contains a state-of-the-art, the description of FMC-QE as well as extensions of FMC-QE in representative general models and case studies. In the state-of-the-art part of the thesis in chapter 2, an overview on existing Queueing Theory and Time Augmented Petri Net models and other quantitative modeling and evaluation languages and methodologies is given. Also other hierarchical quantitative modeling frameworks will be considered. The description of FMC-QE in chapter 3 consists of a summary of the foundations of FMC-QE, basic definitions, the graphical notations, the FMC-QE Calculus and the modeling of open queueing networks as an introductory example. The extensions of FMC-QE in chapter 4 consist of the integration of the summation method in order to support the handling of closed networks and the modeling of multiclass and semaphore scenarios. Furthermore, FMC-QE is compared to other performance modeling and evaluation approaches. In the case study part in chapter 5, proof-of-concept examples, like the modeling of a service based search portal, a service based SAP NetWeaver application and the Axis2 Web service framework will be provided. Finally, conclusions are given by a summary of contributions and an outlook on future work in chapter 6. [1] Werner Zorn. FMC-QE - A New Approach in Quantitative Modeling. In Hamid R. Arabnia, editor, Procee-dings of the International Conference on Modeling, Simulation and Visualization Methods (MSV 2007) within WorldComp ’07, pages 280 – 287, Las Vegas, NV, USA, June 2007. CSREA Press. ISBN 1-60132-029-9.
Version Control Systems (VCS) allow developers to manage changes to software artifacts. Developers interact with VCSs through a variety of client programs, such as graphical front-ends or command line tools. It is desirable to use the same version control client program against different VCSs. Unfortunately, no established abstraction over VCS concepts exists. Instead, VCS client programs implement ad-hoc solutions to support interaction with multiple VCSs. This thesis presents Pur, an abstraction over version control concepts that allows building rich client programs that can interact with multiple VCSs. We provide an implementation of this abstraction and validate it by implementing a client application.
Every year, the Hasso Plattner Institute (HPI) invites guests from industry and academia to a collaborative scientific workshop on the topic Every year, the Hasso Plattner Institute (HPI) invites guests from industry and academia to a collaborative scientific workshop on the topic "Operating the Cloud". Our goal is to provide a forum for the exchange of knowledge and experience between industry and academia. Co-located with the event is the HPI's Future SOC Lab day, which offers an additional attractive and conducive environment for scientific and industry related discussions. "Operating the Cloud" aims to be a platform for productive interactions of innovative ideas, visions, and upcoming technologies in the field of cloud operation and administration.
On the occasion of this symposium we called for submissions of research papers and practitioner's reports. A compilation of the research papers realized during the fourth HPI cloud symposium "Operating the Cloud" 2016 are published in this proceedings. We thank the authors for exciting presentations and insights into their current work and research.
Moreover, we look forward to more interesting submissions for the upcoming symposium later in the year. Every year, the Hasso Plattner Institute (HPI) invites guests from industry and academia to a collaborative scientific workshop on the topic "Operating the Cloud". Our goal is to provide a forum for the exchange of knowledge and experience between industry and academia. Co-located with the event is the HPI's Future SOC Lab day, which offers an additional attractive and conducive environment for scientific and industry related discussions. "Operating the Cloud" aims to be a platform for productive interactions of innovative ideas, visions, and upcoming technologies in the field of cloud operation and administration.
E-learning is a flexible and personalized alternative to traditional education. Nonetheless, existing e-learning systems for IT security education have difficulties in delivering hands-on experience because of the lack of proximity. Laboratory environments and practical exercises are indispensable instruction tools to IT security education, but security education in con-ventional computer laboratories poses the problem of immobility as well as high creation and maintenance costs. Hence, there is a need to effectively transform security laboratories and practical exercises into e-learning forms. This report introduces the Tele-Lab IT-Security architecture that allows students not only to learn IT security principles, but also to gain hands-on security experience by exercises in an online laboratory environment. In this architecture, virtual machines are used to provide safe user work environments instead of real computers. Thus, traditional laboratory environments can be cloned onto the Internet by software, which increases accessibilities to laboratory resources and greatly reduces investment and maintenance costs. Under the Tele-Lab IT-Security framework, a set of technical solutions is also proposed to provide effective functionalities, reliability, security, and performance. The virtual machines with appropriate resource allocation, software installation, and system configurations are used to build lightweight security laboratories on a hosting computer. Reliability and availability of laboratory platforms are covered by the virtual machine management framework. This management framework provides necessary monitoring and administration services to detect and recover critical failures of virtual machines at run time. Considering the risk that virtual machines can be misused for compromising production networks, we present security management solutions to prevent misuse of laboratory resources by security isolation at the system and network levels. This work is an attempt to bridge the gap between e-learning/tele-teaching and practical IT security education. It is not to substitute conventional teaching in laboratories but to add practical features to e-learning. This report demonstrates the possibility to implement hands-on security laboratories on the Internet reliably, securely, and economically.
3D from 2D touch
(2013)
While interaction with computers used to be dominated by mice and keyboards, new types of sensors now allow users to interact through touch, speech, or using their whole body in 3D space. These new interaction modalities are often referred to as "natural user interfaces" or "NUIs." While 2D NUIs have experienced major success on billions of mobile touch devices sold, 3D NUI systems have so far been unable to deliver a mobile form factor, mainly due to their use of cameras. The fact that cameras require a certain distance from the capture volume has prevented 3D NUI systems from reaching the flat form factor mobile users expect. In this dissertation, we address this issue by sensing 3D input using flat 2D sensors. The systems we present observe the input from 3D objects as 2D imprints upon physical contact. By sampling these imprints at very high resolutions, we obtain the objects' textures. In some cases, a texture uniquely identifies a biometric feature, such as the user's fingerprint. In other cases, an imprint stems from the user's clothing, such as when walking on multitouch floors. By analyzing from which part of the 3D object the 2D imprint results, we reconstruct the object's pose in 3D space. While our main contribution is a general approach to sensing 3D input on 2D sensors upon physical contact, we also demonstrate three applications of our approach. (1) We present high-accuracy touch devices that allow users to reliably touch targets that are a third of the size of those on current touch devices. We show that different users and 3D finger poses systematically affect touch sensing, which current devices perceive as random input noise. We introduce a model for touch that compensates for this systematic effect by deriving the 3D finger pose and the user's identity from each touch imprint. We then investigate this systematic effect in detail and explore how users conceptually touch targets. Our findings indicate that users aim by aligning visual features of their fingers with the target. We present a visual model for touch input that eliminates virtually all systematic effects on touch accuracy. (2) From each touch, we identify users biometrically by analyzing their fingerprints. Our prototype Fiberio integrates fingerprint scanning and a display into the same flat surface, solving a long-standing problem in human-computer interaction: secure authentication on touchscreens. Sensing 3D input and authenticating users upon touch allows Fiberio to implement a variety of applications that traditionally require the bulky setups of current 3D NUI systems. (3) To demonstrate the versatility of 3D reconstruction on larger touch surfaces, we present a high-resolution pressure-sensitive floor that resolves the texture of objects upon touch. Using the same principles as before, our system GravitySpace analyzes all imprints and identifies users based on their shoe soles, detects furniture, and enables accurate touch input using feet. By classifying all imprints, GravitySpace detects the users' body parts that are in contact with the floor and then reconstructs their 3D body poses using inverse kinematics. GravitySpace thus enables a range of applications for future 3D NUI systems based on a flat sensor, such as smart rooms in future homes. We conclude this dissertation by projecting into the future of mobile devices. Focusing on the mobility aspect of our work, we explore how NUI devices may one day augment users directly in the form of implanted devices.
Modern 3D geovisualization systems (3DGeoVSs) are complex and evolving systems that are required to be adaptable and leverage distributed resources, including massive geodata. This article focuses on 3DGeoVSs built based on the principles of service-oriented architectures, standards and image-based representations (SSI) to address practically relevant challenges and potentials. Such systems facilitate resource sharing and agile and efficient system construction and change in an interoperable manner, while exploiting images as efficient, decoupled and interoperable representations. The software architecture of a 3DGeoVS and its underlying visualization model have strong effects on the system's quality attributes and support various system life cycle activities. This article contributes a software reference architecture (SRA) for 3DGeoVSs based on SSI that can be used to design, describe and analyze concrete software architectures with the intended primary benefit of an increase in effectiveness and efficiency in such activities. The SRA integrates existing, proven technology and novel contributions in a unique manner. As the foundation for the SRA, we propose the generalized visualization pipeline model that generalizes and overcomes expressiveness limitations of the prevalent visualization pipeline model. To facilitate exploiting image-based representations (IReps), the SRA integrates approaches for the representation, provisioning and styling of and interaction with IReps. Five applications of the SRA provide proofs of concept for the general applicability and utility of the SRA. A qualitative evaluation indicates the overall suitability of the SRA, its applications and the general approach of building 3DGeoVSs based on SSI.
Duplicate detection consists in determining different representations of real-world objects in a database. Recent research has considered the use of relationships among object representations to improve duplicate detection. In the general case where relationships form a graph, research has mainly focused on duplicate detection quality/effectiveness. Scalability has been neglected so far, even though it is crucial for large real-world duplicate detection tasks. In this paper we scale up duplicate detection in graph data (DDG) to large amounts of data and pairwise comparisons, using the support of a relational database system. To this end, we first generalize the process of DDG. We then present how to scale algorithms for DDG in space (amount of data processed with limited main memory) and in time. Finally, we explore how complex similarity computation can be performed efficiently. Experiments on data an order of magnitude larger than data considered so far in DDG clearly show that our methods scale to large amounts of data not residing in main memory.
The data quality of real-world datasets need to be constantly monitored and maintained to allow organizations and individuals to reliably use their data. Especially, data integration projects suffer from poor initial data quality and as a consequence consume more effort and money. Commercial products and research prototypes for data cleansing and integration help users to improve the quality of individual and combined datasets. They can be divided into either standalone systems or database management system (DBMS) extensions. On the one hand, standalone systems do not interact well with DBMS and require time-consuming data imports and exports. On the other hand, DBMS extensions are often limited by the underlying system and do not cover the full set of data cleansing and integration tasks.
We overcome both limitations by implementing a concise set of five data cleansing and integration operators on the parallel data analytics platform Stratosphere. We define the semantics of the operators, present their parallel implementation, and devise optimization techniques for individual operators and combinations thereof. Users specify declarative queries in our query language METEOR with our new operators to improve the data quality of individual datasets or integrate them to larger datasets. By integrating the data cleansing operators into the higher level language layer of Stratosphere, users can easily combine cleansing operators with operators from other domains, such as information extraction, to complex data flows. Through a generic description of the operators, the Stratosphere optimizer reorders operators even from different domains to find better query plans.
As a case study, we reimplemented a part of the large Open Government Data integration project GovWILD with our new operators and show that our queries run significantly faster than the original GovWILD queries, which rely on relational operators. Evaluation reveals that our operators exhibit good scalability on up to 100 cores, so that even larger inputs can be efficiently processed by scaling out to more machines. Finally, our scripts are considerably shorter than the original GovWILD scripts, which results in better maintainability of the scripts.
MDE techniques are more and more used in praxis. However, there is currently a lack of detailed reports about how different MDE techniques are integrated into the development and combined with each other. To learn more about such MDE settings, we performed a descriptive and exploratory field study with SAP, which is a worldwide operating company with around 50.000 employees and builds enterprise software applications. This technical report describes insights we got during this study. For example, we identified that MDE settings are subject to evolution. Finally, this report outlines directions for future research to provide practical advises for the application of MDE settings.
Nowadays, software systems are getting more and more complex. To tackle this challenge most diverse techniques, such as design patterns, service oriented architectures (SOA), software development processes, and model-driven engineering (MDE), are used to improve productivity, while time to market and quality of the products stay stable. Multiple of these techniques are used in parallel to profit from their benefits. While the use of sophisticated software development processes is standard, today, MDE is just adopted in practice. However, research has shown that the application of MDE is not always successful. It is not fully understood when advantages of MDE can be used and to what degree MDE can also be disadvantageous for productivity. Further, when combining different techniques that aim to affect the same factor (e.g. productivity) the question arises whether these techniques really complement each other or, in contrast, compensate their effects. Due to that, there is the concrete question how MDE and other techniques, such as software development process, are interrelated. Both aspects (advantages and disadvantages for productivity as well as the interrelation to other techniques) need to be understood to identify risks relating to the productivity impact of MDE. Before studying MDE's impact on productivity, it is necessary to investigate the range of validity that can be reached for the results. This includes two questions. First, there is the question whether MDE's impact on productivity is similar for all approaches of adopting MDE in practice. Second, there is the question whether MDE's impact on productivity for an approach of using MDE in practice remains stable over time. The answers for both questions are crucial for handling risks of MDE, but also for the design of future studies on MDE success. This thesis addresses these questions with the goal to support adoption of MDE in future. To enable a differentiated discussion about MDE, the term MDE setting'' is introduced. MDE setting refers to the applied technical setting, i.e. the employed manual and automated activities, artifacts, languages, and tools. An MDE setting's possible impact on productivity is studied with a focus on changeability and the interrelation to software development processes. This is done by introducing a taxonomy of changeability concerns that might be affected by an MDE setting. Further, three MDE traits are identified and it is studied for which manifestations of these MDE traits software development processes are impacted. To enable the assessment and evaluation of an MDE setting's impacts, the Software Manufacture Model language is introduced. This is a process modeling language that allows to reason about how relations between (modeling) artifacts (e.g. models or code files) change during application of manual or automated development activities. On that basis, risk analysis techniques are provided. These techniques allow identifying changeability risks and assessing the manifestations of the MDE traits (and with it an MDE setting's impact on software development processes). To address the range of validity, MDE settings from practice and their evolution histories were capture in context of this thesis. First, this data is used to show that MDE settings cover the whole spectrum concerning their impact on changeability or interrelation to software development processes. Neither it is seldom that MDE settings are neutral for processes nor is it seldom that MDE settings have impact on processes. Similarly, the impact on changeability differs relevantly. Second, a taxonomy of evolution of MDE settings is introduced. In that context it is discussed to what extent different types of changes on an MDE setting can influence this MDE setting's impact on changeability and the interrelation to processes. The category of structural evolution, which can change these characteristics of an MDE setting, is identified. The captured MDE settings from practice are used to show that structural evolution exists and is common. In addition, some examples of structural evolution steps are collected that actually led to a change in the characteristics of the respective MDE settings. Two implications are: First, the assessed diversity of MDE settings evaluates the need for the analysis techniques that shall be presented in this thesis. Second, evolution is one explanation for the diversity of MDE settings in practice. To summarize, this thesis studies the nature and evolution of MDE settings in practice. As a result support for the adoption of MDE settings is provided in form of techniques for the identification of risks relating to productivity impacts.
CSOM/PL is a software product line (SPL) derived from applying multi-dimensional separation of concerns (MDSOC) techniques to the domain of high-level language virtual machine (VM) implementations. For CSOM/PL, we modularised CSOM, a Smalltalk VM implemented in C, using VMADL (virtual machine architecture description language). Several features of the original CSOM were encapsulated in VMADL modules and composed in various combinations. In an evaluation of our approach, we show that applying MDSOC and SPL principles to a domain as complex as that of VMs is not only feasible but beneficial, as it improves understandability, maintainability, and configurability of VM implementations without harming performance.
An important characteristic of Service-Oriented Architectures is that clients do not depend on the service implementation's internal assignment of methods to objects. It is perhaps the most important technical characteristic that differentiates them from more common object-oriented solutions. This characteristic makes clients and services malleable, allowing them to be rearranged at run-time as circumstances change. That improvement in malleability is impaired by requiring clients to direct service requests to particular services. Ideally, the clients are totally oblivious to the service structure, as they are to aspect structure in aspect-oriented software. Removing knowledge of a method implementation's location, whether in object or service, requires re-defining the boundary line between programming language and middleware, making clearer specification of dependence on protocols, and bringing the transaction-like concept of failure scopes into language semantics as well. This paper explores consequences and advantages of a transition from object-request brokering to service-request brokering, including the potential to improve our ability to write more parallel software.
Component based software development (CBSD) and aspectoriented software development (AOSD) are two complementary approaches. However, existing proposals for integrating aspects into component models are direct transposition of object-oriented AOSD techniques to components. In this article, we propose a new approach based on views. Our proposal introduces crosscutting components quite naturally and can be integrated into different component models.
1. Design and Composition of 3D Geoinformation Services Benjamin Hagedorn 2. Operating System Abstractions for Service-Based Systems Michael Schöbel 3. A Task-oriented Approach to User-centered Design of Service-Based Enterprise Applications Matthias Uflacker 4. A Framework for Adaptive Transport in Service- Oriented Systems based on Performance Prediction Flavius Copaciu 5. Asynchronicity and Loose Coupling in Service-Oriented Architectures Nikola Milanovic
Imaginary Interfaces
(2013)
The size of a mobile device is primarily determined by the size of the touchscreen. As such, researchers have found that the way to achieve ultimate mobility is to abandon the screen altogether. These wearable devices are operated using hand gestures, voice commands or a small number of physical buttons. By abandoning the screen these devices also abandon the currently dominant spatial interaction style (such as tapping on buttons), because, seemingly, there is nothing to tap on. Unfortunately this design prevents users from transferring their learned interaction knowledge gained from traditional touchscreen-based devices. In this dissertation, I present Imaginary Interfaces, which return spatial interaction to screenless mobile devices. With these interfaces, users point and draw in the empty space in front of them or on the palm of their hands. While they cannot see the results of their interaction, they obtain some visual and tactile feedback by watching and feeling their hands interact. After introducing the concept of Imaginary Interfaces, I present two hardware prototypes that showcase two different forms of interaction with an imaginary interface, each with its own advantages: mid-air imaginary interfaces can be large and expressive, while palm-based imaginary interfaces offer an abundance of tactile features that encourage learning. Given that imaginary interfaces offer no visual output, one of the key challenges is to enable users to discover the interface's layout. This dissertation offers three main solutions: offline learning with coordinates, browsing with audio feedback and learning by transfer. The latter I demonstrate with the Imaginary Phone, a palm-based imaginary interface that mimics the layout of a physical mobile phone that users are already familiar with. Although these designs enable interaction with Imaginary Interfaces, they tell us little about why this interaction is possible. In the final part of this dissertation, I present an exploration into which human perceptual abilities are used when interacting with a palm-based imaginary interface and how much each accounts for performance with the interface. These findings deepen our understanding of Imaginary Interfaces and suggest that palm-based Imaginary Interfaces can enable stand-alone eyes-free use for many applications, including interfaces for visually impaired users.
User-centered design processes are the first choice when new interactive systems or services are developed to address real customer needs and provide a good user experience. Common tools for collecting user research data, conducting brainstormings, or sketching ideas are whiteboards and sticky notes. They are ubiquitously available, and no technical or domain knowledge is necessary to use them. However, traditional pen and paper tools fall short when saving the content and sharing it with others unable to be in the same location. They are also missing further digital advantages such as searching or sorting content. Although research on digital whiteboard and sticky note applications has been conducted for over 20 years, these tools are not widely adopted in company contexts. While many research prototypes exist, they have not been used for an extended period of time in a real-world context. The goal of this thesis is to investigate what the enablers and obstacles for the adoption of digital whiteboard systems are. As an instrument for different studies, we developed the Tele-Board software system for collaborative creative work. Based on interviews, observations, and findings from former research, we tried to transfer the analog way of working to the digital world. Being a software system, Tele-Board can be used with a variety of hardware and does not depend on special devices. This feature became one of the main factors for adoption on a larger scale. In this thesis, I will present three studies on the use of Tele-Board with different user groups and foci. I will use a combination of research methods (laboratory case studies and data from field research) with the overall goal of finding out when a digital whiteboard system is used and in which cases not. Not surprisingly, the system is used and accepted if a user sees a main benefit that neither analog tools nor other applications can offer. However, I found that these perceived benefits are very different for each user and usage context. If a tool provides possibilities to use in different ways and with different equipment, the chances of its adoption by a larger group increase. Tele-Board has now been in use for over 1.5 years in a global IT company in at least five countries with a constantly growing user base. Its use, advantages, and disadvantages will be described based on 42 interviews and usage statistics from server logs. Through these insights and findings from laboratory case studies, I will present a detailed analysis of digital whiteboard use in different contexts with design implications for future systems.
The Apache Modeling Project
(2004)
This document presents an introduction to the Apache HTTP Server, covering both an overview and implementation details. It presents results of the Apache Modelling Project done by research assistants and students of the Hasso–Plattner–Institute in 2001, 2002 and 2003. The Apache HTTP Server was used to introduce students to the application of the modeling technique FMC, a method that supports transporting knowledge about complex systems in the domain of information processing (software and hardware as well). After an introduction to HTTP servers in general, we will focus on protocols and web technology. Then we will discuss Apache, its operational environment and its extension capabilities— the module API. Finally we will guide the reader through parts of the Apache source code and explain the most important pieces.
In today’s life, embedded systems are ubiquitous. But they differ from traditional desktop systems in many aspects – these include predictable timing behavior (real-time), the management of scarce resources (memory, network), reliable communication protocols, energy management, special purpose user-interfaces (headless operation), system configuration, programming languages (to support software/hardware co-design), and modeling techniques. Within this technical report, authors present results from the lecture “Operating Systems for Embedded Computing” that has been offered by the “Operating Systems and Middleware” group at HPI in Winter term 2013/14. Focus of the lecture and accompanying projects was on principles of real-time computing. Students had the chance to gather practical experience with a number of different OSes and applications and present experiences with near-hardware programming. Projects address the entire spectrum, from bare-metal programming to harnessing a real-time OS to exercising the full software/hardware co-design cycle. Three outstanding projects are at the heart of this technical report. Project 1 focuses on the development of a bare-metal operating system for LEGO Mindstorms EV3. While still a toy, it comes with a powerful ARM processor, 64 MB of main memory, standard interfaces, such as Bluetooth and network protocol stacks. EV3 runs a version of 1 1 Introduction Linux. Sources are available from Lego’s web site. However, many devices and their driver software are proprietary and not well documented. Developing a new, bare-metal OS for the EV3 requires an understanding of the EV3 boot process. Since no standard input/output devices are available, initial debugging steps are tedious. After managing these initial steps, the project was able to adapt device drivers for a few Lego devices to an extent that a demonstrator (the Segway application) could be successfully run on the new OS. Project 2 looks at the EV3 from a different angle. The EV3 is running a pretty decent version of Linux- in principle, the RT_PREEMPT patch can turn any Linux system into a real-time OS by modifying the behavior of a number of synchronization constructs at the heart of the OS. Priority inversion is a problem that is solved by protocols such as priority inheritance or priority ceiling. Real-time OSes implement at least one of the protocols. The central idea of the project was the comparison of non-real-time and real-time variants of Linux on the EV3 hardware. A task set that showed effects of priority inversion on standard EV3 Linux would operate flawlessly on the Linux version with the RT_PREEMPT-patch applied. If only patching Lego’s version of Linux was that easy... Project 3 takes the notion of real-time computing more seriously. The application scenario was centered around our Carrera Digital 132 racetrack. Obtaining position information from the track, controlling individual cars, detecting and modifying the Carrera Digital protocol required design and implementation of custom controller hardware. What to implement in hardware, firmware, and what to implement in application software – this was the central question addressed by the project.
Virtual 3D city and landscape models are the main subject investigated in this thesis. They digitally represent urban space and have many applications in different domains, e.g., simulation, cadastral management, and city planning. Visualization is an elementary component of these applications. Photo-realistic visualization with an increasingly high degree of detail leads to fundamental problems for comprehensible visualization. A large number of highly detailed and textured objects within a virtual 3D city model may create visual noise and overload the users with information. Objects are subject to perspective foreshortening and may be occluded or not displayed in a meaningful way, as they are too small. In this thesis we present abstraction techniques that automatically process virtual 3D city and landscape models to derive abstracted representations. These have a reduced degree of detail, while essential characteristics are preserved. After introducing definitions for model, scale, and multi-scale representations, we discuss the fundamentals of map generalization as well as techniques for 3D generalization. The first presented technique is a cell-based generalization of virtual 3D city models. It creates abstract representations that have a highly reduced level of detail while maintaining essential structures, e.g., the infrastructure network, landmark buildings, and free spaces. The technique automatically partitions the input virtual 3D city model into cells based on the infrastructure network. The single building models contained in each cell are aggregated to abstracted cell blocks. Using weighted infrastructure elements, cell blocks can be computed on different hierarchical levels, storing the hierarchy relation between the cell blocks. Furthermore, we identify initial landmark buildings within a cell by comparing the properties of individual buildings with the aggregated properties of the cell. For each block, the identified landmark building models are subtracted using Boolean operations and integrated in a photo-realistic way. Finally, for the interactive 3D visualization we discuss the creation of the virtual 3D geometry and their appearance styling through colors, labeling, and transparency. We demonstrate the technique with example data sets. Additionally, we discuss applications of generalization lenses and transitions between abstract representations. The second technique is a real-time-rendering technique for geometric enhancement of landmark objects within a virtual 3D city model. Depending on the virtual camera distance, landmark objects are scaled to ensure their visibility within a specific distance interval while deforming their environment. First, in a preprocessing step a landmark hierarchy is computed, this is then used to derive distance intervals for the interactive rendering. At runtime, using the virtual camera distance, a scaling factor is computed and applied to each landmark. The scaling factor is interpolated smoothly at the interval boundaries using cubic Bézier splines. Non-landmark geometry that is near landmark objects is deformed with respect to a limited number of landmarks. We demonstrate the technique by applying it to a highly detailed virtual 3D city model and a generalized 3D city model. In addition we discuss an adaptation of the technique for non-linear projections and mobile devices. The third technique is a real-time rendering technique to create abstract 3D isocontour visualization of virtual 3D terrain models. The virtual 3D terrain model is visualized as a layered or stepped relief. The technique works without preprocessing and, as it is implemented using programmable graphics hardware, can be integrated with minimal changes into common terrain rendering techniques. Consequently, the computation is done in the rendering pipeline for each vertex, primitive, i.e., triangle, and fragment. For each vertex, the height is quantized to the nearest isovalue. For each triangle, the vertex configuration with respect to their isovalues is determined first. Using the configuration, the triangle is then subdivided. The subdivision forms a partial step geometry aligned with the triangle. For each fragment, the surface appearance is determined, e.g., depending on the surface texture, shading, and height-color-mapping. Flexible usage of the technique is demonstrated with applications from focus+context visualization, out-of-core terrain rendering, and information visualization. This thesis presents components for the creation of abstract representations of virtual 3D city and landscape models. Re-using visual language from cartography, the techniques enable users to build on their experience with maps when interpreting these representations. Simultaneously, characteristics of 3D geovirtual environments are taken into account by addressing and discussing, e.g., continuous scale, interaction, and perspective.
During the overall development of complex engineering systems different modeling notations are employed. For example, in the domain of automotive systems system engineering models are employed quite early to capture the requirements and basic structuring of the entire system, while software engineering models are used later on to describe the concrete software architecture. Each model helps in addressing the specific design issue with appropriate notations and at a suitable level of abstraction. However, when we step forward from system design to the software design, the engineers have to ensure that all decisions captured in the system design model are correctly transferred to the software engineering model. Even worse, when changes occur later on in either model, today the consistency has to be reestablished in a cumbersome manual step. In this report, we present in an extended version of [Holger Giese, Stefan Neumann, and Stephan Hildebrandt. Model Synchronization at Work: Keeping SysML and AUTOSAR Models Consistent. In Gregor Engels, Claus Lewerentz, Wilhelm Schäfer, Andy Schürr, and B. Westfechtel, editors, Graph Transformations and Model Driven Enginering - Essays Dedicated to Manfred Nagl on the Occasion of his 65th Birthday, volume 5765 of Lecture Notes in Computer Science, pages 555–579. Springer Berlin / Heidelberg, 2010.] how model synchronization and consistency rules can be applied to automate this task and ensure that the different models are kept consistent. We also introduce a general approach for model synchronization. Besides synchronization, the approach consists of tool adapters as well as consistency rules covering the overlap between the synchronized parts of a model and the rest. We present the model synchronization algorithm based on triple graph grammars in detail and further exemplify the general approach by means of a model synchronization solution between system engineering models in SysML and software engineering models in AUTOSAR which has been developed for an industrial partner. In the appendix as extension to [19] the meta-models and all TGG rules for the SysML to AUTOSAR model synchronization are documented.
The correctness of model transformations is a crucial element for the model-driven engineering of high quality software. A prerequisite to verify model transformations at the level of the model transformation specification is that an unambiguous formal semantics exists and that the employed implementation of the model transformation language adheres to this semantics. However, for existing relational model transformation approaches it is usually not really clear under which constraints particular implementations are really conform to the formal semantics. In this paper, we will bridge this gap for the formal semantics of triple graph grammars (TGG) and an existing efficient implementation. Whereas the formal semantics assumes backtracking and ignores non-determinism, practical implementations do not support backtracking, require rule sets that ensure determinism, and include further optimizations. Therefore, we capture how the considered TGG implementation realizes the transformation by means of operational rules, define required criteria and show conformance to the formal semantics if these criteria are fulfilled. We further outline how static analysis can be employed to guarantee these criteria.
Model-driven software development requires techniques to consistently propagate modifications between different related models to realize its full potential. For large-scale models, efficiency is essential in this respect. In this paper, we present an improved model synchronization algorithm based on triple graph grammars that is highly efficient and, therefore, can also synchronize large-scale models sufficiently fast. We can show, that the overall algorithm has optimal complexity if it is dominating the rule matching and further present extensive measurements that show the efficiency of the presented model transformation and synchronization technique.
The service-oriented architecture supports the dynamic assembly and runtime reconfiguration of complex open IT landscapes by means of runtime binding of service contracts, launching of new components and termination of outdated ones. Furthermore, the evolution of these IT landscapes is not restricted to exchanging components with other ones using the same service contracts, as new services contracts can be added as well. However, current approaches for modeling and verification of service-oriented architectures do not support these important capabilities to their full extend.In this report we present an extension of the current OMG proposal for service modeling with UML - SoaML - which overcomes these limitations. It permits modeling services and their service contracts at different levels of abstraction, provides a formal semantics for all modeling concepts, and enables verifying critical properties. Our compositional and incremental verification approach allows for complex properties including communication parameters and time and covers besides the dynamic binding of service contracts and the replacement of components also the evolution of the systems by means of new service contracts. The modeling as well as verification capabilities of the presented approach are demonstrated by means of a supply chain example and the verification results of a first prototype are shown.
Pattern matching is a well-established concept in the functional programming community. It provides the means for concisely identifying and destructuring values of interest. This enables a clean separation of data structures and respective functionality, as well as dispatching functionality based on more than a single value. Unfortunately, expressive pattern matching facilities are seldomly incorporated in present object-oriented programming languages. We present a seamless integration of pattern matching facilities in an object-oriented and dynamically typed programming language: Newspeak. We describe language extensions to improve the practicability and integrate our additions with the existing programming environment for Newspeak. This report is based on the first author’s master’s thesis.
Babelsberg/RML
(2015)
New programming language designs are often evaluated on concrete implementations. However, in order to draw conclusions about the language design from the evaluation of concrete programming languages, these implementations need to be verified against the formalism of the design. To that end, we also have to ensure that the design actually meets its stated goals. A useful tool for the latter has been to create an executable semantics from a formalism that can execute a test suite of examples. However, this mechanism so far did not allow to verify an implementation against the design.
Babelsberg is a new design for a family of object-constraint languages. Recently, we have developed a formal semantics to clarify some issues in the design of those languages. Supplementing this work, we report here on how this formalism is turned into an executable operational semantics using the RML system. Furthermore, we show how we extended the executable semantics to create a framework that can generate test suites for the concrete Babelsberg implementations that provide traceability from the design to the language. Finally, we discuss how these test suites helped us find and correct mistakes in the Babelsberg implementation for JavaScript.
Constraints allow developers to specify desired properties of systems in a number of domains, and have those properties be maintained automatically. This results in compact, declarative code, avoiding scattered code to check and imperatively re-satisfy invariants. Despite these advantages, constraint programming is not yet widespread, with standard imperative programming still the norm. There is a long history of research on integrating constraint programming with the imperative paradigm. However, this integration typically does not unify the constructs for encapsulation and abstraction from both paradigms. This impedes re-use of modules, as client code written in one paradigm can only use modules written to support that paradigm. Modules require redundant definitions if they are to be used in both paradigms. We present a language – Babelsberg – that unifies the constructs for en- capsulation and abstraction by using only object-oriented method definitions for both declarative and imperative code. Our prototype – Babelsberg/R – is an extension to Ruby, and continues to support Ruby’s object-oriented se- mantics. It allows programmers to add constraints to existing Ruby programs in incremental steps by placing them on the results of normal object-oriented message sends. It is implemented by modifying a state-of-the-art Ruby virtual machine. The performance of standard object-oriented code without con- straints is only modestly impacted, with typically less than 10% overhead compared with the unmodified virtual machine. Furthermore, our architec- ture for adding multiple constraint solvers allows Babelsberg to deal with constraints in a variety of domains. We argue that our approach provides a useful step toward making con- straint solving a generic tool for object-oriented programmers. We also provide example applications, written in our Ruby-based implementation, which use constraints in a variety of application domains, including interactive graphics, circuit simulations, data streaming with both hard and soft constraints on performance, and configuration file Management.
Enforcing security policies to distributed systems is difficult, in particular, when a system contains untrusted components. We designed AspectKE*, a distributed AOP language based on a tuple space, to tackle this issue. In AspectKE*, aspects can enforce access control policies that depend on future behavior of running processes. One of the key language features is the predicates and functions that extract results of static program analysis, which are useful for defining security aspects that have to know about future behavior of a program. AspectKE* also provides a novel variable binding mechanism for pointcuts, so that pointcuts can uniformly specify join points based on both static and dynamic information about the program. Our implementation strategy performs fundamental static analysis at load-time, so as to retain runtime overheads minimal. We implemented a compiler for AspectKE*, and demonstrate usefulness of AspectKE* through a security aspect for a distributed chat system.
Business processes are instrumental to manage work in organisations. To study the interdependencies between business processes, Business Process Architectures have been introduced. These express trigger and message ow relations between business processes. When we investigate real world Business Process Architectures, we find complex interdependencies, involving multiple process instances. These aspects have not been studied in detail so far, especially concerning correctness properties. In this paper, we propose a modular transformation of BPAs to open nets for the analysis of behavior involving multiple business processes with multiplicities. For this purpose we introduce intermediary nets to portray semantics of multiplicity specifications. We evaluate our approach on a use case from the public sector.
Business Process Management has become an integral part of modern organizations in the private and public sector for improving their operations. In the course of Business Process Management efforts, companies and organizations assemble large process model repositories with many hundreds and thousands of business process models bearing a large amount of information. With the advent of large business process model collections, new challenges arise as structuring and managing a large amount of process models, their maintenance, and their quality assurance.
This is covered by business process architectures that have been introduced for organizing and structuring business process model collections. A variety of business process architecture approaches have been proposed that align business processes along aspects of interest, e. g., goals, functions, or objects. They provide a high level categorization of single processes ignoring their interdependencies, thus hiding valuable information. The production of goods or the delivery of services are often realized by a complex system of interdependent business processes. Hence, taking a holistic view at business processes interdependencies becomes a major necessity to organize, analyze, and assess the impact of their re-/design. Visualizing business processes interdependencies reveals hidden and implicit information from a process model collection.
In this thesis, we present a novel Business Process Architecture approach for representing and analyzing business process interdependencies on an abstract level. We propose a formal definition of our Business Process Architecture approach, design correctness criteria, and develop analysis techniques for assessing their quality. We describe a methodology for applying our Business Process Architecture approach top-down and bottom-up. This includes techniques for Business Process Architecture extraction from, and decomposition to process models while considering consistency issues between business process architecture and process model level. Using our extraction algorithm, we present a novel technique to identify and visualize data interdependencies in Business Process Data Architectures. Our Business Process Architecture approach provides business process experts,managers, and other users of a process model collection with an overview that allows reasoning about a large set of process models,
understanding, and analyzing their interdependencies in a facilitated way. In this regard we evaluated our Business Process Architecture approach in an experiment and provide implementations of selected techniques.
For interactive construction of CSG models understanding the layout of a model is essential for its efficient manipulation. To understand position and orientation of aggregated components of a CSG model, we need to realize its visible and occluded parts as a whole. Hence, transparency and enhanced outlines are key techniques to assist comprehension. We present a novel real-time rendering technique for visualizing design and spatial assembly of CSG models. As enabling technology we combine an image-space CSG rendering algorithm with blueprint rendering. Blueprint rendering applies depth peeling for extracting layers of ordered depth from polygonal models and then composes them in sorted order facilitating a clear insight of the models. We develop a solution for implementing depth peeling for CSG models considering their depth complexity. Capturing surface colors of each layer and later combining the results allows for generating order-independent transparency as one major rendering technique for CSG models. We further define visually important edges for CSG models and integrate an image-space edgeenhancement technique for detecting them in each layer. In this way, we extract visually important edges that are directly and not directly visible to outline a model’s layout. Combining edges with transparency rendering, finally, generates edge-enhanced depictions of image-based CSG models and allows us to realize their complex, spatial assembly.
The correctness of model transformations is a crucial element for model-driven engineering of high quality software. In particular, behavior preservation is the most important correctness property avoiding the introduction of semantic errors during the model-driven engineering process. Behavior preservation verification techniques either show that specific properties are preserved, or more generally and complex, they show some kind of behavioral equivalence or refinement between source and target model of the transformation. Both kinds of behavior preservation verification goals have been presented with automatic tool support for the instance level, i.e. for a given source and target model specified by the model transformation. However, up until now there is no automatic verification approach available at the transformation level, i.e. for all source and target models specified by the model transformation.
In this report, we extend our results presented in [27] and outline a new sophisticated approach for the automatic verification of behavior preservation captured by bisimulation resp. simulation for model transformations specified by triple graph grammars and semantic definitions given by graph transformation rules. In particular, we show that the behavior preservation problem can be reduced to invariant checking for graph transformation and that the resulting checking problem can be addressed by our own invariant checker even for a complex example where a sequence chart is transformed into communicating automata. We further discuss today's limitations of invariant checking for graph transformation and motivate further lines of future work in this direction.
While offering significant expressive power, graph transformation systems often come with rather limited capabilities for automated analysis, particularly if systems with many possible initial graphs and large or infinite state spaces are concerned. One approach that tries to overcome these limitations is inductive invariant checking. However, the verification of inductive invariants often requires extensive knowledge about the system in question and faces the approach-inherent challenges of locality and lack of context.
To address that, this report discusses k-inductive invariant checking for graph transformation systems as a generalization of inductive invariants. The additional context acquired by taking multiple (k) steps into account is the key difference to inductive invariant checking and is often enough to establish the desired invariants without requiring the iterative development of additional properties.
To analyze possibly infinite systems in a finite fashion, we introduce a symbolic encoding for transformation traces using a restricted form of nested application conditions. As its central contribution, this report then presents a formal approach and algorithm to verify graph constraints as k-inductive invariants. We prove the approach's correctness and demonstrate its applicability by means of several examples evaluated with a prototypical implementation of our algorithm.
Graph transformation systems are a powerful formal model to capture model transformations or systems with infinite state space, among others. However, this expressive power comes at the cost of rather limited automated analysis capabilities. The general case of unbounded many initial graphs or infinite state spaces is only supported by approaches with rather limited scalability or expressiveness. In this report we improve an existing approach for the automated verification of inductive invariants for graph transformation systems. By employing partial negative application conditions to represent and check many alternative conditions in a more compact manner, we can check examples with rules and constraints of substantially higher complexity. We also substantially extend the expressive power by supporting more complex negative application conditions and provide higher accuracy by employing advanced implication checks. The improvements are evaluated and compared with another applicable tool by considering three case studies.
Duplicate detection is the task of identifying all groups of records within a data set that represent the same real-world entity, respectively. This task is difficult, because (i) representations might differ slightly, so some similarity measure must be defined to compare pairs of records and (ii) data sets might have a high volume making a pair-wise comparison of all records infeasible. To tackle the second problem, many algorithms have been suggested that partition the data set and compare all record pairs only within each partition. One well-known such approach is the Sorted Neighborhood Method (SNM), which sorts the data according to some key and then advances a window over the data comparing only records that appear within the same window. We propose several variations of SNM that have in common a varying window size and advancement. The general intuition of such adaptive windows is that there might be regions of high similarity suggesting a larger window size and regions of lower similarity suggesting a smaller window size. We propose and thoroughly evaluate several adaption strategies, some of which are provably better than the original SNM in terms of efficiency (same results with fewer comparisons).
Cloud computing is a model for enabling on-demand access to a shared pool of computing resources. With virtually limitless on-demand resources, a cloud environment enables the hosted Internet application to quickly cope when there is an increase in the workload. However, the overhead of provisioning resources exposes the Internet application to periods of under-provisioning and performance degradation. Moreover, the performance interference, due to the consolidation in the cloud environment, complicates the performance management of the Internet applications. In this dissertation, we propose two approaches to mitigate the impact of the resources provisioning overhead. The first approach employs control theory to scale resources vertically and cope fast with workload. This approach assumes that the provider has knowledge and control over the platform running in the virtual machines (VMs), which limits it to Platform as a Service (PaaS) and Software as a Service (SaaS) providers. The second approach is a customer-side one that deals with the horizontal scalability in an Infrastructure as a Service (IaaS) model. It addresses the trade-off problem between cost and performance with a multi-goal optimization solution. This approach finds the scale thresholds that achieve the highest performance with the lowest increase in the cost. Moreover, the second approach employs a proposed time series forecasting algorithm to scale the application proactively and avoid under-utilization periods. Furthermore, to mitigate the interference impact on the Internet application performance, we developed a system which finds and eliminates the VMs suffering from performance interference. The developed system is a light-weight solution which does not imply provider involvement. To evaluate our approaches and the designed algorithms at large-scale level, we developed a simulator called (ScaleSim). In the simulator, we implemented scalability components acting as the scalability components of Amazon EC2. The current scalability implementation in Amazon EC2 is used as a reference point for evaluating the improvement in the scalable application performance. ScaleSim is fed with realistic models of the RUBiS benchmark extracted from the real environment. The workload is generated from the access logs of the 1998 world cup website. The results show that optimizing the scalability thresholds and adopting proactive scalability can mitigate 88% of the resources provisioning overhead impact with only a 9% increase in the cost.
Developing rich Web applications can be a complex job - especially when it comes to mobile device support. Web-based environments such as Lively Webwerkstatt can help developers implement such applications by making the development process more direct and interactive. Further the process of developing software is collaborative which creates the need that the development environment offers collaboration facilities. This report describes extensions of the webbased development environment Lively Webwerkstatt such that it can be used in a mobile environment. The extensions are collaboration mechanisms, user interface adaptations but as well event processing and performance measuring on mobile devices.
This thesis presents novel ideas and research findings for the Web of Data – a global data space spanning many so-called Linked Open Data sources. Linked Open Data adheres to a set of simple principles to allow easy access and reuse for data published on the Web. Linked Open Data is by now an established concept and many (mostly academic) publishers adopted the principles building a powerful web of structured knowledge available to everybody. However, so far, Linked Open Data does not yet play a significant role among common web technologies that currently facilitate a high-standard Web experience. In this work, we thoroughly discuss the state-of-the-art for Linked Open Data and highlight several shortcomings – some of them we tackle in the main part of this work. First, we propose a novel type of data source meta-information, namely the topics of a dataset. This information could be published with dataset descriptions and support a variety of use cases, such as data source exploration and selection. For the topic retrieval, we present an approach coined Annotated Pattern Percolation (APP), which we evaluate with respect to topics extracted from Wikipedia portals. Second, we contribute to entity linking research by presenting an optimization model for joint entity linking, showing its hardness, and proposing three heuristics implemented in the LINked Data Alignment (LINDA) system. Our first solution can exploit multi-core machines, whereas the second and third approach are designed to run in a distributed shared-nothing environment. We discuss and evaluate the properties of our approaches leading to recommendations which algorithm to use in a specific scenario. The distributed algorithms are among the first of their kind, i.e., approaches for joint entity linking in a distributed fashion. Also, we illustrate that we can tackle the entity linking problem on the very large scale with data comprising more than 100 millions of entity representations from very many sources. Finally, we approach a sub-problem of entity linking, namely the alignment of concepts. We again target a method that looks at the data in its entirety and does not neglect existing relations. Also, this concept alignment method shall execute very fast to serve as a preprocessing for further computations. Our approach, called Holistic Concept Matching (HCM), achieves the required speed through grouping the input by comparing so-called knowledge representations. Within the groups, we perform complex similarity computations, relation conclusions, and detect semantic contradictions. The quality of our result is again evaluated on a large and heterogeneous dataset from the real Web. In summary, this work contributes a set of techniques for enhancing the current state of the Web of Data. All approaches have been tested on large and heterogeneous real-world input.
Because software development is increasingly expensive and timeconsuming, software reuse gains importance. Aspect-oriented software development modularizes crosscutting concerns which enables their systematic reuse. Literature provides a number of AOP patterns and best practices for developing reusable aspects based on compelling examples for concerns like tracing, transactions and persistence. However, such best practices are lacking for systematically reusing invasive aspects. In this paper, we present the ‘callback mismatch problem’. This problem arises in the context of abstraction mismatch, in which the aspect is required to issue a callback to the base application. As a consequence, the composition of invasive aspects is cumbersome to implement, difficult to maintain and impossible to reuse. We motivate this problem in a real-world example, show that it persists in the current state-of-the-art, and outline the need for advanced aspectual composition mechanisms to deal with this.