Refine
Year of publication
- 2014 (31) (remove)
Document Type
- Article (13)
- Doctoral Thesis (9)
- Monograph/Edited Volume (7)
- Postprint (1)
- Preprint (1)
Keywords
- cloud computing (4)
- Cloud Computing (3)
- 3D geovisualization (2)
- Classification (2)
- Forschungsprojekte (2)
- Future SOC Lab (2)
- Geschäftsprozessmanagement (2)
- In-Memory Technologie (2)
- Multicore Architekturen (2)
- RDF (2)
- Virtualization (2)
- business process management (2)
- image-based representation (2)
- multicore architectures (2)
- research projects (2)
- service-oriented architecture (2)
- software reference architecture (2)
- spatial data infrastructure (2)
- standardization (2)
- 3D geovirtual environments (1)
- 3D point clouds (1)
- Anfragepaare (1)
- Assoziationsregeln (1)
- Basic Storage Anbieter (1)
- Bayesian networks (1)
- Bayessche Netze (1)
- Betriebssysteme (1)
- Bitcoin (1)
- Carrera Digital D132 (1)
- Cloud (1)
- Cloud Datenzentren (1)
- Conformance Überprüfung (1)
- Datenflusskorrektheit (1)
- Datenintegration (1)
- Datenreinigung (1)
- Datenvertraulichkeit (1)
- Design (1)
- Differential Privacy (1)
- Distributed 3D geovisualization (1)
- E-Learning (1)
- Echtzeit (1)
- Energieeffizienz (1)
- Energy-aware (1)
- Entwicklungswerkzeuge (1)
- Erfahrungsbericht (1)
- Event processing (1)
- Fehlende Daten (1)
- Forschungskolleg (1)
- Graph rewriting (1)
- Hasso Plattner Institute (1)
- Hasso-Plattner-Institut (1)
- Hauptspeicher Technologie (1)
- Homomorphe Verschlüsselung (1)
- In-Memory technology (1)
- Informationsvorhaltung (1)
- Introspection (1)
- Klausurtagung (1)
- LEGO Mindstorms EV3 (1)
- LOD (1)
- Languages Model-driven engineering (1)
- Laufzeitverhalten (1)
- Live forensics (1)
- MOOCs (1)
- Mobile device (1)
- Model transformation (1)
- Monitoring language (1)
- Multimedia retrieval (1)
- Multiscale modeling (1)
- Mustererkennung (1)
- Objektlebenszyklus-Synchronisation (1)
- Online-Lernen (1)
- Onlinekurs (1)
- Out-of-core (1)
- Petri net Mapping (1)
- Petri net mapping (1)
- Ph.D. Retreat (1)
- Prediction (1)
- Privacy (1)
- Probabilistische Modelle (1)
- Process Mining (1)
- RT_PREEMT patch (1)
- RT_PREEMT-Patch (1)
- Research School (1)
- Resource management (1)
- Ressourcenmanagement (1)
- Robust optimization (1)
- SPARQL (1)
- Security (1)
- Service-oriented Systems Engineering (1)
- Sicherheit (1)
- Softwareanalyse (1)
- Softwareentwicklung (1)
- Softwareentwicklungsprozesse (1)
- Softwaretechnik (1)
- Softwarevisualisierung (1)
- Softwarewartung (1)
- Synonyme (1)
- System architecture (1)
- Tele-Lab (1)
- Tele-Teaching (1)
- Testen (1)
- Threshold Cryptography (1)
- Tool survey (1)
- Transformation tool contest (1)
- Vernetzte Daten (1)
- Versionierung (1)
- Video OCR (1)
- Video indexing (1)
- View navigation (1)
- Virtual 3D city model (1)
- Virtual camera control (1)
- Visualisierung (1)
- Visualization (1)
- Vorhersage (1)
- association rule mining (1)
- basic cloud storage services (1)
- bitcoin (1)
- cartographic design (1)
- changeability (1)
- cleansing (1)
- cloud (1)
- cloud datacenter (1)
- computer science (1)
- confidentiality (1)
- conformance checking (1)
- data (1)
- data flow correctness (1)
- database technology (1)
- development tools (1)
- differential privacy (1)
- dynamic consolidation (1)
- dynamische Umsortierung (1)
- e-learning (1)
- eingebettete Systeme (1)
- embedded systems (1)
- empirical studies (1)
- empirische Studien (1)
- energy efficiency (1)
- experience report (1)
- feedback loops (1)
- focus plus context visualization (1)
- fully concurrent bisimulation (1)
- future SOC lab (1)
- ganzheitlich (1)
- holistic (1)
- homomorphic encryption (1)
- in-memory technology (1)
- knowledge discovery (1)
- layered architecture (1)
- linked data (1)
- main memory computing (1)
- map reduce (1)
- maximal structuring (1)
- missing data (1)
- model interpreter (1)
- model transformation (1)
- model-driven engineering (1)
- modelgetriebene Entwicklung (1)
- modeling language (1)
- models at runtime (1)
- multi-perspective visualization (1)
- object life cycle synchronization (1)
- online course (1)
- online-learning (1)
- openHPI (1)
- operating systems (1)
- panorama (1)
- parallel (1)
- prediction (1)
- prefetching (1)
- privacy (1)
- probabilistic models (1)
- process mining (1)
- process modeling (1)
- public cloud storage services (1)
- query matching (1)
- query optimisation (1)
- query rewriting (1)
- real-time (1)
- resource management (1)
- runtime behavior (1)
- security (1)
- self-adaptive software (1)
- software analysis (1)
- software development (1)
- software development processes (1)
- software engineering (1)
- software evolution (1)
- software maintenance (1)
- software visualization (1)
- stochastic Petri nets (1)
- stochastische Petri Netze (1)
- structured process model (1)
- synonym discovery (1)
- tele-TASK (1)
- tele-lab (1)
- tele-teaching (1)
- testing (1)
- threshold cryptography (1)
- versioning (1)
- verteilte Datenbanken (1)
- virtualisierte IT-Infrastruktur (1)
- visualization (1)
- Änderbarkeit (1)
- öffentliche Cloud Speicherdienste (1)
Institute
- Hasso-Plattner-Institut für Digital Engineering gGmbH (31) (remove)
Organizations try to gain competitive advantages, and to increase customer satisfaction. To ensure the quality and efficiency of their business processes, they perform business process management. An important part of process management that happens on the daily operational level is process controlling. A prerequisite of controlling is process monitoring, i.e., keeping track of the performed activities in running process instances. Only by process monitoring can business analysts detect delays and react to deviations from the expected or guaranteed performance of a process instance. To enable monitoring, process events need to be collected from the process environment. When a business process is orchestrated by a process execution engine, monitoring is available for all orchestrated process activities. Many business processes, however, do not lend themselves to automatic orchestration, e.g., because of required freedom of action. This situation is often encountered in hospitals, where most business processes are manually enacted. Hence, in practice it is often inefficient or infeasible to document and monitor every process activity. Additionally, manual process execution and documentation is prone to errors, e.g., documentation of activities can be forgotten. Thus, organizations face the challenge of process events that occur, but are not observed by the monitoring environment. These unobserved process events can serve as basis for operational process decisions, even without exact knowledge of when they happened or when they will happen. An exemplary decision is whether to invest more resources to manage timely completion of a case, anticipating that the process end event will occur too late. This thesis offers means to reason about unobserved process events in a probabilistic way. We address decisive questions of process managers (e.g., "when will the case be finished?", or "when did we perform the activity that we forgot to document?") in this thesis. As main contribution, we introduce an advanced probabilistic model to business process management that is based on a stochastic variant of Petri nets. We present a holistic approach to use the model effectively along the business process lifecycle. Therefore, we provide techniques to discover such models from historical observations, to predict the termination time of processes, and to ensure quality by missing data management. We propose mechanisms to optimize configuration for monitoring and prediction, i.e., to offer guidance in selecting important activities to monitor. An implementation is provided as a proof of concept. For evaluation, we compare the accuracy of the approach with that of state-of-the-art approaches using real process data of a hospital. Additionally, we show its more general applicability in other domains by applying the approach on process data from logistics and finance.
The data quality of real-world datasets need to be constantly monitored and maintained to allow organizations and individuals to reliably use their data. Especially, data integration projects suffer from poor initial data quality and as a consequence consume more effort and money. Commercial products and research prototypes for data cleansing and integration help users to improve the quality of individual and combined datasets. They can be divided into either standalone systems or database management system (DBMS) extensions. On the one hand, standalone systems do not interact well with DBMS and require time-consuming data imports and exports. On the other hand, DBMS extensions are often limited by the underlying system and do not cover the full set of data cleansing and integration tasks.
We overcome both limitations by implementing a concise set of five data cleansing and integration operators on the parallel data analytics platform Stratosphere. We define the semantics of the operators, present their parallel implementation, and devise optimization techniques for individual operators and combinations thereof. Users specify declarative queries in our query language METEOR with our new operators to improve the data quality of individual datasets or integrate them to larger datasets. By integrating the data cleansing operators into the higher level language layer of Stratosphere, users can easily combine cleansing operators with operators from other domains, such as information extraction, to complex data flows. Through a generic description of the operators, the Stratosphere optimizer reorders operators even from different domains to find better query plans.
As a case study, we reimplemented a part of the large Open Government Data integration project GovWILD with our new operators and show that our queries run significantly faster than the original GovWILD queries, which rely on relational operators. Evaluation reveals that our operators exhibit good scalability on up to 100 cores, so that even larger inputs can be efficiently processed by scaling out to more machines. Finally, our scripts are considerably shorter than the original GovWILD scripts, which results in better maintainability of the scripts.
The term Linked Data refers to connected information sources comprising structured data about a wide range of topics and for a multitude of applications. In recent years, the conceptional and technical foundations of Linked Data have been formalized and refined. To this end, well-known technologies have been established, such as the Resource Description Framework (RDF) as a Linked Data model or the SPARQL Protocol and RDF Query Language (SPARQL) for retrieving this information. Whereas most research has been conducted in the area of generating and publishing Linked Data, this thesis presents novel approaches for improved management. In particular, we illustrate new methods for analyzing and processing SPARQL queries. Here, we present two algorithms suitable for identifying structural relationships between these queries. Both algorithms are applied to a large number of real-world requests to evaluate the performance of the approaches and the quality of their results. Based on this, we introduce different strategies enabling optimized access of Linked Data sources. We demonstrate how the presented approach facilitates effective utilization of SPARQL endpoints by prefetching results relevant for multiple subsequent requests. Furthermore, we contribute a set of metrics for determining technical characteristics of such knowledge bases. To this end, we devise practical heuristics and validate them through thorough analysis of real-world data sources. We discuss the findings and evaluate their impact on utilizing the endpoints. Moreover, we detail the adoption of a scalable infrastructure for improving Linked Data discovery and consumption. As we outline in an exemplary use case, this platform is eligible both for processing and provisioning the corresponding information.
This work introduces concepts and corresponding tool support to enable a complementary approach in dealing with recovery. Programmers need to recover a development state, or a part thereof, when previously made changes reveal undesired implications. However, when the need arises suddenly and unexpectedly, recovery often involves expensive and tedious work. To avoid tedious work, literature recommends keeping away from unexpected recovery demands by following a structured and disciplined approach, which consists of the application of various best practices including working only on one thing at a time, performing small steps, as well as making proper use of versioning and testing tools. However, the attempt to avoid unexpected recovery is both time-consuming and error-prone. On the one hand, it requires disproportionate effort to minimize the risk of unexpected situations. On the other hand, applying recommended practices selectively, which saves time, can hardly avoid recovery. In addition, the constant need for foresight and self-control has unfavorable implications. It is exhaustive and impedes creative problem solving. This work proposes to make recovery fast and easy and introduces corresponding support called CoExist. Such dedicated support turns situations of unanticipated recovery from tedious experiences into pleasant ones. It makes recovery fast and easy to accomplish, even if explicit commits are unavailable or tests have been ignored for some time. When mistakes and unexpected insights are no longer associated with tedious corrective actions, programmers are encouraged to change source code as a means to reason about it, as opposed to making changes only after structuring and evaluating them mentally. This work further reports on an implementation of the proposed tool support in the Squeak/Smalltalk development environment. The development of the tools has been accompanied by regular performance and usability tests. In addition, this work investigates whether the proposed tools affect programmers’ performance. In a controlled lab study, 22 participants improved the design of two different applications. Using a repeated measurement setup, the study examined the effect of providing CoExist on programming performance. The result of analyzing 88 hours of programming suggests that built-in recovery support as provided with CoExist positively has a positive effect on programming performance in explorative programming tasks.
In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this unilateral development is going to cease due to the combination of the following three trends: a) Nowadays network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory inside a server and of a remote server to and even below a single order of magnitude. b) Modern storage systems scale gracefully, are elastic, and provide high-availability. c) A modern storage system such as Stanford's RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main-memory parallel database management system is desirable. The advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible.
This thesis describes building a columnar database on shared main memory-based storage. The thesis discusses the resulting architecture (Part I), the implications on query processing (Part II), and presents an evaluation of the resulting solution in terms of performance, high-availability, and elasticity (Part III).
In our architecture, we use Stanford's RAMCloud as shared-storage, and the self-designed and developed in-memory AnalyticsDB as relational query processor on top. AnalyticsDB encapsulates data access and operator execution via an interface which allows seamless switching between local and remote main memory, while RAMCloud provides not only storage capacity, but also processing power. Combining both aspects allows pushing-down the execution of database operators into the storage system. We describe how the columnar data processed by AnalyticsDB is mapped to RAMCloud's key-value data model and how the performance advantages of columnar data storage can be preserved.
The combination of fast network technology and the possibility to execute database operators in the storage system opens the discussion for site selection. We construct a system model that allows the estimation of operator execution costs in terms of network transfer, data processed in memory, and wall time. This can be used for database operators that work on one relation at a time - such as a scan or materialize operation - to discuss the site selection problem (data pull vs. operator push). Since a database query translates to the execution of several database operators, it is possible that the optimal site selection varies per operator. For the execution of a database operator that works on two (or more) relations at a time, such as a join, the system model is enriched by additional factors such as the chosen algorithm (e.g. Grace- vs. Distributed Block Nested Loop Join vs. Cyclo-Join), the data partitioning of the respective relations, and their overlapping as well as the allowed resource allocation.
We present an evaluation on a cluster with 60 nodes where all nodes are connected via RDMA-enabled network equipment. We show that query processing performance is about 2.4x slower if everything is done via the data pull operator execution strategy (i.e. RAMCloud is being used only for data access) and about 27% slower if operator execution is also supported inside RAMCloud (in comparison to operating only on main memory inside a server without any network communication at all). The fast-crash recovery feature of RAMCloud can be leveraged to provide high-availability, e.g. a server crash during query execution only delays the query response for about one second. Our solution is elastic in a way that it can adapt to changing workloads a) within seconds, b) without interruption of the ongoing query processing, and c) without manual intervention.
Nowadays, software systems are getting more and more complex. To tackle this challenge most diverse techniques, such as design patterns, service oriented architectures (SOA), software development processes, and model-driven engineering (MDE), are used to improve productivity, while time to market and quality of the products stay stable. Multiple of these techniques are used in parallel to profit from their benefits. While the use of sophisticated software development processes is standard, today, MDE is just adopted in practice. However, research has shown that the application of MDE is not always successful. It is not fully understood when advantages of MDE can be used and to what degree MDE can also be disadvantageous for productivity. Further, when combining different techniques that aim to affect the same factor (e.g. productivity) the question arises whether these techniques really complement each other or, in contrast, compensate their effects. Due to that, there is the concrete question how MDE and other techniques, such as software development process, are interrelated. Both aspects (advantages and disadvantages for productivity as well as the interrelation to other techniques) need to be understood to identify risks relating to the productivity impact of MDE. Before studying MDE's impact on productivity, it is necessary to investigate the range of validity that can be reached for the results. This includes two questions. First, there is the question whether MDE's impact on productivity is similar for all approaches of adopting MDE in practice. Second, there is the question whether MDE's impact on productivity for an approach of using MDE in practice remains stable over time. The answers for both questions are crucial for handling risks of MDE, but also for the design of future studies on MDE success. This thesis addresses these questions with the goal to support adoption of MDE in future. To enable a differentiated discussion about MDE, the term MDE setting'' is introduced. MDE setting refers to the applied technical setting, i.e. the employed manual and automated activities, artifacts, languages, and tools. An MDE setting's possible impact on productivity is studied with a focus on changeability and the interrelation to software development processes. This is done by introducing a taxonomy of changeability concerns that might be affected by an MDE setting. Further, three MDE traits are identified and it is studied for which manifestations of these MDE traits software development processes are impacted. To enable the assessment and evaluation of an MDE setting's impacts, the Software Manufacture Model language is introduced. This is a process modeling language that allows to reason about how relations between (modeling) artifacts (e.g. models or code files) change during application of manual or automated development activities. On that basis, risk analysis techniques are provided. These techniques allow identifying changeability risks and assessing the manifestations of the MDE traits (and with it an MDE setting's impact on software development processes). To address the range of validity, MDE settings from practice and their evolution histories were capture in context of this thesis. First, this data is used to show that MDE settings cover the whole spectrum concerning their impact on changeability or interrelation to software development processes. Neither it is seldom that MDE settings are neutral for processes nor is it seldom that MDE settings have impact on processes. Similarly, the impact on changeability differs relevantly. Second, a taxonomy of evolution of MDE settings is introduced. In that context it is discussed to what extent different types of changes on an MDE setting can influence this MDE setting's impact on changeability and the interrelation to processes. The category of structural evolution, which can change these characteristics of an MDE setting, is identified. The captured MDE settings from practice are used to show that structural evolution exists and is common. In addition, some examples of structural evolution steps are collected that actually led to a change in the characteristics of the respective MDE settings. Two implications are: First, the assessed diversity of MDE settings evaluates the need for the analysis techniques that shall be presented in this thesis. Second, evolution is one explanation for the diversity of MDE settings in practice. To summarize, this thesis studies the nature and evolution of MDE settings in practice. As a result support for the adoption of MDE settings is provided in form of techniques for the identification of risks relating to productivity impacts.
Concepts and techniques for integration, analysis and visualization of massive 3D point clouds
(2014)
Remote sensing methods, such as LiDAR and image-based photogrammetry, are established approaches for capturing the physical world. Professional and low-cost scanning devices are capable of generating dense 3D point clouds. Typically, these 3D point clouds are preprocessed by GIS and are then used as input data in a variety of applications such as urban planning, environmental monitoring, disaster management, and simulation. The availability of area-wide 3D point clouds will drastically increase in the future due to the availability of novel capturing methods (e.g., driver assistance systems) and low-cost scanning devices. Applications, systems, and workflows will therefore face large collections of redundant, up-to-date 3D point clouds and have to cope with massive amounts of data. Hence, approaches are required that will efficiently integrate, update, manage, analyze, and visualize 3D point clouds. In this paper, we define requirements for a system infrastructure that enables the integration of 3D point clouds from heterogeneous capturing devices and different timestamps. Change detection and update strategies for 3D point clouds are presented that reduce storage requirements and offer new insights for analysis purposes. We also present an approach that attributes 3D point clouds with semantic information (e.g., object class category information), which enables more effective data processing, analysis, and visualization. Out-of-core real-time rendering techniques then allow for an interactive exploration of the entire 3D point cloud and the corresponding analysis results. Web-based visualization services are utilized to make 3D point clouds available to a large community. The proposed concepts and techniques are designed to establish 3D point clouds as base datasets, as well as rendering primitives for analysis and visualization tasks, which allow operations to be performed directly on the point data. Finally, we evaluate the presented system, report on its applications, and discuss further research challenges.
Software maintenance encompasses any changes made to a software system after its initial deployment and is thereby one of the key phases in the typical software-engineering lifecycle. In software maintenance, we primarily need to understand structural and behavioral aspects, which are difficult to obtain, e.g., by code reading. Software analysis is therefore a vital tool for maintaining these systems: It provides - the preferably automated - means to extract and evaluate information from their artifacts such as software structure, runtime behavior, and related processes. However, such analysis typically results in massive raw data, so that even experienced engineers face difficulties directly examining, assessing, and understanding these data. Among other things, they require tools with which to explore the data if no clear question can be formulated beforehand. For this, software analysis and visualization provide its users with powerful interactive means. These enable the automation of tasks and, particularly, the acquisition of valuable and actionable insights into the raw data. For instance, one means for exploring runtime behavior is trace visualization. This thesis aims at extending and improving the tool set for visual software analysis by concentrating on several open challenges in the fields of dynamic and static analysis of software systems. This work develops a series of concepts and tools for the exploratory visualization of the respective data to support users in finding and retrieving information on the system artifacts concerned. This is a difficult task, due to the lack of appropriate visualization metaphors; in particular, the visualization of complex runtime behavior poses various questions and challenges of both a technical and conceptual nature. This work focuses on a set of visualization techniques for visually representing control-flow related aspects of software traces from shared-memory software systems: A trace-visualization concept based on icicle plots aids in understanding both single-threaded as well as multi-threaded runtime behavior on the function level. The concept’s extensibility further allows the visualization and analysis of specific aspects of multi-threading such as synchronization, the correlation of such traces with data from static software analysis, and a comparison between traces. Moreover, complementary techniques for simultaneously analyzing system structures and the evolution of related attributes are proposed. These aim at facilitating long-term planning of software architecture and supporting management decisions in software projects by extensions to the circular-bundle-view technique: An extension to 3-dimensional space allows for the use of additional variables simultaneously; interaction techniques allow for the modification of structures in a visual manner. The concepts and techniques presented here are generic and, as such, can be applied beyond software analysis for the visualization of similarly structured data. The techniques' practicability is demonstrated by several qualitative studies using subject data from industry-scale software systems. The studies provide initial evidence that the techniques' application yields useful insights into the subject data and its interrelationships in several scenarios.
A growing number of enterprises use complex event processing for monitoring and controlling their operations, while business process models are used to document working procedures. In this work, we propose a comprehensive method for complex event processing optimization using business process models. Our proposed method is based on the extraction of behaviorial constraints that are used, in turn, to rewrite patterns for event detection, and select and transform execution plans. We offer a set of rewriting rules that is shown to be complete with respect to the all, seq, and any patterns. The effectiveness of our method is demonstrated in an experimental evaluation with a large number of processes from an insurance company. We illustrate that the proposed optimization leads to significant savings in query processing. By integrating the optimization in state-of-the-art systems for event pattern matching, we demonstrate that these savings materialize in different technical infrastructures and can be combined with existing optimization techniques.
Virtual 3D city models serve as integration platforms for complex geospatial and georeferenced information and as medium for effective communication of spatial information. In order to explore these information spaces, navigation techniques for controlling the virtual camera are required to facilitate wayfinding and movement. However, navigation is not a trivial task and many available navigation techniques do not support users effectively and efficiently with their respective skills and tasks. In this article, we present an assisting, constrained navigation technique for multiscale virtual 3D city models that is based on three basic principles: users point to navigate, users are lead by suggestions, and the exploitation of semantic, multiscale, hierarchical structurings of city models. The technique particularly supports users with low navigation and virtual camera control skills but is also valuable for experienced users. It supports exploration, search, inspection, and presentation tasks, is easy to learn and use, supports orientation, is efficient, and yields effective view properties. In particular, the technique is suitable for interactive kiosks and mobile devices with a touch display and low computing resources and for use in mobile situations where users only have restricted resources for operating the application. We demonstrate the validity of the proposed navigation technique by presenting an implementation and evaluation results. The implementation is based on service-oriented architectures, standards, and image-based representations and allows exploring massive virtual 3D city models particularly on mobile devices with limited computing resources. Results of a user study comparing the proposed navigation technique with standard techniques suggest that the proposed technique provides the targeted properties, and that it is more advantageous to novice than to expert users.
Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.
Model transformation is one of the key tasks in model-driven engineering and relies on the efficient matching and modification of graph-based data structures; its sibling graph rewriting has been used to successfully model problems in a variety of domains. Over the last years, a wide range of graph and model transformation tools have been developed all of them with their own particular strengths and typical application domains. In this paper, we give a survey and a comparison of the model and graph transformation tools that participated at the Transformation Tool Contest 2011. The reader gains an overview of the field and its tools, based on the illustrative solutions submitted to a Hello World task, and a comparison alongside a detailed taxonomy. The article is of interest to researchers in the field of model and graph transformation, as well as to software engineers with a transformation task at hand who have to choose a tool fitting to their needs. All solutions referenced in this article provide a SHARE demo. It supported the peer-review process for the contest, and now allows the reader to test the tools online.
Nested application conditions generalise the well-known negative application conditions and are important for several application domains. In this paper, we present Local Church-Rosser, Parallelism, Concurrency and Amalgamation Theorems for rules with nested application conditions in the framework of M-adhesive categories, where M-adhesive categories are slightly more general than weak adhesive high-level replacement categories. Most of the proofs are based on the corresponding statements for rules without application conditions and two shift lemmas stating that nested application conditions can be shifted over morphisms and rules.
With the growth of virtualization and cloud computing, more and more forensic investigations rely on being able to perform live forensics on a virtual machine using virtual machine introspection (VMI). Inspecting a virtual machine through its hypervisor enables investigation without risking contamination of the evidence, crashing the computer, etc. To further access to these techniques for the investigator/researcher we have developed a new VMI monitoring language. This language is based on a review of the most commonly used VMI-techniques to date, and it enables the user to monitor the virtual machine's memory, events and data streams. A prototype implementation of our monitoring system was implemented in KVM, though implementation on any hypervisor that uses the common x86 virtualization hardware assistance support should be straightforward. Our prototype outperforms the proprietary VMWare VProbes in many cases, with a maximum performance loss of 18% for a realistic test case, which we consider acceptable. Our implementation is freely available under a liberal software distribution license. (C) 2014 Digital Forensics Research Workshop. Published by Elsevier Ltd. All rights reserved.
This article addresses the transformation of a process model with an arbitrary topology into an equivalent structured process model. In particular, this article studies the subclass of process models that have no equivalent well-structured representation but which, nevertheless, can be partially structured into their maximally-structured representation. The transformations are performed under a behavioral equivalence notion that preserves the observed concurrency of tasks in equivalent process models. The article gives a full characterization of the subclass of acyclic process models that have no equivalent well-structured representation, but do have an equivalent maximally-structured one, as well as proposes a complete structuring method. Together with our previous results, this article completes the solution of the process model structuring problem for the class of acyclic process models.
The development of self-adaptive software requires the engineering of an adaptation engine that controls the underlying adaptable software by feedback loops. The engine often describes the adaptation by runtime models representing the adaptable software and by activities such as analysis and planning that use these models. To systematically address the interplay between runtime models and adaptation activities, runtime megamodels have been proposed. A runtime megamodel is a specific model capturing runtime models and adaptation activities. In this article, we go one step further and present an executable modeling language for ExecUtable RuntimE MegAmodels (EUREMA) that eases the development of adaptation engines by following a model-driven engineering approach. We provide a domain-specific modeling language and a runtime interpreter for adaptation engines, in particular feedback loops. Megamodels are kept alive at runtime and by interpreting them, they are directly executed to run feedback loops. Additionally, they can be dynamically adjusted to adapt feedback loops. Thus, EUREMA supports development by making feedback loops explicit at a higher level of abstraction and it enables solutions where multiple feedback loops interact or operate on top of each other and self-adaptation co-exists with offline adaptation for evolution.
Modern 3D geovisualization systems (3DGeoVSs) are complex and evolving systems that are required to be adaptable and leverage distributed resources, including massive geodata. This article focuses on 3DGeoVSs built based on the principles of service-oriented architectures, standards and image-based representations (SSI) to address practically relevant challenges and potentials. Such systems facilitate resource sharing and agile and efficient system construction and change in an interoperable manner, while exploiting images as efficient, decoupled and interoperable representations. The software architecture of a 3DGeoVS and its underlying visualization model have strong effects on the system's quality attributes and support various system life cycle activities. This article contributes a software reference architecture (SRA) for 3DGeoVSs based on SSI that can be used to design, describe and analyze concrete software architectures with the intended primary benefit of an increase in effectiveness and efficiency in such activities. The SRA integrates existing, proven technology and novel contributions in a unique manner. As the foundation for the SRA, we propose the generalized visualization pipeline model that generalizes and overcomes expressiveness limitations of the prevalent visualization pipeline model. To facilitate exploiting image-based representations (IReps), the SRA integrates approaches for the representation, provisioning and styling of and interaction with IReps. Five applications of the SRA provide proofs of concept for the general applicability and utility of the SRA. A qualitative evaluation indicates the overall suitability of the SRA, its applications and the general approach of building 3DGeoVSs based on SSI.
Linked Open Data (LOD) comprises very many and often large public data sets and knowledge bases. Those datasets are mostly presented in the RDF triple structure of subject, predicate, and object, where each triple represents a statement or fact. Unfortunately, the heterogeneity of available open data requires significant integration steps before it can be used in applications. Meta information, such as ontological definitions and exact range definitions of predicates, are desirable and ideally provided by an ontology. However in the context of LOD, ontologies are often incomplete or simply not available. Thus, it is useful to automatically generate meta information, such as ontological dependencies, range definitions, and topical classifications. Association rule mining, which was originally applied for sales analysis on transactional databases, is a promising and novel technique to explore such data. We designed an adaptation of this technique for min-ing Rdf data and introduce the concept of “mining configurations”, which allows us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that in combination result in interesting use cases. To this end, we present rule-based approaches for auto-completion, data enrichment, ontology improvement, and query relaxation. Auto-completion remedies the problem of inconsistent ontology usage, providing an editing user with a sorted list of commonly used predicates. A combination of different configurations step extends this approach to create completely new facts for a knowledge base. We present two approaches for fact generation, a user-based approach where a user selects the entity to be amended with new facts and a data-driven approach where an algorithm discovers entities that have to be amended with missing facts. As knowledge bases constantly grow and evolve, another approach to improve the usage of RDF data is to improve existing ontologies. Here, we present an association rule based approach to reconcile ontology and data. Interlacing different mining configurations, we infer an algorithm to discover synonymously used predicates. Those predicates can be used to expand query results and to support users during query formulation. We provide a wide range of experiments on real world datasets for each use case. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the appropriateness of our mining configuration methodology.