TY  - THES
A1  - Heise, Arvid
T1  - Data cleansing and integration operators for a parallel data analytics platform
T1  - Datenreinigungs- und Integrationsoperatoren für ein
paralles Datenanalyseframework
N2  - The data quality of real-world datasets need to be constantly monitored and maintained to allow organizations and individuals to reliably use their data. Especially, data integration projects suffer from poor initial data quality and as a consequence consume more effort and money. Commercial products and research prototypes for data cleansing and integration help users to improve the quality of individual and combined datasets. They can be divided into either standalone systems or database management system (DBMS) extensions. On the one hand, standalone systems do not interact well with DBMS and require time-consuming data imports and exports. On the other hand, DBMS extensions are often limited by the underlying system and do not cover the full set of data cleansing and integration tasks.

We overcome both limitations by implementing a concise set of five data cleansing and integration operators on the parallel data analytics platform Stratosphere. We define the semantics of the operators, present their parallel implementation, and devise optimization techniques for individual operators and combinations thereof. Users specify declarative queries in our query language METEOR with our new operators to improve the data quality of individual datasets or integrate them to larger datasets. By integrating the data cleansing operators into the higher level language layer of Stratosphere, users can easily combine cleansing operators with operators from other domains, such as information extraction, to complex data flows. Through a generic description of the operators, the Stratosphere optimizer reorders operators even from different domains to find better query plans.

As a case study, we reimplemented a part of the large Open Government Data integration project GovWILD with our new operators and show that our queries run significantly faster than the original GovWILD queries, which rely on relational operators. Evaluation reveals that our operators exhibit good scalability on up to 100 cores, so that even larger inputs can be efficiently processed by scaling out to more machines. Finally, our scripts are considerably shorter than the original GovWILD scripts, which results in better maintainability of the scripts.
N2  - Die Datenqualität von Realweltdaten muss ständig überwacht und gewartet werden, damit Organisationen und Individuen ihre Daten verlässlich nutzen können. Besonders Datenintegrationsprojekte leiden unter schlechter Datenqualität in den Quelldaten und benötigen somit mehr Zeit und Geld. Kommerzielle Produkte und Forschungsprototypen helfen Nutzern die Qualität in einzelnen und kombinierten Datensätzen zu verbessern. Die Systeme können in selbständige Systeme und Erweiterungen von bestehenden Datenbankmanagementsystemen (DBMS) unterteilt werden. Auf der einen Seite interagieren selbständige Systeme nicht gut mit DBMS und brauchen zeitaufwändigen Datenimport und -export. Auf der anderen Seite sind die DBMS Erweiterungen häufig durch das unterliegende System limitiert und unterstützen nicht die gesamte Bandbreite an Datenreinigungs- und -integrationsaufgaben.

Wir überwinden beide Limitationen, indem wir eine Menge von häufig benötigten Datenreinigungs- und Datenintegrationsoperatoren direkt in der parallelen Datenanalyseplattform Stratosphere implementieren. Wir definieren die Semantik der Operatoren, präsentieren deren parallele Implementierung und entwickeln Optimierungstechniken für die einzelnen und mehrere Operatoren. Nutzer können deklarative Anfragen in unserer Anfragesprache METEOR mit unseren neuen Operatoren formulieren, um die Datenqualität von einzelnen Datensätzen zu erhöhen, oder um sie zu größeren Datensätzen zu integrieren. Durch die Integration der Operatoren in die Hochsprachenschicht von Stratosphere können Nutzer Datenreinigungsoperatoren einfach mit Operatoren aus anderen Domänen wie Informationsextraktion zu komplexen Datenflüssen kombinieren. Da Stratosphere Operatoren durch generische Beschreibungen in den Optimierer integriert werden, ist es für den Optimierer sogar möglich Operatoren unterschiedlicher Domänen zu vertauschen, um besseren Anfrageplänen zu ermitteln.

Für eine Fallstudie haben wir Teile des großen Datenintegrationsprojektes GovWILD auf Stratosphere mit den neuen Operatoren nachimplementiert und zeigen, dass unsere Anfragen signifikant schneller laufen als die originalen GovWILD Anfragen, die sich auf relationale Operatoren verlassen. Die Evaluation zeigt, dass unsere Operatoren gut auf bis zu 100 Kernen skalieren, sodass sogar größere Datensätze effizient verarbeitet werden können, indem die Anfragen auf mehr Maschinen ausgeführt werden. Schließlich sind unsere Skripte erheblich kürzer als die originalen GovWILD Skripte, was in besserer Wartbarkeit unserer Skripte resultiert.
KW  - data
KW  - cleansing
KW  - holistic
KW  - parallel
KW  - map reduce
KW  - Datenreinigung
KW  - Datenintegration
KW  - ganzheitlich
KW  - parallel
KW  - map reduce
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-77100
ER  - 
TY  - THES
A1  - Steinert, Bastian
T1  - Built-in recovery support for explorative programming
T1  - Eingebaute Unterstützung für Wiederherstellungsbedürfnisse für unstrukturierte ergebnisoffene Programmieraufgaben
BT  - preserving immediate access to static and dynamic information of intermediate development states
BT  - Erhaltung des unmittelbaren Zugriffs auf statische und dynamische Informationen von Entwicklungszwischenständen
N2  - This work introduces concepts and corresponding tool support to enable a complementary approach in dealing with recovery. Programmers need to recover a development state, or a part thereof, when previously made changes reveal undesired implications. However, when the need arises suddenly and unexpectedly, recovery often involves expensive and tedious work. To avoid tedious work, literature recommends keeping away from unexpected recovery demands by following a structured and disciplined approach, which consists of the application of various best practices including working only on one thing at a time, performing small steps, as well as making proper use of versioning and testing tools. However, the attempt to avoid unexpected recovery is both time-consuming and error-prone. On the one hand, it requires disproportionate effort to minimize the risk of unexpected situations. On the other hand, applying recommended practices selectively, which saves time, can hardly avoid recovery. In addition, the constant need for foresight and self-control has unfavorable implications. It is exhaustive and impedes creative problem solving. This work proposes to make recovery fast and easy and introduces corresponding support called CoExist. Such dedicated support turns situations of unanticipated recovery from tedious experiences into pleasant ones. It makes recovery fast and easy to accomplish, even if explicit commits are unavailable or tests have been ignored for some time. When mistakes and unexpected insights are no longer associated with tedious corrective actions, programmers are encouraged to change source code as a means to reason about it, as opposed to making changes only after structuring and evaluating them mentally. This work further reports on an implementation of the proposed tool support in the Squeak/Smalltalk development environment. The development of the tools has been accompanied by regular performance and usability tests. In addition, this work investigates whether the proposed tools affect programmers’ performance. In a controlled lab study, 22 participants improved the design of two different applications. Using a repeated measurement setup, the study examined the effect of providing CoExist on programming performance. The result of analyzing 88 hours of programming suggests that built-in recovery support as provided with CoExist positively has a positive effect on programming performance in explorative programming tasks.
N2  - Diese Arbeit präsentiert Konzepte und die zugehörige Werkzeugunterstützung um einen komplementären Umgang mit Wiederherstellungsbedürfnissen zu ermöglichen. Programmierer haben Bedarf zur Wiederherstellung eines früheren Entwicklungszustandes oder Teils davon, wenn ihre Änderungen ungewünschte Implikationen aufzeigen. Wenn dieser Bedarf plötzlich und unerwartet auftritt, dann ist die notwendige Wiederherstellungsarbeit häufig mühsam und aufwendig. Zur Vermeidung mühsamer Arbeit empfiehlt die Literatur die Vermeidung von unerwarteten Wiederherstellungsbedürfnissen durch einen strukturierten und disziplinierten Programmieransatz, welcher die Verwendung verschiedener bewährter Praktiken vorsieht. Diese Praktiken sind zum Beispiel: nur an einer Sache gleichzeitig zu arbeiten, immer nur kleine Schritte auszuführen, aber auch der sachgemäße Einsatz von Versionskontroll- und Testwerkzeugen. Jedoch ist der Versuch des Abwendens unerwarteter Wiederherstellungsbedürfnisse sowohl zeitintensiv als auch fehleranfällig. Einerseits erfordert es unverhältnismäßig hohen Aufwand, das Risiko des Eintretens unerwarteter Situationen auf ein Minimum zu reduzieren. Andererseits ist eine zeitsparende selektive Ausführung der empfohlenen Praktiken kaum hinreichend, um Wiederherstellungssituationen zu vermeiden. Zudem bringt die ständige Notwendigkeit an Voraussicht und Selbstkontrolle Nachteile mit sich. Dies ist ermüdend und erschwert das kreative Problemlösen. Diese Arbeit schlägt vor, Wiederherstellungsaufgaben zu vereinfachen und beschleunigen, und stellt entsprechende Werkzeugunterstützung namens CoExist vor. Solche zielgerichtete Werkzeugunterstützung macht aus unvorhergesehenen mühsamen Wiederherstellungssituationen eine konstruktive Erfahrung. Damit ist Wiederherstellung auch dann leicht und schnell durchzuführen, wenn explizit gespeicherte Zwischenstände fehlen oder die Tests für einige Zeit ignoriert wurden. Wenn Fehler und unerwartete Ein- sichten nicht länger mit mühsamen Schadensersatz verbunden sind, fühlen sich Programmierer eher dazu ermutig, Quelltext zu ändern, um dabei darüber zu reflektieren, und nehmen nicht erst dann Änderungen vor, wenn sie diese gedanklich strukturiert und evaluiert haben. Diese Arbeit berichtet weiterhin von einer Implementierung der vorgeschlagenen Werkzeugunterstützung in der Squeak/Smalltalk Entwicklungsumgebung. Regelmäßige Tests von Laufzeitverhalten und Benutzbarkeit begleiteten die Entwicklung. Zudem prüft die Arbeit, ob sich die Verwendung der vorgeschlagenen Werkzeuge auf die Leistung der Programmierer auswirkt. In einem kontrollierten Experiment, verbesserten 22 Teilnehmer den Aufbau von zwei verschiedenen Anwendungen. Unter der Verwendung einer Versuchsanordnung mit wiederholter Messung, ermittelte die Studie die Auswirkung von CoExist auf die Programmierleistung. Das Ergebnis der Analyse von 88 Programmierstunden deutet darauf hin, dass sich eingebaute Werkzeugunterstützung für Wiederherstellung, wie sie mit CoExist bereitgestellt wird, positiv bei der Bearbeitung von unstrukturierten ergebnisoffenen Programmieraufgaben auswirkt.
KW  - Softwaretechnik
KW  - Entwicklungswerkzeuge
KW  - Versionierung
KW  - Testen
KW  - software engineering
KW  - development tools
KW  - versioning
KW  - testing
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-71305
ER  - 
TY  - THES
A1  - Tinnefeld, Christian
T1  - Building a columnar database on shared main memory-based storage
BT  - database operator placement in a shared main memory-based storage system that supports data access and code execution
N2  - In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this unilateral development is going to cease due to the combination of the following three trends: a) Nowadays network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory inside a server and of a remote server to and even below a single order of magnitude. b) Modern storage systems scale gracefully, are elastic, and provide high-availability. c) A modern storage system such as Stanford's RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main-memory parallel database management system is desirable. The advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible.

This thesis describes building a columnar database on shared main memory-based storage. The thesis discusses the resulting architecture (Part I), the implications on query processing (Part II), and presents an evaluation of the resulting solution in terms of performance, high-availability, and elasticity (Part III).

In our architecture, we use Stanford's RAMCloud as shared-storage, and the self-designed and developed in-memory AnalyticsDB as relational query processor on top. AnalyticsDB encapsulates data access and operator execution via an interface which allows seamless switching between local and remote main memory, while RAMCloud provides not only storage capacity, but also processing power. Combining both aspects allows pushing-down the execution of database operators into the storage system. We describe how the columnar data processed by AnalyticsDB is mapped to RAMCloud's key-value data model and how the performance advantages of columnar data storage can be preserved.

The combination of fast network technology and the possibility to execute database operators in the storage system opens the discussion for site selection. We construct a system model that allows the estimation of operator execution costs in terms of network transfer, data processed in memory, and wall time. This can be used for database operators that work on one relation at a time - such as a scan or materialize operation - to discuss the site selection problem (data pull vs. operator push). Since a database query translates to the execution of several database operators, it is possible that the optimal site selection varies per operator. For the execution of a database operator that works on two (or more) relations at a time, such as a join, the system model is enriched by additional factors such as the chosen algorithm (e.g. Grace- vs. Distributed Block Nested Loop Join vs. Cyclo-Join), the data partitioning of the respective relations, and their overlapping as well as the allowed resource allocation.

We present an evaluation on a cluster with 60 nodes where all nodes are connected via RDMA-enabled network equipment. We show that query processing performance is about 2.4x slower if everything is done via the data pull operator execution strategy (i.e. RAMCloud is being used only for data access) and about 27% slower if operator execution is also supported inside RAMCloud (in comparison to operating only on main memory inside a server without any network communication at all). The fast-crash recovery feature of RAMCloud can be leveraged to provide high-availability, e.g. a server crash during query execution only delays the query response for about one second. Our solution is elastic in a way that it can adapt to changing workloads a) within seconds, b) without interruption of the ongoing query processing, and c) without manual intervention.
N2  - Diese Arbeit beschreibt die Erstellung einer spalten-orientierten Datenbank auf einem geteilten, Hauptspeicher-basierenden Speichersystem. Motiviert wird diese Arbeit durch drei Faktoren. Erstens ist moderne Netzwerktechnologie mit “Remote Direct Memory Access” (RDMA) ausgestattet. Dies reduziert den Unterschied hinsichtlich Latenz und Durchsatz zwischen dem Speicherzugriff innerhalb eines Rechners und auf einen entfernten Rechner auf eine Größenordnung. Zweitens skalieren moderne Speichersysteme, sind elastisch und hochverfügbar. Drittens hält ein modernes Speichersystem wie Stanford's RAMCloud alle Daten im Hauptspeicher vor. Diese Eigenschaften im Kontext einer spalten-orientierten Datenbank zu nutzen ist erstrebenswert. Die Arbeit ist in drei Teile untergliedert. Der erste Teile beschreibt die Architektur einer spalten-orientierten Datenbank auf einem geteilten, Hauptspeicher-basierenden Speichersystem. Hierbei werden die im Rahmen dieser Arbeit entworfene und entwickelte Datenbank AnalyticsDB sowie Stanford's RAMCloud verwendet. Die Architektur beschreibt wie Datenzugriff und Operatorausführung gekapselt werden um nahtlos zwischen lokalem und entfernten Hauptspeicher wechseln zu können. Weiterhin wird die Ablage der nach einem relationalen Schema formatierten Daten von AnalyticsDB in RAMCloud behandelt, welches mit einem Schlüssel-Wertpaar Datenmodell operiert. Der zweite Teil fokussiert auf die Implikationen bei der Abarbeitung von Datenbankanfragen. Hier steht die Diskussion im Vordergrund wo (entweder in AnalyticsDB oder in RAMCloud) und mit welcher Parametrisierung einzelne Datenbankoperationen ausgeführt werden. Dafür werden passende Kostenmodelle vorgestellt, welche die Abbildung von Datenbankoperationen ermöglichen, die auf einer oder mehreren Relationen arbeiten. Der dritte Teil der Arbeit präsentiert eine Evaluierung auf einem Verbund von 60 Rechnern hinsichtlich der Leistungsfähigkeit, der Hochverfügbarkeit und der Elastizität vom System.
T2  - Die Erstellung einer spaltenorientierten Datenbank auf einem verteilten, Hauptspeicher-basierenden Speichersystem
KW  - computer science
KW  - database technology
KW  - main memory computing
KW  - cloud computing
KW  - verteilte Datenbanken
KW  - Hauptspeicher Technologie
KW  - virtualisierte IT-Infrastruktur
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-72063
ER  - 
TY  - JOUR
A1  - Richter, Rico
A1  - Döllner, Jürgen Roland Friedrich
T1  - Concepts and techniques for integration, analysis and visualization of massive 3D point clouds
JF  - Computers, environment and urban systems
N2  - Remote sensing methods, such as LiDAR and image-based photogrammetry, are established approaches for capturing the physical world. Professional and low-cost scanning devices are capable of generating dense 3D point clouds. Typically, these 3D point clouds are preprocessed by GIS and are then used as input data in a variety of applications such as urban planning, environmental monitoring, disaster management, and simulation. The availability of area-wide 3D point clouds will drastically increase in the future due to the availability of novel capturing methods (e.g., driver assistance systems) and low-cost scanning devices. Applications, systems, and workflows will therefore face large collections of redundant, up-to-date 3D point clouds and have to cope with massive amounts of data. Hence, approaches are required that will efficiently integrate, update, manage, analyze, and visualize 3D point clouds. In this paper, we define requirements for a system infrastructure that enables the integration of 3D point clouds from heterogeneous capturing devices and different timestamps. Change detection and update strategies for 3D point clouds are presented that reduce storage requirements and offer new insights for analysis purposes. We also present an approach that attributes 3D point clouds with semantic information (e.g., object class category information), which enables more effective data processing, analysis, and visualization. Out-of-core real-time rendering techniques then allow for an interactive exploration of the entire 3D point cloud and the corresponding analysis results. Web-based visualization services are utilized to make 3D point clouds available to a large community. The proposed concepts and techniques are designed to establish 3D point clouds as base datasets, as well as rendering primitives for analysis and visualization tasks, which allow operations to be performed directly on the point data. Finally, we evaluate the presented system, report on its applications, and discuss further research challenges.
KW  - 3D point clouds
KW  - System architecture
KW  - Classification
KW  - Out-of-core
KW  - Visualization
Y1  - 2014
U6  - https://doi.org/10.1016/j.compenvurbsys.2013.07.004
SN  - 0198-9715
SN  - 1873-7587
VL  - 45
SP  - 114
EP  - 124
PB  - Elsevier
CY  - Oxford
ER  - 
TY  - THES
A1  - Trümper, Jonas
T1  - Visualization techniques for the analysis of software behavior and related structures
T1  - Visualisierungstechniken für die Analyse von Softwareverhalten und verwandter Strukturen
N2  - Software maintenance encompasses any changes made to a software system after its initial deployment and is thereby one of the key phases in the typical software-engineering lifecycle. In software maintenance, we primarily need to understand structural and behavioral aspects, which are difficult to obtain, e.g., by code reading. Software analysis is therefore a vital tool for maintaining these systems: It provides - the preferably automated - means to extract and evaluate information from their artifacts such as software structure, runtime behavior, and related processes. However, such analysis typically results in massive raw data, so that even experienced engineers face difficulties directly examining, assessing, and understanding these data. Among other things, they require tools with which to explore the data if no clear question can be formulated beforehand. For this, software analysis and visualization provide its users with powerful interactive means. These enable the automation of tasks and, particularly, the acquisition of valuable and actionable insights into the raw data. For instance, one means for exploring runtime behavior is trace visualization. This thesis aims at extending and improving the tool set for visual software analysis by concentrating on several open challenges in the fields of dynamic and static analysis of software systems. This work develops a series of concepts and tools for the exploratory visualization of the respective data to support users in finding and retrieving information on the system artifacts concerned. This is a difficult task, due to the lack of appropriate visualization metaphors; in particular, the visualization of complex runtime behavior poses various questions and challenges of both a technical and conceptual nature. This work focuses on a set of visualization techniques for visually representing control-flow related aspects of software traces from shared-memory software systems: A trace-visualization concept based on icicle plots aids in understanding both single-threaded as well as multi-threaded runtime behavior on the function level. The concept’s extensibility further allows the visualization and analysis of specific aspects of multi-threading such as synchronization, the correlation of such traces with data from static software analysis, and a comparison between traces. Moreover, complementary techniques for simultaneously analyzing system structures and the evolution of related attributes are proposed. These aim at facilitating long-term planning of software architecture and supporting management decisions in software projects by extensions to the circular-bundle-view technique: An extension to 3-dimensional space allows for the use of additional variables simultaneously; interaction techniques allow for the modification of structures in a visual manner. The concepts and techniques presented here are generic and, as such, can be applied beyond software analysis for the visualization of similarly structured data. The techniques' practicability is demonstrated by several qualitative studies using subject data from industry-scale software systems. The studies provide initial evidence that the techniques' application yields useful insights into the subject data and its interrelationships in several scenarios.
N2  - Die Softwarewartung umfasst alle Änderungen an einem Softwaresystem nach dessen initialer Bereitstellung und stellt damit eine der wesentlichen Phasen im typischen Softwarelebenszyklus dar. In der Softwarewartung müssen wir insbesondere strukturelle und verhaltensbezogene Aspekte verstehen, welche z.B. alleine durch Lesen von Quelltext schwer herzuleiten sind. Die Softwareanalyse ist daher ein unverzichtbares Werkzeug zur Wartung solcher Systeme: Sie bietet - vorzugsweise automatisierte - Mittel, um Informationen über deren Artefakte, wie Softwarestruktur, Laufzeitverhalten und verwandte Prozesse, zu extrahieren und zu evaluieren. Eine solche Analyse resultiert jedoch typischerweise in großen und größten Rohdaten, die selbst erfahrene Softwareingenieure direkt nur schwer untersuchen, bewerten und verstehen können. Unter Anderem dann, wenn vorab keine klare Frage formulierbar ist, benötigen sie Werkzeuge, um diese Daten zu erforschen. Hierfür bietet die Softwareanalyse und Visualisierung ihren Nutzern leistungsstarke, interaktive Mittel. Diese ermöglichen es Aufgaben zu automatisieren und insbesondere wertvolle und belastbare Einsichten aus den Rohdaten zu erlangen. Beispielsweise ist die Visualisierung von Software-Traces ein Mittel, um das Laufzeitverhalten eines Systems zu ergründen. Diese Arbeit zielt darauf ab, den "Werkzeugkasten" der visuellen Softwareanalyse zu erweitern und zu verbessern, indem sie sich auf bestimmte, offene Herausforderungen in den Bereichen der dynamischen und statischen Analyse von Softwaresystemen konzentriert. Die Arbeit entwickelt eine Reihe von Konzepten und Werkzeugen für die explorative Visualisierung der entsprechenden Daten, um Nutzer darin zu unterstützen, Informationen über betroffene Systemartefakte zu lokalisieren und zu verstehen. Da es insbesondere an geeigneten Visualisierungsmetaphern mangelt, ist dies eine schwierige Aufgabe. Es bestehen, insbesondere bei komplexen Softwaresystemen, verschiedenste offene technische sowie konzeptionelle Fragestellungen und Herausforderungen. Diese Arbeit konzentriert sich auf Techniken zur visuellen Darstellung kontrollflussbezogener Aspekte aus Software-Traces von Shared-Memory Softwaresystemen: Ein Trace-Visualisierungskonzept, basierend auf Icicle Plots, unterstützt das Verstehen von single- und multi-threaded Laufzeitverhalten auf Funktionsebene. Die Erweiterbarkeit des Konzepts ermöglicht es zudem spezifische Aspekte des Multi-Threading, wie Synchronisation, zu visualisieren und zu analysieren, derartige Traces mit Daten aus der statischen Softwareanalyse zu korrelieren sowie Traces mit einander zu vergleichen. Darüber hinaus werden komplementäre Techniken für die kombinierte Analyse von Systemstrukturen und der Evolution zugehöriger Eigenschaften vorgestellt. Diese zielen darauf ab, die Langzeitplanung von Softwarearchitekturen und Management-Entscheidungen in Softwareprojekten mittels Erweiterungen an der Circular-Bundle-View-Technik zu unterstützen: Eine Erweiterung auf den 3-dimensionalen Raum ermöglicht es zusätzliche visuelle Variablen zu nutzen; Strukturen können mithilfe von Interaktionstechniken visuell bearbeitet werden. Die gezeigten Techniken und Konzepte sind allgemein verwendbar und lassen sich daher auch jenseits der Softwareanalyse einsetzen, um ähnlich strukturierte Daten zu visualisieren. Mehrere qualitative Studien an Softwaresystemen in industriellem Maßstab stellen die Praktikabilität der Techniken dar. Die Ergebnisse sind erste Belege dafür, dass die Anwendung der Techniken in verschiedenen Szenarien nützliche Einsichten in die untersuchten Daten und deren Zusammenhänge liefert.
KW  - Visualisierung
KW  - Softwarewartung
KW  - Softwareanalyse
KW  - Softwarevisualisierung
KW  - Laufzeitverhalten
KW  - visualization
KW  - software maintenance
KW  - software analysis
KW  - software visualization
KW  - runtime behavior
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-72145
ER  - 
TY  - GEN
A1  - Hildebrandt, Dieter
T1  - A software reference architecture for service-oriented 3D geovisualization systems
T2  - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2  - Modern 3D geovisualization systems (3DGeoVSs) are complex and evolving systems that are required to be adaptable and leverage distributed resources, including massive geodata. This article focuses on 3DGeoVSs built based on the principles of service-oriented architectures, standards and image-based representations (SSI) to address practically relevant challenges and potentials. Such systems facilitate resource sharing and agile and efficient system construction and change in an interoperable manner, while exploiting images as efficient, decoupled and interoperable representations. The software architecture of a 3DGeoVS and its underlying visualization model have strong effects on the system's quality attributes and support various system life cycle activities. This article contributes a software reference architecture (SRA) for 3DGeoVSs based on SSI that can be used to design, describe and analyze concrete software architectures with the intended primary benefit of an increase in effectiveness and efficiency in such activities. The SRA integrates existing, proven technology and novel contributions in a unique manner. As the foundation for the SRA, we propose the generalized visualization pipeline model that generalizes and overcomes expressiveness limitations of the prevalent visualization pipeline model. To facilitate exploiting image-based representations (IReps), the SRA integrates approaches for the representation, provisioning and styling of and interaction with IReps. Five applications of the SRA provide proofs of concept for the general applicability and utility of the SRA. A qualitative evaluation indicates the overall suitability of the SRA, its applications and the general approach of building 3DGeoVSs based on SSI.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1131 
KW  - 3D geovisualization
KW  - software reference architecture
KW  - spatial data infrastructure
KW  - service-oriented architecture
KW  - standardization
KW  - image-based representation
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-475831
SN  - 1866-8372
IS  - 1131
ER  - 
TY  - JOUR
A1  - Marr, Stefan
A1  - Pape, Tobias
A1  - De Meuter, Wolfgang
T1  - Are we there yet? Simple language implementation techniques for the 21st century
JF  - IEEE software
Y1  - 2014
SN  - 0740-7459
SN  - 1937-4194
VL  - 31
IS  - 5
SP  - 60
EP  - 67
PB  - Inst. of Electr. and Electronics Engineers
CY  - Los Alamitos
ER  - 
TY  - JOUR
A1  - Weidlich, Matthias
A1  - Ziekow, Holger
A1  - Gal, Avigdor
A1  - Mendling, Jan
A1  - Weske, Mathias
T1  - Optimizing event pattern matching using business process models
JF  - IEEE transactions on knowledge and data engineering
N2  - A growing number of enterprises use complex event processing for monitoring and controlling their operations, while business process models are used to document working procedures. In this work, we propose a comprehensive method for complex event processing optimization using business process models. Our proposed method is based on the extraction of behaviorial constraints that are used, in turn, to rewrite patterns for event detection, and select and transform execution plans. We offer a set of rewriting rules that is shown to be complete with respect to the all, seq, and any patterns. The effectiveness of our method is demonstrated in an experimental evaluation with a large number of processes from an insurance company. We illustrate that the proposed optimization leads to significant savings in query processing. By integrating the optimization in state-of-the-art systems for event pattern matching, we demonstrate that these savings materialize in different technical infrastructures and can be combined with existing optimization techniques.
KW  - Event processing
KW  - query optimisation
KW  - query rewriting
Y1  - 2014
U6  - https://doi.org/10.1109/TKDE.2014.2302306
SN  - 1041-4347
SN  - 1558-2191
VL  - 26
IS  - 11
SP  - 2759
EP  - 2773
PB  - Inst. of Electr. and Electronics Engineers
CY  - Los Alamitos
ER  - 
TY  - JOUR
A1  - Hildebrandt, Dieter
A1  - Timm, Robert
T1  - An assisting, constrained 3D navigation technique for multiscale virtual 3D city models
JF  - Geoinformatica : an international journal on advances of computer science for geographic information systems
N2  - Virtual 3D city models serve as integration platforms for complex geospatial and georeferenced information and as medium for effective communication of spatial information. In order to explore these information spaces, navigation techniques for controlling the virtual camera are required to facilitate wayfinding and movement. However, navigation is not a trivial task and many available navigation techniques do not support users effectively and efficiently with their respective skills and tasks. In this article, we present an assisting, constrained navigation technique for multiscale virtual 3D city models that is based on three basic principles: users point to navigate, users are lead by suggestions, and the exploitation of semantic, multiscale, hierarchical structurings of city models. The technique particularly supports users with low navigation and virtual camera control skills but is also valuable for experienced users. It supports exploration, search, inspection, and presentation tasks, is easy to learn and use, supports orientation, is efficient, and yields effective view properties. In particular, the technique is suitable for interactive kiosks and mobile devices with a touch display and low computing resources and for use in mobile situations where users only have restricted resources for operating the application. We demonstrate the validity of the proposed navigation technique by presenting an implementation and evaluation results. The implementation is based on service-oriented architectures, standards, and image-based representations and allows exploring massive virtual 3D city models particularly on mobile devices with limited computing resources. Results of a user study comparing the proposed navigation technique with standard techniques suggest that the proposed technique provides the targeted properties, and that it is more advantageous to novice than to expert users.
KW  - Virtual 3D city model
KW  - Multiscale modeling
KW  - View navigation
KW  - Virtual camera control
KW  - Mobile device
KW  - Distributed 3D geovisualization
Y1  - 2014
U6  - https://doi.org/10.1007/s10707-013-0189-8
SN  - 1384-6175
SN  - 1573-7624
VL  - 18
IS  - 3
SP  - 537
EP  - 567
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - JOUR
A1  - Yang, Haojin
A1  - Quehl, Bernhard
A1  - Sack, Harald
T1  - A framework for improved video text detection and recognition
JF  - Multimedia tools and applications : an international journal
N2  - Text displayed in a video is an essential part for the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate the false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds to make it processible for standard OCR (Optical Character Recognition) software. Operability and accuracy of proposed text detection and binarization methods have been evaluated by using publicly available test data sets.
KW  - Video OCR
KW  - Video indexing
KW  - Multimedia retrieval
Y1  - 2014
U6  - https://doi.org/10.1007/s11042-012-1250-6
SN  - 1380-7501
SN  - 1573-7721
VL  - 69
IS  - 1
SP  - 217
EP  - 245
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - JOUR
A1  - Buchwald, Sebastian
A1  - Wagelaar, Dennis
A1  - Dan, Li
A1  - Hegedues, Abel
A1  - Herrmannsdoerfer, Markus
A1  - Horn, Tassilo
A1  - Kalnina, Elina
A1  - Krause, Christian
A1  - Lano, Kevin
A1  - Lepper, Markus
A1  - Rensink, Arend
A1  - Rose, Louis
A1  - Waetzoldt, Sebastian
A1  - Mazanek, Steffen
T1  - A survey and comparison of transformation tools based on the transformation tool contest
JF  - Science of computer programming
N2  - Model transformation is one of the key tasks in model-driven engineering and relies on the efficient matching and modification of graph-based data structures; its sibling graph rewriting has been used to successfully model problems in a variety of domains. Over the last years, a wide range of graph and model transformation tools have been developed all of them with their own particular strengths and typical application domains. In this paper, we give a survey and a comparison of the model and graph transformation tools that participated at the Transformation Tool Contest 2011. The reader gains an overview of the field and its tools, based on the illustrative solutions submitted to a Hello World task, and a comparison alongside a detailed taxonomy. The article is of interest to researchers in the field of model and graph transformation, as well as to software engineers with a transformation task at hand who have to choose a tool fitting to their needs. All solutions referenced in this article provide a SHARE demo. It supported the peer-review process for the contest, and now allows the reader to test the tools online.
KW  - Graph rewriting
KW  - Model transformation
KW  - Tool survey
KW  - Transformation tool contest
Y1  - 2014
U6  - https://doi.org/10.1016/j.scico.2013.10.009
SN  - 0167-6423
SN  - 1872-7964
VL  - 85
SP  - 41
EP  - 99
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - JOUR
A1  - Hildebrandt, Dieter
T1  - A software reference architecture for service-oriented 3D geovisualization systems
JF  - ISPRS International Journal of Geo-Information
KW  - 3D geovisualization
KW  - software reference architecture
KW  - spatial data infrastructure
KW  - service-oriented architecture
KW  - standardization
KW  - image-based representation
Y1  - 2014
U6  - https://doi.org/10.3390/ijgi3041445
SN  - 2220-9964
VL  - 3
IS  - 4
SP  - 1445
EP  - 1490
PB  - MDPI
CY  - Basel
ER  - 
TY  - JOUR
A1  - Ehrig, Hartmut
A1  - Golas, Ulrike
A1  - Habel, Annegret
A1  - Lambers, Leen
A1  - Orejas, Fernando
T1  - M-adhesive transformation systems with nested application conditions. Part 1: parallelism, concurrency and amalgamation
JF  - Mathematical structures in computer science : a journal in the applications of categorical, algebraic and geometric methods in computer science
N2  - Nested application conditions generalise the well-known negative application conditions and are important for several application domains. In this paper, we present Local Church-Rosser, Parallelism, Concurrency and Amalgamation Theorems for rules with nested application conditions in the framework of M-adhesive categories, where M-adhesive categories are slightly more general than weak adhesive high-level replacement categories. Most of the proofs are based on the corresponding statements for rules without application conditions and two shift lemmas stating that nested application conditions can be shifted over morphisms and rules.
Y1  - 2014
U6  - https://doi.org/10.1017/S0960129512000357
SN  - 0960-1295
SN  - 1469-8072
VL  - 24
IS  - 4
PB  - Cambridge Univ. Press
CY  - New York
ER  - 
TY  - JOUR
A1  - Westphal, Florian
A1  - Axelsson, Stefan
A1  - Neuhaus, Christian
A1  - Polze, Andreas
T1  - VMI-PL: A monitoring language for virtual platforms using virtual machine introspection
JF  - Digital Investigation : the international journal of digital forensics & incident response
N2  - With the growth of virtualization and cloud computing, more and more forensic investigations rely on being able to perform live forensics on a virtual machine using virtual machine introspection (VMI). Inspecting a virtual machine through its hypervisor enables investigation without risking contamination of the evidence, crashing the computer, etc. To further access to these techniques for the investigator/researcher we have developed a new VMI monitoring language. This language is based on a review of the most commonly used VMI-techniques to date, and it enables the user to monitor the virtual machine's memory, events and data streams. A prototype implementation of our monitoring system was implemented in KVM, though implementation on any hypervisor that uses the common x86 virtualization hardware assistance support should be straightforward. Our prototype outperforms the proprietary VMWare VProbes in many cases, with a maximum performance loss of 18% for a realistic test case, which we consider acceptable. Our implementation is freely available under a liberal software distribution license. (C) 2014 Digital Forensics Research Workshop. Published by Elsevier Ltd. All rights reserved.
KW  - Virtualization
KW  - Security
KW  - Monitoring language
KW  - Live forensics
KW  - Introspection
KW  - Classification
Y1  - 2014
U6  - https://doi.org/10.1016/j.diin.2014.05.016
SN  - 1742-2876
SN  - 1873-202X
VL  - 11
SP  - S85
EP  - S94
PB  - Elsevier
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Polyvyanyy, Artem
A1  - Garcia-Banuelos, Luciano
A1  - Fahland, Dirk
A1  - Weske, Mathias
T1  - Maximal structuring of acyclic process models
JF  - The computer journal : a publication of the British Computer Society
N2  - This article addresses the transformation of a process model with an arbitrary topology into an equivalent structured process model. In particular, this article studies the subclass of process models that have no equivalent well-structured representation but which, nevertheless, can be partially structured into their maximally-structured representation. The transformations are performed under a behavioral equivalence notion that preserves the observed concurrency of tasks in equivalent process models. The article gives a full characterization of the subclass of acyclic process models that have no equivalent well-structured representation, but do have an equivalent maximally-structured one, as well as proposes a complete structuring method. Together with our previous results, this article completes the solution of the process model structuring problem for the class of acyclic process models.
KW  - process modeling
KW  - structured process model
KW  - maximal structuring
KW  - model transformation
KW  - fully concurrent bisimulation
Y1  - 2014
U6  - https://doi.org/10.1093/comjnl/bxs126
SN  - 0010-4620
SN  - 1460-2067
VL  - 57
IS  - 1
SP  - 12
EP  - 35
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Vogel, Thomas
A1  - Giese, Holger
T1  - Model-Driven engineering of self-adaptive software with EUREMA
JF  - ACM transactions on autonomous and adaptive systems
N2  - The development of self-adaptive software requires the engineering of an adaptation engine that controls the underlying adaptable software by feedback loops. The engine often describes the adaptation by runtime models representing the adaptable software and by activities such as analysis and planning that use these models. To systematically address the interplay between runtime models and adaptation activities, runtime megamodels have been proposed. A runtime megamodel is a specific model capturing runtime models and adaptation activities. In this article, we go one step further and present an executable modeling language for ExecUtable RuntimE MegAmodels (EUREMA) that eases the development of adaptation engines by following a model-driven engineering approach. We provide a domain-specific modeling language and a runtime interpreter for adaptation engines, in particular feedback loops. Megamodels are kept alive at runtime and by interpreting them, they are directly executed to run feedback loops. Additionally, they can be dynamically adjusted to adapt feedback loops. Thus, EUREMA supports development by making feedback loops explicit at a higher level of abstraction and it enables solutions where multiple feedback loops interact or operate on top of each other and self-adaptation co-exists with offline adaptation for evolution.
KW  - Design
KW  - Languages Model-driven engineering
KW  - modeling language
KW  - models at runtime
KW  - model interpreter
KW  - self-adaptive software
KW  - feedback loops
KW  - layered architecture
KW  - software evolution
Y1  - 2014
U6  - https://doi.org/10.1145/2555612
SN  - 1556-4665
SN  - 1556-4703
VL  - 8
IS  - 4
PB  - Association for Computing Machinery
CY  - New York
ER  - 
TY  - JOUR
A1  - Pasewaldt, Sebastian
A1  - Semmo, Amir
A1  - Trapp, Matthias
A1  - Döllner, Jürgen
T1  - Multi-perspective 3D panoramas
JF  - International journal of geographical information science
N2  - This article presents multi-perspective 3D panoramas that focus on visualizing 3D geovirtual environments (3D GeoVEs) for navigation and exploration tasks. Their key element, a multi-perspective view (MPV), seamlessly combines what is seen from multiple viewpoints into a single image. This approach facilitates the presentation of information for virtual 3D city and landscape models, particularly by reducing occlusions, increasing screen-space utilization, and providing additional context within a single image. We complement MPVs with cartographic visualization techniques to stylize features according to their semantics and highlight important or prioritized information. When combined, both techniques constitute the core implementation of interactive, multi-perspective 3D panoramas. They offer a large number of effective means for visual communication of 3D spatial information, a high degree of customization with respect to cartographic design, and manifold applications in different domains. We discuss design decisions of 3D panoramas for the exploration of and navigation in 3D GeoVEs. We also discuss a preliminary user study that indicates that 3D panoramas are a promising approach for navigation systems using 3D GeoVEs.
KW  - multi-perspective visualization
KW  - panorama
KW  - focus plus context visualization
KW  - 3D geovirtual environments
KW  - cartographic design
Y1  - 2014
U6  - https://doi.org/10.1080/13658816.2014.922686
SN  - 1365-8816
SN  - 1362-3087
VL  - 28
IS  - 10
SP  - 2030
EP  - 2051
PB  - Routledge, Taylor & Francis Group
CY  - Abingdon
ER  - 
TY  - BOOK
ED  - Meinel, Christoph
ED  - Polze, Andreas
ED  - Oswald, Gerhard
ED  - Strotmann, Rolf
ED  - Seibold, Ulrich
ED  - Schulzki, Bernhard
T1  - HPI Future SOC Lab
BT  - Proceedings 2014
N2  - Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie.
Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien. 
In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2014 vorgestellt.  Ausgewählte Projekte stellten ihre Ergebnisse am 9. April 2014 und 29. Oktober 2014 im Rahmen der Future SOC Lab Tag Veranstaltungen vor.
N2  - The “HPI Future SOC Lab” is a cooperation of the Hasso-Plattner-Institut (HPI) and industrial partners. Its mission is to enable and promote exchange and interaction between the research community and the industrial partners.
The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard- and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies.
This technical report presents results of research projects executed in 2014. Selected projects have presented their results on April 9th and September 29th 2014 at the Future SOC Lab Day events.
KW  - Future SOC Lab
KW  - Forschungsprojekte
KW  - Multicore Architekturen
KW  - In-Memory Technologie
KW  - Cloud Computing
KW  - Future SOC Lab
KW  - research projects
KW  - multicore architectures
KW  - In-Memory technology
KW  - cloud computing
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-86271
ER  - 
TY  - BOOK
ED  - Meinel, Christoph
ED  - Polze, Andreas
ED  - Oswald, Gerhard
ED  - Strotmann, Rolf
ED  - Seibold, Ulrich
ED  - Schulzki, Bernard
T1  - HPI Future SOC Lab
BT  - Proceedings 2013
N2  - The “HPI Future SOC Lab” is a cooperation of the Hasso-Plattner-Institut (HPI) and industrial partners. Its mission is to enable and promote exchange and interaction between the research community and the industrial partners. The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard- and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies. This technical report presents results of research projects executed in 2013. Selected projects have presented their results on April 10th and September 24th 2013 at the Future SOC Lab Day events.
N2  - Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie. Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien. In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2013 vorgestellt. Ausgewählte Projekte stellten ihre Ergebnisse am 10. April 2013 und 24. September 2013 im Rahmen der Future SOC Lab Tag Veranstaltungen vor.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 88 
KW  - future SOC lab
KW  - Forschungsprojekte
KW  - Multicore Architekturen
KW  - In-Memory Technologie
KW  - Cloud Computing
KW  - Future SOC Lab
KW  - research projects
KW  - multicore architectures
KW  - in-memory technology
KW  - cloud computing
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-68195
SN  - 978-3-86956-282-7
SN  - 1613-5652
SN  - 2191-1665
IS  - 88
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - BOOK
A1  - Meinel, Christoph
A1  - Schnjakin, Maxim
A1  - Metzke, Tobias
A1  - Freitag, Markus
T1  - Anbieter von Cloud Speicherdiensten im Überblick
N2  - Durch die immer stärker werdende Flut an digitalen Informationen basieren immer mehr Anwendungen auf der Nutzung von kostengünstigen Cloud Storage Diensten. Die Anzahl der Anbieter, die diese Dienste zur Verfügung stellen, hat sich in den letzten Jahren deutlich erhöht. Um den passenden Anbieter für eine Anwendung zu finden, müssen verschiedene Kriterien individuell berücksichtigt werden. In der vorliegenden Studie wird eine Auswahl an Anbietern etablierter Basic Storage Diensten vorgestellt und miteinander verglichen. Für die Gegenüberstellung werden Kriterien extrahiert, welche bei jedem der untersuchten Anbieter anwendbar sind und somit eine möglichst objektive Beurteilung erlauben. Hierzu gehören unter anderem Kosten, Recht, Sicherheit, Leistungsfähigkeit sowie bereitgestellte Schnittstellen. Die vorgestellten Kriterien können genutzt werden, um Cloud Storage Anbieter bezüglich eines konkreten Anwendungsfalles zu bewerten.
N2  - Due to the ever-increasing flood of digital information, more and more applications make use of cost-effective cloud storage services. The number of vendors that provide these services has increased significantly in recent years. The identification of an appropriate service provider requires an individual consideration of several criteria. This survey presents a comparison of some established basic storage providers. For this comparison, several criteria are extracted that are applicable to any of the selected providers and thus allow for an assessment that is as objective as possible. The criteria include factors like costs, legal information, security, performance, and supported interfaces. The presented criteria can be used to evaluate cloud storage providers in a specific use case in order to identify the most suitable service based on individual requirements.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 84 
KW  - Cloud Computing
KW  - öffentliche Cloud Speicherdienste
KW  - Basic Storage Anbieter
KW  - cloud computing
KW  - public cloud storage services
KW  - basic cloud storage services
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-68780
SN  - 978-3-86956-274-2
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - BOOK
A1  - Meinel, Christoph
A1  - Willems, Christian
T1  - openHPI : 哈索•普拉特纳研究院的 MOOC（大规模公开在线课）计划
T1  - openHPI : the MOOC offer at Hasso Plattner Institute
N2  - 摘要。哈索•普拉特纳研究院 (HPI) 的新型互动在线教育平台 openHPI (https://openHPI.de) 可以为从事信息技术和信息学领域内容的工作和感兴趣的学员提供可自由访问的、免费的在线课程。与斯坦福大学于 2011 年首推，之后也在美国其他精英大学提供的“网络公开群众课”（简称 MOOC）一样，openHPI 同样在互联网中提供学习视频和阅读材料，其中综合了支持学习的自我测试、家庭作业和社交讨论论坛，并刺激对促进学习的虚拟学习团队的培训。与“传统的”讲座平台，比如 tele-TASK 平台 (http://www.tele-task.de) 不同（在该平台中，可调用以多媒体方式记录的和已准备好的讲座），openHPI 提供的是按教学法准备的在线课程。这些课程的开始时间固定，之后在连续六个课程周稳定的提供以多媒体方式准备的、尽可能可以互动的学习材料。每周讲解课程主题的一章。为此在该周开始前会准备一系列学习视频、文字、自我测试和家庭作业材料，课程学员在该周将精力用于处理这些内容。这些计划与一个社交讨论平台相结合，学员在该平台上可以与课程导师和其他学员交换意见、解答问题和讨论更多主题。当然，学员可以自己决定学习活动的类型和范围。他们可以为课程作出自己的贡献，比如在论坛中引用博文或推文。之后其他学员可以评论、讨论或自己扩展这些博文或推文。这样学员、教师和提供的学习内容就在一个虚拟的团体中与社交学习网络相互结合起来。
N2  - Abstract. The new interactive online educational platform openHPI, (https://openHPI.de) from Hasso Plattner Institute (HPI), offers freely accessible courses at no charge for all who are interested in subjects in the field of information technology and computer science. Since 2011, “Massive Open Online Courses,” called MOOCs for short, have been offered, first at Stanford University and then later at other U.S. elite universities. Following suit, openHPI provides instructional videos on the Internet and further reading material, combined with learning-supportive self-tests, homework and a social discussion forum. Education is further stimulated by the support of a virtual learning community. In contrast to “traditional” lecture platforms, such as the tele-TASK portal (http://www.tele-task.de) where multimedia recorded lectures are available on demand, openHPI offers didactic online courses. The courses have a fixed start date and offer a balanced schedule of six consecutive weeks presented in multimedia and, whenever possible, interactive learning material. Each week, one chapter of the course subject is treated. In addition, a series of learning videos, texts, self-tests and homework exercises are provided to course participants at the beginning of the week. The course offering is combined with a social discussion platform where participants have the opportunity to enter into an exchange with course instructors and fellow participants. Here, for example, they can get answers to questions and discuss the topics in depth. The participants naturally decide themselves about the type and range of their learning activities. They can make personal contributions to the course, for example, in blog posts or tweets, which they can refer to in the forum. In turn, other participants have the chance to comment on, discuss or expand on what has been said. In this way, the learners become the teachers and the subject matter offered to a virtual community is linked to a social learning network.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 89 
KW  - Online-Lernen
KW  - E-Learning
KW  - MOOCs
KW  - Onlinekurs
KW  - openHPI
KW  - tele-TASK
KW  - Tele-Lab
KW  - Tele-Teaching
KW  - online-learning
KW  - e-learning
KW  - MOOCs
KW  - online course
KW  - openHPI
KW  - tele-TASK
KW  - tele-lab
KW  - tele-teaching
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-70380
SN  - 978-3-86956-291-9
SN  - 1613-5652
SN  - 2191-1665
IS  - 89
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - BOOK
A1  - Meinel, Christoph
A1  - Plattner, Hasso
A1  - Döllner, Jürgen Roland Friedrich
A1  - Weske, Mathias
A1  - Polze, Andreas
A1  - Hirschfeld, Robert
A1  - Naumann, Felix
A1  - Giese, Holger
A1  - Baudisch, Patrick
T1  - Proceedings of the 7th Ph.D. Retreat of the HPI Research School on Service-oriented Systems Engineering
N2  - Design and Implementation of service-oriented architectures imposes a huge number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for design and realization of complex web-based system. Both approaches allow for dynamic application adaptation as well as integration of enterprise application. Commonly used technologies, such as J2EE and .NET, form de facto standards for the realization of complex distributed systems. Evolution of component systems has lead to web services and service-based architectures. This has been manifested in a multitude of industry standards and initiatives such as XML, WSDL UDDI, SOAP, etc. All these achievements lead to a new and promising paradigm in IT systems engineering which proposes to design complex software solutions as collaboration of contractually defined software services. Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns. The annual Ph.D. Retreat of the Research School provides each member the opportunity to present his/her current state of their research and to give an outline of a prospective Ph.D. thesis. Due to the interdisciplinary structure of the Research Scholl, this technical report covers a wide range of research topics. These include but are not limited to: Self-Adaptive Service-Oriented Systems, Operating System Support for Service-Oriented Systems, Architecture and Modeling of Service-Oriented Systems, Adaptive Process Management, Services Composition and Workflow Planning, Security Engineering of Service-Based IT Systems, Quantitative Analysis and Optimization of Service-Oriented Systems, Service-Oriented Systems in 3D Computer Graphics sowie Service-Oriented Geoinformatics.
N2  - Der Entwurf und die Realisierung dienstbasierender Architekturen wirft eine Vielzahl von Forschungsfragestellungen aus den Gebieten der Softwaretechnik, der Systemmodellierung und -analyse, sowie der Adaptierbarkeit und Integration von Applikationen auf. Komponentenorientierung und WebServices sind zwei Ansätze für den effizienten Entwurf und die Realisierung komplexer Web-basierender Systeme. Sie ermöglichen die Reaktion auf wechselnde Anforderungen ebenso, wie die Integration großer komplexer Softwaresysteme. Heute übliche Technologien, wie J2EE und .NET, sind de facto Standards für die Entwicklung großer verteilter Systeme. Die Evolution solcher Komponentensysteme führt über WebServices zu dienstbasierenden Architekturen. Dies manifestiert sich in einer Vielzahl von Industriestandards und Initiativen wie XML, WSDL, UDDI, SOAP. All diese Schritte führen letztlich zu einem neuen, vielversprechenden Paradigma für IT Systeme, nach dem komplexe Softwarelösungen durch die Integration vertraglich vereinbarter Software-Dienste aufgebaut werden sollen. "Service-Oriented Systems Engineering" repräsentiert die Symbiose bewährter Praktiken aus den Gebieten der Objektorientierung, der Komponentenprogrammierung, des verteilten Rechnen sowie der Geschäftsprozesse und berücksichtigt auch die Integration von Geschäftsanliegen und Informationstechnologien. Die Klausurtagung des Forschungskollegs "Service-oriented Systems Engineering" findet einmal jährlich statt und bietet allen Kollegiaten die Möglichkeit den Stand ihrer aktuellen Forschung darzulegen. Bedingt durch die Querschnittstruktur des Kollegs deckt dieser Bericht ein große Bandbreite aktueller Forschungsthemen ab. Dazu zählen unter anderem Self-Adaptive Service-Oriented Systems, Operating System Support for Service-Oriented Systems, Architecture and Modeling of Service-Oriented Systems, Adaptive Process Management, Services Composition and Workflow Planning, Security Engineering of Service-Based IT Systems, Quantitative Analysis and Optimization of Service-Oriented Systems, Service-Oriented Systems in 3D Computer Graphics sowie Service-Oriented Geoinformatics.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 83 
KW  - Hasso-Plattner-Institut
KW  - Forschungskolleg
KW  - Klausurtagung
KW  - Service-oriented Systems Engineering
KW  - Hasso Plattner Institute
KW  - Research School
KW  - Ph.D. Retreat
KW  - Service-oriented Systems Engineering
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-63490
SN  - 978-3-86956-273-5
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - JOUR
A1  - Takouna, Ibrahim
A1  - Sachs, Kai
A1  - Meinel, Christoph
T1  - Multiperiod robust optimization for proactive resource provisioning in virtualized data centers
JF  - The journal of supercomputing : an internat. journal of supercomputer design, analysis and use
KW  - Energy-aware
KW  - Virtualization
KW  - Resource management
KW  - Robust optimization
KW  - Prediction
Y1  - 2014
U6  - https://doi.org/10.1007/s11227-014-1246-2
SN  - 0920-8542
SN  - 1573-0484
VL  - 70
IS  - 3
SP  - 1514
EP  - 1536
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - THES
A1  - Abedjan, Ziawasch
T1  - Improving RDF data with data mining
T1  - Verbessern von RDF Daten durch Data-Mining
N2  - Linked Open Data (LOD) comprises very many and often large public data sets and knowledge bases. Those datasets are mostly presented in the RDF triple structure of subject, predicate, and object, where each triple represents a statement or fact. Unfortunately, the heterogeneity of available open data requires significant integration steps before it can be used in applications. Meta information, such as ontological definitions and exact range definitions of predicates, are desirable and ideally provided by an ontology. However in the context of LOD, ontologies are often incomplete or simply not available. Thus, it is useful to automatically generate meta information, such as ontological dependencies, range definitions, and topical classifications. Association rule mining, which was originally applied for sales analysis on transactional databases, is a promising and novel technique to explore such data. We designed an adaptation of this technique for min-ing Rdf data and introduce the concept of “mining configurations”, which allows us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that in combination result in interesting use cases. To this end, we present rule-based approaches for auto-completion, data enrichment, ontology improvement, and query relaxation. Auto-completion remedies the problem of inconsistent ontology usage, providing an editing user with a sorted list of commonly used predicates. A combination of different configurations step extends this approach to create completely new facts for a knowledge base. We present two approaches for fact generation, a user-based approach where a user selects the entity to be amended with new facts and a data-driven approach where an algorithm discovers entities that have to be amended with missing facts. As knowledge bases constantly grow and evolve, another approach to improve the usage of RDF data is to improve existing ontologies. Here, we present an association rule based approach to reconcile ontology and data. Interlacing different mining configurations, we infer an algorithm to discover synonymously used predicates. Those predicates can be used to expand query results and to support users during query formulation. We provide a wide range of experiments on real world datasets for each use case. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the appropriateness of our mining configuration methodology.
N2  - Linked Open Data (LOD) umfasst viele und oft sehr große öffentlichen Datensätze und Wissensbanken, die hauptsächlich in der RDF Triplestruktur bestehend aus Subjekt, Prädikat und Objekt vorkommen. Dabei repräsentiert jedes Triple einen Fakt. Unglücklicherweise erfordert die Heterogenität der verfügbaren öffentlichen Daten signifikante Integrationsschritte bevor die Daten in Anwendungen genutzt werden können. Meta-Daten wie ontologische Strukturen und Bereichsdefinitionen von Prädikaten sind zwar wünschenswert und idealerweise durch eine Wissensbank verfügbar. Jedoch sind Wissensbanken im Kontext von LOD oft unvollständig oder einfach nicht verfügbar. Deshalb ist es nützlich automatisch Meta-Informationen, wie ontologische Abhängigkeiten, Bereichs-und Domänendefinitionen und thematische Assoziationen von Ressourcen generieren zu können. Eine neue und vielversprechende Technik um solche Daten zu untersuchen basiert auf das entdecken von Assoziationsregeln, welche ursprünglich für Verkaufsanalysen in transaktionalen Datenbanken angewendet wurde. Wir haben eine Adaptierung dieser Technik auf RDF Daten entworfen und stellen das Konzept der Mining Konfigurationen vor, welches uns befähigt in RDF Daten auf unterschiedlichen Weisen Muster zu erkennen. Verschiedene Konfigurationen erlauben uns Schema- und Wertbeziehungen zu erkennen, die für interessante Anwendungen genutzt werden können. In dem Sinne, stellen wir assoziationsbasierte Verfahren für eine Prädikatvorschlagsverfahren, Datenvervollständigung, Ontologieverbesserung und Anfrageerleichterung vor. Das Vorschlagen von Prädikaten behandelt das Problem der inkonsistenten Verwendung von Ontologien, indem einem Benutzer, der einen neuen Fakt einem Rdf-Datensatz hinzufügen will, eine sortierte Liste von passenden Prädikaten vorgeschlagen wird. Eine Kombinierung von verschiedenen Konfigurationen erweitert dieses Verfahren sodass automatisch komplett neue Fakten für eine Wissensbank generiert werden. Hierbei stellen wir zwei Verfahren vor, einen nutzergesteuertenVerfahren, bei dem ein Nutzer die Entität aussucht die erweitert werden soll und einen datengesteuerten Ansatz, bei dem ein Algorithmus selbst die Entitäten aussucht, die mit fehlenden Fakten erweitert werden. Da Wissensbanken stetig wachsen und sich verändern, ist ein anderer Ansatz um die Verwendung von RDF Daten zu erleichtern die Verbesserung von Ontologien. Hierbei präsentieren wir ein Assoziationsregeln-basiertes Verfahren, der Daten und zugrundeliegende Ontologien zusammenführt. Durch die Verflechtung von unterschiedlichen Konfigurationen leiten wir einen neuen Algorithmus her, der gleichbedeutende Prädikate entdeckt. Diese Prädikate können benutzt werden um Ergebnisse einer Anfrage zu erweitern oder einen Nutzer während einer Anfrage zu unterstützen. Für jeden unserer vorgestellten Anwendungen präsentieren wir eine große Auswahl an Experimenten auf Realweltdatensätzen. Die Experimente und Evaluierungen zeigen den Mehrwert von Assoziationsregeln-Generierung für die Integration und Nutzbarkeit von RDF Daten und bestätigen die Angemessenheit unserer konfigurationsbasierten Methodologie um solche Regeln herzuleiten.
KW  - Assoziationsregeln
KW  - RDF
KW  - LOD
KW  - Mustererkennung
KW  - Synonyme
KW  - association rule mining
KW  - RDF
KW  - LOD
KW  - knowledge discovery
KW  - synonym discovery
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-71334
ER  - 
TY  - BOOK
A1  - Meyer, Andreas
A1  - Weske, Mathias
T1  - Weak conformance between process models and synchronized object life cycles
N2  - Process models specify behavioral execution constraints between activities as well as between activities and data objects. A data object is characterized by its states and state transitions represented as object life cycle. For process execution, all behavioral execution constraints must be correct. Correctness can be verified via soundness checking which currently only considers control flow information. For data correctness, conformance between a process model and its object life cycles is checked. Current approaches abstract from dependencies between multiple data objects and require fully specified process models although, in real-world process repositories, often underspecified models are found. Coping with these issues, we introduce the concept of synchronized object life cycles and we define a mapping of data constraints of a process model to Petri nets extending an existing mapping. Further, we apply the notion of weak conformance to process models to tell whether each time an activity needs to access a data object in a particular state, it is guaranteed that the data object is in or can reach the expected state. Then, we introduce an algorithm for an integrated verification of control flow correctness and weak data conformance using soundness checking.
N2  - Prozessmodelle spezifizieren die Verhaltensabhängigkeiten bezüglich der Ausführung sowohl zwischen Aktivitäten als auch zwischen Aktivitäten und Datenobjekten. Ein Datenobjekt wird über seine Zustände und Zustandsübergänge charakterisiert, welche in einem Objektlebenszyklus abgebildet werden. Für eine fehlerfreie Prozessausführung müssen alle Verhaltensabhängigkeiten korrekt modelliert werden. Eine Standardtechnik zur Korrektheitsüberprüfung ist das Überprüfen auf Soundness. Aktuelle Ansätze berücksichtigen allerdings nur den Kontrollfluss. Datenkorrektheit wird dagegen mittels Conformance zwischen einem Prozessmodel und den verwendeten Objektlebenszyklen überprüft, indem die Existenz eines Zustandsüberganges im Prozessmodell auch im Objektlebenszyklus möglich sein muss. Allerdings abstrahieren aktuelle Ansätze von Abhängigkeiten zwischen mehreren Datenobjekten und erfordern eine vollständige Prozessmodellspezifikation, d.h. das Überspringen oder Zusammenfassen von Zuständen beziehungsweise das Auslagern von Zustandsüberhängen in andere Prozessmodelle ist zum Beispiel nicht vorgesehen. In Prozessmodellsammlungen aus der Praxis sind allerdings oft solche unterspezifizierten Prozessmodelle vorhanden. In diesem Report adressieren wir diese Problemstellungen. Dazu führen wir das Konzept der synchronisierten Objektlebenszyklen ein, erweitern ein Mapping von Prozessmodellen zu Petri Netzen um Datenabhängigkeiten und wenden das Konzept der Weak Conformance auf Prozessmodelle an, um zu entscheiden ob immer wenn eine Aktivität auf ein Datenobjekt zugreift dieses auch im richtigen Zustand vorliegt. Dazu kann das Datenobjekt bereits in diesem Zustand sein oder aber diesen über eine beliebige Anzahl von Zustandsübergängen erreichen. Basierend auf diesen Konzepten führen wir auch einen Algorithmus ein, welcher ein integriertes Überprüfen von Kontrollfluss- und Datenflusskorrektheit unter Nutzung von Soundness-Überprüfungen ermöglicht.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 91 
KW  - business process management
KW  - data flow correctness
KW  - object life cycle synchronization
KW  - Petri net mapping
KW  - conformance checking
KW  - Geschäftsprozessmanagement
KW  - Datenflusskorrektheit
KW  - Objektlebenszyklus-Synchronisation
KW  - Petri net Mapping
KW  - Conformance Überprüfung
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-71722
SN  - 978-3-86956-303-9
SN  - 1613-5652
SN  - 2191-1665
IS  - 91
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - BOOK
ED  - Neuhaus, Christian
ED  - Polze, Andreas
T1  - Cloud security mechanisms
N2  - Cloud computing has brought great benefits in cost and flexibility for provisioning services. The greatest challenge of cloud computing remains however the question of security. The current standard tools in access control mechanisms and cryptography can only partly solve the security challenges of cloud infrastructures. In the recent years of research in security and cryptography, novel mechanisms, protocols and algorithms have emerged that offer new ways to create secure services atop cloud infrastructures. This report provides introductions to a selection of security mechanisms that were part of the "Cloud Security Mechanisms" seminar in summer term 2013 at HPI.
N2  - Cloud Computing hat deutliche Kostenersparnisse und verbesserte Flexibilität bei der Bereitstellung von Computer-Diensten ermöglicht. Allerdings bleiben Sicherheitsbedenken die größte Herausforderung bei der Nutzung von Cloud-Diensten. Die etablierten Mechanismen für Zugriffskontrolle und Verschlüsselungstechnik können die Herausforderungen und Probleme der Sicherheit von Cloud-Infrastrukturen nur teilweise lösen. In den letzten Jahren hat die Forschung jedoch neue Mechanismen, Protokolle und Algorithmen hervorgebracht, welche neue Möglichkeiten eröffnen die Sicherheit von Cloud-Anwendungen zu erhöhen. Dieser technische Bericht bietet Einführungen zu einigen dieser Mechanismen, welche im Seminar "Cloud Security Mechanisms" im Sommersemester 2013 am HPI behandelt wurden.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 87 
KW  - Cloud
KW  - Sicherheit
KW  - Privacy
KW  - Datenvertraulichkeit
KW  - Threshold Cryptography
KW  - Bitcoin
KW  - Homomorphe Verschlüsselung
KW  - Differential Privacy
KW  - cloud
KW  - security
KW  - privacy
KW  - confidentiality
KW  - threshold cryptography
KW  - bitcoin
KW  - homomorphic encryption
KW  - differential privacy
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-68168
SN  - 978-3-86956-281-0
SN  - 1613-5652
SN  - 2191-1665
IS  - 87
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - INPR
A1  - Grapentin, Andreas
A1  - Heidler, Kirstin
A1  - Korsch, Dimitri
A1  - Kumar Sah, Rakesh
A1  - Kunzmann, Nicco
A1  - Henning, Johannes
A1  - Mattis, Toni
A1  - Rein, Patrick
A1  - Seckler, Eric
A1  - Groneberg, Björn
A1  - Zimmermann, Florian
ED  - Hentschel, Uwe
ED  - Richter, Daniel
ED  - Polze, Andreas
T1  - Embedded operating system projects
N2  - In today’s life, embedded systems are ubiquitous. But they differ from traditional desktop systems in many aspects – these include predictable timing behavior (real-time), the management of scarce resources (memory, network), reliable communication protocols, energy management, special purpose user-interfaces (headless operation), system configuration, programming languages (to support software/hardware co-design), and modeling techniques. Within this technical report, authors present results from the lecture “Operating Systems for Embedded Computing” that has been offered by the “Operating Systems and Middleware” group at HPI in Winter term 2013/14. Focus of the lecture and accompanying projects was on principles of real-time computing. Students had the chance to gather practical experience with a number of different OSes and applications and present experiences with near-hardware programming. Projects address the entire spectrum, from bare-metal programming to harnessing a real-time OS to exercising the full software/hardware co-design cycle. Three outstanding projects are at the heart of this technical report. Project 1 focuses on the development of a bare-metal operating system for LEGO Mindstorms EV3. While still a toy, it comes with a powerful ARM processor, 64 MB of main memory, standard interfaces, such as Bluetooth and network protocol stacks. EV3 runs a version of 1 1 Introduction Linux. Sources are available from Lego’s web site. However, many devices and their driver software are proprietary and not well documented. Developing a new, bare-metal OS for the EV3 requires an understanding of the EV3 boot process. Since no standard input/output devices are available, initial debugging steps are tedious. After managing these initial steps, the project was able to adapt device drivers for a few Lego devices to an extent that a demonstrator (the Segway application) could be successfully run on the new OS. Project 2 looks at the EV3 from a different angle. The EV3 is running a pretty decent version of Linux- in principle, the RT_PREEMPT patch can turn any Linux system into a real-time OS by modifying the behavior of a number of synchronization constructs at the heart of the OS. Priority inversion is a problem that is solved by protocols such as priority inheritance or priority ceiling. Real-time OSes implement at least one of the protocols. The central idea of the project was the comparison of non-real-time and real-time variants of Linux on the EV3 hardware. A task set that showed effects of priority inversion on standard EV3 Linux would operate flawlessly on the Linux version with the RT_PREEMPT-patch applied. If only patching Lego’s version of Linux was that easy... Project 3 takes the notion of real-time computing more seriously. The application scenario was centered around our Carrera Digital 132 racetrack. Obtaining position information from the track, controlling individual cars, detecting and modifying the Carrera Digital protocol required design and implementation of custom controller hardware. What to implement in hardware, firmware, and what to implement in application software – this was the central question addressed by the project.
N2  - Heutzutage sind eingebettete Systeme allgegenwärtig. Allerdings unterscheiden sie sich in vielen Aspekten von traditionellen Desktop-System – dazu gehören vorhersagbares Zeitverhalten („Echtzeit“), die Verwaltung von knappen Ressourcen (Speicher, Netzwerk), zuverlässige Kommunikationsprotokolle, Energiemanagement, spezialisierte Benutzungsschnittstellen („headless“), Systemkonfiguration, Programmiersprachen (zur Unterstützung von Software-Hardware-Co-Design) und Modellierungstechniken. In diesem technischen Bericht präsentieren die Autoren Ergebnisse aus der Vorlesung „Betriebssysteme für Embedded Computing“, die von der Fachgruppe „Betriebssysteme und Middleware“ am HPI in Wintersemester 2013/14 angeboten wurde. Schwerpunkte der Vorlesung und der begleitenden Projekte waren Prinzipien von Echtzeit-Computing. Die Studenten hatten die Möglichkeit, praktische Erfahrungen mit einer Reihe von verschiedenen Betriebssystemen und Anwendungen zu sammeln und präsentieren ihre Erfahrungen mit hardwarenaher Programmierung. Die Projekte adressieren das gesamte Spektrum von der Bare-Metal-Programmierung über die Nutzung eines Echtzeitbetriebssystem bis zur Anwendung des vollen Software-Hardware-Co-Design-Zyklus‘. Drei herausragende Projekte sind das Herzstück dieses technischen Berichts. Projekt 1 konzentriert sich auf die Entwicklung eines Bare-Metal-Betriebssystems für LEGO Mindstorms EV3. Obwohl es ein Spielzeug ist, kommt es mit einem leistungsstarken ARM-Prozessor, 64 MB Hauptspeicher und Standardschnittstellen wie Bluetooth und einem Netzwerkprotokollstapel. Auf dem EV3 läuft spezielle Linux-Version – die Quellen sind auf der Lego-Website verfügbar. Allerdings sind viele Geräte und deren Treiber-Software urheberrechtlich geschützt und nicht gut dokumentiert. Die Entwicklung eines neuen Bare-Metal-Betriebssystem für den EV3 erfordert ein Verständnis des EV3-Bootvorgangs. Da keine Standard-Ein-/Ausgabegeräte zur Verfügung stehen, sind anfängliche Debug-Schritte mühsam. Nach dem Absolvieren dieser ersten Schritte war das Projekt in der Lage, Gerätetreiber für einige Lego-Geräte anzupassen um einen Demonstrator (die Segway-Anwendung) erfolgreich auf dem neuen Betriebssystem laufen zu lassen. Projekt 2 befasst sich mit dem EV3 aus einer anderen Perspektive. Der EV3 wird mit einer üblichen EV3 Linux-Version betrieben – im Prinzip kann der RT_PREEMPT-Patch jedes Linux-System in ein Echtzeitbetriebssystem verwandeln, indem er das Verhalten einer Anzahl von Synchronisationskonstrukten im Herzen des Betriebssystems anpasst. Priority Inversion ist ein Problem, das durch Protokolle wie Prioritätsvererbung oder Priority Ceiling gelöst wird. Heutige Echtzeit-Betriebssysteme implementieren mindestens eines dieser Protokolle. Die zentrale Idee des Projekts war der Vergleich der Nicht-Echtzeit und Echtzeit-Varianten von Linux auf der EV3-Hardware. Ein Task-Set, das die Auswirkungen der Prioritätsumkehr auf Standard-EV3 Linux zeigt, würde ohne Probleme auf der Linux-Version mit dem RT_PREEMPT-Patch betrieben werden können. Wenn nur das Patchen Lego-Version von Linux war so einfach wäre... Projekt 3 nimmt den Begriff des Echtzeit-Computing ernst. Das Anwendungsszenario wurde um unsere Carrera Digital 132 Bahn angeordnet. Das Sammeln von Positionsinformationen, die Steuerung einzelner Fahrzeuge, die Erfassung und Änderung des Carrera Digital-Protokolls erfordert die Konzeption und Umsetzung von spezialisierter Controller-Hardware. Die zentrale Fragestellung dieses Projekts war, was in Hardware, in Firmware oder in der Anwendungssoftware zu implementieren ist.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 90 
KW  - Echtzeit
KW  - eingebettete Systeme
KW  - Betriebssysteme
KW  - Erfahrungsbericht
KW  - LEGO Mindstorms EV3
KW  - RT_PREEMT-Patch
KW  - Carrera Digital D132
KW  - real-time
KW  - embedded systems
KW  - operating systems
KW  - experience report
KW  - LEGO Mindstorms EV3
KW  - RT_PREEMT patch
KW  - Carrera Digital D132
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-69154
SN  - 978-3-86956-296-4
SN  - 1613-5652
SN  - 2191-1665
IS  - 90
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - THES
A1  - Takouna, Ibrahim
T1  - Energy-efficient and performance-aware virtual machine management for cloud data centers
T1  - Energieeffizientes und performancebewusstes Management virtueller Maschinen für Cloud Datenzentren
N2  - Virtualisierte Cloud Datenzentren stellen nach Bedarf Ressourcen zur Verfügu-ng, ermöglichen agile Ressourcenbereitstellung und beherbergen heterogene Applikationen mit verschiedenen Anforderungen an Ressourcen. Solche Datenzentren verbrauchen enorme Mengen an Energie, was die Erhöhung der Betriebskosten, der Wärme innerhalb der Zentren und des Kohlendioxidausstoßes verursacht. Der Anstieg des Energieverbrauches kann durch ein ineffektives Ressourcenmanagement, das die ineffiziente Ressourcenausnutzung verursacht, entstehen. Die vorliegende Dissertation stellt detaillierte Modelle und neue Verfahren für virtualisiertes Ressourcenmanagement in Cloud Datenzentren vor. Die vorgestellten Verfahren ziehen das Service-Level-Agreement (SLA) und die Heterogenität der Auslastung bezüglich des Bedarfs an Speicherzugriffen und Kommunikationsmustern von Web- und HPC- (High Performance Computing) Applikationen in Betracht. Um die präsentierten Techniken zu evaluieren, verwenden wir Simulationen und echte Protokollierung der Auslastungen von Web- und HPC- Applikationen. Außerdem vergleichen wir unser Techniken und Verfahren mit anderen aktuellen Verfahren durch die Anwendung von verschiedenen Performance Metriken. Die Hauptbeiträge dieser Dissertation sind Folgendes: Ein Proaktives auf robuster Optimierung basierendes Ressourcenbereitstellungsverfahren. Dieses Verfahren erhöht die Fähigkeit der Hostes zur Verfüg-ungsstellung von mehr VMs. Gleichzeitig aber wird der unnötige Energieverbrauch minimiert. Zusätzlich mindert diese Technik unerwünschte Ände-rungen im Energiezustand des Servers. Die vorgestellte Technik nutzt einen auf Intervall basierenden Vorhersagealgorithmus zur Implementierung einer robusten Optimierung. Dabei werden unsichere Anforderungen in Betracht gezogen. Ein adaptives und auf Intervall basierendes Verfahren zur Vorhersage des Arbeitsaufkommens mit hohen, in kürzer Zeit auftretenden Schwankungen. Die Intervall basierende Vorhersage ist implementiert in der Standard Abweichung Variante und in der Median absoluter Abweichung Variante. Die Intervall-Änderungen basieren auf einem adaptiven Vertrauensfenster um die Schwankungen des Arbeitsaufkommens zu bewältigen. Eine robuste VM Zusammenlegung für ein effizientes Energie und Performance Management. Dies ermöglicht die gegenseitige Abhängigkeit zwischen der Energie und der Performance zu minimieren. Unser Verfahren reduziert die Anzahl der VM-Migrationen im Vergleich mit den neu vor kurzem vorgestellten Verfahren. Dies trägt auch zur Reduzierung des durch das Netzwerk verursachten Energieverbrauches. Außerdem reduziert dieses Verfahren SLA-Verletzungen und die Anzahl von Änderungen an Energiezus-tänden. Ein generisches Modell für das Netzwerk eines Datenzentrums um die verzö-gerte Kommunikation und ihre Auswirkung auf die VM Performance und auf die Netzwerkenergie zu simulieren. Außerdem wird ein generisches Modell für ein Memory-Bus des Servers vorgestellt. Dieses Modell beinhaltet auch Modelle für die Latenzzeit und den Energieverbrauch für verschiedene Memory Frequenzen. Dies erlaubt eine Simulation der Memory Verzögerung und ihre Auswirkung auf die VM-Performance und auf den Memory Energieverbrauch. Kommunikation bewusste und Energie effiziente Zusammenlegung für parallele Applikationen um die dynamische Entdeckung von Kommunikationsmustern und das Umplanen von VMs zu ermöglichen. Das Umplanen von VMs benutzt eine auf den entdeckten Kommunikationsmustern basierende Migration. Eine neue Technik zur Entdeckung von dynamischen Mustern ist implementiert. Sie basiert auf der Signal Verarbeitung des Netzwerks von VMs, anstatt die Informationen des virtuellen Umstellung der Hosts oder der Initiierung der VMs zu nutzen. Das Ergebnis zeigt, dass unsere Methode die durchschnittliche Anwendung des Netzwerks reduziert und aufgrund der Reduzierung der aktiven Umstellungen Energie gespart. Außerdem bietet sie eine bessere VM Performance im Vergleich zu der CPU-basierten Platzierung. Memory bewusste VM Zusammenlegung für unabhängige VMs. Sie nutzt die Vielfalt des VMs Memory Zuganges um die Anwendung vom Memory-Bus der Hosts zu balancieren. Die vorgestellte Technik, Memory-Bus Load Balancing (MLB), verteilt die VMs reaktiv neu im Bezug auf ihre Anwendung vom Memory-Bus. Sie nutzt die VM Migration um die Performance des gesamtem Systems zu verbessern. Außerdem sind die dynamische Spannung, die Frequenz Skalierung des Memory und die MLB Methode kombiniert um ein besseres Energiesparen zu leisten.
N2  - Virtualized cloud data centers provide on-demand resources, enable agile resource provisioning, and host heterogeneous applications with different resource requirements. These data centers consume enormous amounts of energy, increasing operational expenses, inducing high thermal inside data centers, and raising carbon dioxide emissions. The increase in energy consumption can result from ineffective resource management that causes inefficient resource utilization. This dissertation presents detailed models and novel techniques and algorithms for virtual resource management in cloud data centers. The proposed techniques take into account Service Level Agreements (SLAs) and workload heterogeneity in terms of memory access demand and communication patterns of web applications and High Performance Computing (HPC) applications. To evaluate our proposed techniques, we use simulation and real workload traces of web applications and HPC applications and compare our techniques against the other recently proposed techniques using several performance metrics. The major contributions of this dissertation are the following: proactive resource provisioning technique based on robust optimization to increase the hosts' availability for hosting new VMs while minimizing the idle energy consumption. Additionally, this technique mitigates undesirable changes in the power state of the hosts by which the hosts' reliability can be enhanced in avoiding failure during a power state change. The proposed technique exploits the range-based prediction algorithm for implementing robust optimization, taking into consideration the uncertainty of demand. An adaptive range-based prediction for predicting workload with high fluctuations in the short-term. The range prediction is implemented in two ways: standard deviation and median absolute deviation. The range is changed based on an adaptive confidence window to cope with the workload fluctuations. A robust VM consolidation for efficient energy and performance management to achieve equilibrium between energy and performance trade-offs. Our technique reduces the number of VM migrations compared to recently proposed techniques. This also contributes to a reduction in energy consumption by the network infrastructure. Additionally, our technique reduces SLA violations and the number of power state changes. A generic model for the network of a data center to simulate the communication delay and its impact on VM performance, as well as network energy consumption. In addition, a generic model for a memory-bus of a server, including latency and energy consumption models for different memory frequencies. This allows simulating the memory delay and its influence on VM performance, as well as memory energy consumption. Communication-aware and energy-efficient consolidation for parallel applications to enable the dynamic discovery of communication patterns and reschedule VMs using migration based on the determined communication patterns. A novel dynamic pattern discovery technique is implemented, based on signal processing of network utilization of VMs instead of using the information from the hosts' virtual switches or initiation from VMs. The result shows that our proposed approach reduces the network's average utilization, achieves energy savings due to reducing the number of active switches, and provides better VM performance compared to CPU-based placement. Memory-aware VM consolidation for independent VMs, which exploits the diversity of VMs' memory access to balance memory-bus utilization of hosts. The proposed technique, Memory-bus Load Balancing (MLB), reactively redistributes VMs according to their utilization of a memory-bus using VM migration to improve the performance of the overall system. Furthermore, Dynamic Voltage and Frequency Scaling (DVFS) of the memory and the proposed MLB technique are combined to achieve better energy savings.
KW  - Energieeffizienz
KW  - Cloud Datenzentren
KW  - Ressourcenmanagement
KW  - dynamische Umsortierung
KW  - energy efficiency
KW  - cloud datacenter
KW  - resource management
KW  - dynamic consolidation
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus-72399
ER  -