004 Datenverarbeitung; Informatik
Refine
Year of publication
Document Type
- Article (320)
- Monograph/Edited Volume (165)
- Doctoral Thesis (155)
- Conference Proceeding (50)
- Postprint (47)
- Master's Thesis (10)
- Other (7)
- Preprint (3)
- Part of a Book (2)
- Bachelor Thesis (1)
Language
- English (568)
- German (192)
- Multiple languages (2)
Keywords
- Informatik (21)
- Didaktik (15)
- machine learning (15)
- Hochschuldidaktik (14)
- Ausbildung (13)
- answer set programming (13)
- Cloud Computing (11)
- cloud computing (11)
- Hasso-Plattner-Institut (10)
- Hasso Plattner Institute (9)
Institute
- Institut für Informatik und Computational Science (269)
- Hasso-Plattner-Institut für Digital Engineering gGmbH (213)
- Hasso-Plattner-Institut für Digital Engineering GmbH (119)
- Extern (65)
- Mathematisch-Naturwissenschaftliche Fakultät (24)
- Fachgruppe Betriebswirtschaftslehre (23)
- Wirtschaftswissenschaften (17)
- Institut für Mathematik (16)
- Bürgerliches Recht (12)
- Digital Engineering Fakultät (8)
Constraints allow developers to specify desired properties of systems in a number of domains, and have those properties be maintained automatically. This results in compact, declarative code, avoiding scattered code to check and imperatively re-satisfy invariants. Despite these advantages, constraint programming is not yet widespread, with standard imperative programming still the norm. There is a long history of research on integrating constraint programming with the imperative paradigm. However, this integration typically does not unify the constructs for encapsulation and abstraction from both paradigms. This impedes re-use of modules, as client code written in one paradigm can only use modules written to support that paradigm. Modules require redundant definitions if they are to be used in both paradigms. We present a language – Babelsberg – that unifies the constructs for en- capsulation and abstraction by using only object-oriented method definitions for both declarative and imperative code. Our prototype – Babelsberg/R – is an extension to Ruby, and continues to support Ruby’s object-oriented se- mantics. It allows programmers to add constraints to existing Ruby programs in incremental steps by placing them on the results of normal object-oriented message sends. It is implemented by modifying a state-of-the-art Ruby virtual machine. The performance of standard object-oriented code without con- straints is only modestly impacted, with typically less than 10% overhead compared with the unmodified virtual machine. Furthermore, our architec- ture for adding multiple constraint solvers allows Babelsberg to deal with constraints in a variety of domains. We argue that our approach provides a useful step toward making con- straint solving a generic tool for object-oriented programmers. We also provide example applications, written in our Ruby-based implementation, which use constraints in a variety of application domains, including interactive graphics, circuit simulations, data streaming with both hard and soft constraints on performance, and configuration file Management.
Die Automatisierung von Geschäftsprozessen unterstützt Unternehmen, die Ausführung ihrer Prozesse effizienter zu gestalten. In existierenden Business Process Management Systemen, werden die Instanzen eines Prozesses völlig unabhängig voneinander ausgeführt. Jedoch kann das Synchronisieren von Instanzen mit ähnlichen Charakteristiken wie z.B. den gleichen Daten zu reduzierten Ausführungskosten führen. Zum Beispiel, wenn ein Onlinehändler zwei Bestellungen vom selben Kunden mit der gleichen Lieferanschrift erhält, können diese zusammen verpackt und versendet werden, um Versandkosten zu sparen. In diesem Papier verwenden wir Konzepte aus dem Datenbankbereich und führen Datensichten für Geschäftsprozesse ein, um Instanzen zu identifizieren, welche synchronisiert werden können. Auf Grundlage der Datensichten führen wir das Konzept der Batch-Regionen ein. Eine Batch-Region ermöglicht eine kontext-bewusste Instanzen-Synchronisierung über mehrere verbundene Aktivitäten. Das eingeführte Konzept wird mit einer Fallstudie evaluiert, bei der ein Kostenvergleich zwischen der normalen Prozessausführung und der Batchverarbeitung durchgeführt wird.
Requirements engineers have to elicit, document, and validate how stakeholders act and interact to achieve their common goals in collaborative scenarios. Only after gathering all information concerning who interacts with whom to do what and why, can a software system be designed and realized which supports the stakeholders to do their work. To capture and structure requirements of different (groups of) stakeholders, scenario-based approaches have been widely used and investigated. Still, the elicitation and validation of requirements covering collaborative scenarios remains complicated, since the required information is highly intertwined, fragmented, and distributed over several stakeholders. Hence, it can only be elicited and validated collaboratively. In times of globally distributed companies, scheduling and conducting workshops with groups of stakeholders is usually not feasible due to budget and time constraints. Talking to individual stakeholders, on the other hand, is feasible but leads to fragmented and incomplete stakeholder scenarios. Going back and forth between different individual stakeholders to resolve this fragmentation and explore uncovered alternatives is an error-prone, time-consuming, and expensive task for the requirements engineers. While formal modeling methods can be employed to automatically check and ensure consistency of stakeholder scenarios, such methods introduce additional overhead since their formal notations have to be explained in each interaction between stakeholders and requirements engineers. Tangible prototypes as they are used in other disciplines such as design, on the other hand, allow designers to feasibly validate and iterate concepts and requirements with stakeholders. This thesis proposes a model-based approach for prototyping formal behavioral specifications of stakeholders who are involved in collaborative scenarios. By simulating and animating such specifications in a remote domain-specific visualization, stakeholders can experience and validate the scenarios captured so far, i.e., how other stakeholders act and react. This interactive scenario simulation is referred to as a model-based virtual prototype. Moreover, through observing how stakeholders interact with a virtual prototype of their collaborative scenarios, formal behavioral specifications can be automatically derived which complete the otherwise fragmented scenarios. This, in turn, enables requirements engineers to elicit and validate collaborative scenarios in individual stakeholder sessions – decoupled, since stakeholders can participate remotely and are not forced to be available for a joint session at the same time. This thesis discusses and evaluates the feasibility, understandability, and modifiability of model-based virtual prototypes. Similarly to how physical prototypes are perceived, the presented approach brings behavioral models closer to being tangible for stakeholders and, moreover, combines the advantages of joint stakeholder sessions and decoupled sessions.
Systems of Systems (SoS) have received a lot of attention recently. In this thesis we will focus on SoS that are built atop the techniques of Service-Oriented Architectures and thus combine the benefits and challenges of both paradigms. For this thesis we will understand SoS as ensembles of single autonomous systems that are integrated to a larger system, the SoS. The interesting fact about these systems is that the previously isolated systems are still maintained, improved and developed on their own. Structural dynamics is an issue in SoS, as at every point in time systems can join and leave the ensemble. This and the fact that the cooperation among the constituent systems is not necessarily observable means that we will consider these systems as open systems. Of course, the system has a clear boundary at each point in time, but this can only be identified by halting the complete SoS. However, halting a system of that size is practically impossible. Often SoS are combinations of software systems and physical systems. Hence a failure in the software system can have a serious physical impact what makes an SoS of this kind easily a safety-critical system. The contribution of this thesis is a modelling approach that extends OMG's SoaML and basically relies on collaborations and roles as an abstraction layer above the components. This will allow us to describe SoS at an architectural level. We will also give a formal semantics for our modelling approach which employs hybrid graph-transformation systems. The modelling approach is accompanied by a modular verification scheme that will be able to cope with the complexity constraints implied by the SoS' structural dynamics and size. Building such autonomous systems as SoS without evolution at the architectural level --- i. e. adding and removing of components and services --- is inadequate. Therefore our approach directly supports the modelling and verification of evolution.
Durch die immer stärker werdende Flut an digitalen Informationen basieren immer mehr Anwendungen auf der Nutzung von kostengünstigen Cloud Storage Diensten. Die Anzahl der Anbieter, die diese Dienste zur Verfügung stellen, hat sich in den letzten Jahren deutlich erhöht. Um den passenden Anbieter für eine Anwendung zu finden, müssen verschiedene Kriterien individuell berücksichtigt werden. In der vorliegenden Studie wird eine Auswahl an Anbietern etablierter Basic Storage Diensten vorgestellt und miteinander verglichen. Für die Gegenüberstellung werden Kriterien extrahiert, welche bei jedem der untersuchten Anbieter anwendbar sind und somit eine möglichst objektive Beurteilung erlauben. Hierzu gehören unter anderem Kosten, Recht, Sicherheit, Leistungsfähigkeit sowie bereitgestellte Schnittstellen. Die vorgestellten Kriterien können genutzt werden, um Cloud Storage Anbieter bezüglich eines konkreten Anwendungsfalles zu bewerten.
Learning a model for the relationship between the attributes and the annotated labels of data examples serves two purposes. Firstly, it enables the prediction of the label for examples without annotation. Secondly, the parameters of the model can provide useful insights into the structure of the data. If the data has an inherent partitioned structure, it is natural to mirror this structure in the model. Such mixture models predict by combining the individual predictions generated by the mixture components which correspond to the partitions in the data. Often the partitioned structure is latent, and has to be inferred when learning the mixture model. Directly evaluating the accuracy of the inferred partition structure is, in many cases, impossible because the ground truth cannot be obtained for comparison. However it can be assessed indirectly by measuring the prediction accuracy of the mixture model that arises from it. This thesis addresses the interplay between the improvement of predictive accuracy by uncovering latent cluster structure in data, and further addresses the validation of the estimated structure by measuring the accuracy of the resulting predictive model. In the application of filtering unsolicited emails, the emails in the training set are latently clustered into advertisement campaigns. Uncovering this latent structure allows filtering of future emails with very low false positive rates. In order to model the cluster structure, a Bayesian clustering model for dependent binary features is developed in this thesis. Knowing the clustering of emails into campaigns can also aid in uncovering which emails have been sent on behalf of the same network of captured hosts, so-called botnets. This association of emails to networks is another layer of latent clustering. Uncovering this latent structure allows service providers to further increase the accuracy of email filtering and to effectively defend against distributed denial-of-service attacks. To this end, a discriminative clustering model is derived in this thesis that is based on the graph of observed emails. The partitionings inferred using this model are evaluated through their capacity to predict the campaigns of new emails. Furthermore, when classifying the content of emails, statistical information about the sending server can be valuable. Learning a model that is able to make use of it requires training data that includes server statistics. In order to also use training data where the server statistics are missing, a model that is a mixture over potentially all substitutions thereof is developed. Another application is to predict the navigation behavior of the users of a website. Here, there is no a priori partitioning of the users into clusters, but to understand different usage scenarios and design different layouts for them, imposing a partitioning is necessary. The presented approach simultaneously optimizes the discriminative as well as the predictive power of the clusters. Each model is evaluated on real-world data and compared to baseline methods. The results show that explicitly modeling the assumptions about the latent cluster structure leads to improved predictions compared to the baselines. It is beneficial to incorporate a small number of hyperparameters that can be tuned to yield the best predictions in cases where the prediction accuracy can not be optimized directly.
There are two common approaches to implement a virtual machine (VM) for a dynamic object-oriented language. On the one hand, it can be implemented in a C-like language for best performance and maximum control over the resulting executable. On the other hand, it can be implemented in a language such as Java that allows for higher-level abstractions. These abstractions, such as proper object-oriented modularization, automatic memory management, or interfaces, are missing in C-like languages but they can simplify the implementation of prevalent but complex concepts in VMs, such as garbage collectors (GCs) or just-in-time compilers (JITs). Yet, the implementation of a dynamic object-oriented language in Java eventually results in two VMs on top of each other (double stack), which impedes performance. For statically typed languages, the Maxine VM solves this problem; it is written in Java but can be executed without a Java virtual machine (JVM). However, it is currently not possible to execute dynamic object-oriented languages in Maxine. This work presents an approach to bringing object models and execution models of dynamic object-oriented languages to the Maxine VM and the application of this approach to Squeak/Smalltalk. The representation of objects in and the execution of dynamic object-oriented languages pose certain challenges to the Maxine VM that lacks certain variation points necessary to enable an effortless and straightforward implementation of dynamic object-oriented languages' execution models. The implementation of Squeak/Smalltalk in Maxine as a feasibility study is to unveil such missing variation points.
Imaginary Interfaces
(2013)
The size of a mobile device is primarily determined by the size of the touchscreen. As such, researchers have found that the way to achieve ultimate mobility is to abandon the screen altogether. These wearable devices are operated using hand gestures, voice commands or a small number of physical buttons. By abandoning the screen these devices also abandon the currently dominant spatial interaction style (such as tapping on buttons), because, seemingly, there is nothing to tap on. Unfortunately this design prevents users from transferring their learned interaction knowledge gained from traditional touchscreen-based devices. In this dissertation, I present Imaginary Interfaces, which return spatial interaction to screenless mobile devices. With these interfaces, users point and draw in the empty space in front of them or on the palm of their hands. While they cannot see the results of their interaction, they obtain some visual and tactile feedback by watching and feeling their hands interact. After introducing the concept of Imaginary Interfaces, I present two hardware prototypes that showcase two different forms of interaction with an imaginary interface, each with its own advantages: mid-air imaginary interfaces can be large and expressive, while palm-based imaginary interfaces offer an abundance of tactile features that encourage learning. Given that imaginary interfaces offer no visual output, one of the key challenges is to enable users to discover the interface's layout. This dissertation offers three main solutions: offline learning with coordinates, browsing with audio feedback and learning by transfer. The latter I demonstrate with the Imaginary Phone, a palm-based imaginary interface that mimics the layout of a physical mobile phone that users are already familiar with. Although these designs enable interaction with Imaginary Interfaces, they tell us little about why this interaction is possible. In the final part of this dissertation, I present an exploration into which human perceptual abilities are used when interacting with a palm-based imaginary interface and how much each accounts for performance with the interface. These findings deepen our understanding of Imaginary Interfaces and suggest that palm-based Imaginary Interfaces can enable stand-alone eyes-free use for many applications, including interfaces for visually impaired users.
This thesis presents novel ideas and research findings for the Web of Data – a global data space spanning many so-called Linked Open Data sources. Linked Open Data adheres to a set of simple principles to allow easy access and reuse for data published on the Web. Linked Open Data is by now an established concept and many (mostly academic) publishers adopted the principles building a powerful web of structured knowledge available to everybody. However, so far, Linked Open Data does not yet play a significant role among common web technologies that currently facilitate a high-standard Web experience. In this work, we thoroughly discuss the state-of-the-art for Linked Open Data and highlight several shortcomings – some of them we tackle in the main part of this work. First, we propose a novel type of data source meta-information, namely the topics of a dataset. This information could be published with dataset descriptions and support a variety of use cases, such as data source exploration and selection. For the topic retrieval, we present an approach coined Annotated Pattern Percolation (APP), which we evaluate with respect to topics extracted from Wikipedia portals. Second, we contribute to entity linking research by presenting an optimization model for joint entity linking, showing its hardness, and proposing three heuristics implemented in the LINked Data Alignment (LINDA) system. Our first solution can exploit multi-core machines, whereas the second and third approach are designed to run in a distributed shared-nothing environment. We discuss and evaluate the properties of our approaches leading to recommendations which algorithm to use in a specific scenario. The distributed algorithms are among the first of their kind, i.e., approaches for joint entity linking in a distributed fashion. Also, we illustrate that we can tackle the entity linking problem on the very large scale with data comprising more than 100 millions of entity representations from very many sources. Finally, we approach a sub-problem of entity linking, namely the alignment of concepts. We again target a method that looks at the data in its entirety and does not neglect existing relations. Also, this concept alignment method shall execute very fast to serve as a preprocessing for further computations. Our approach, called Holistic Concept Matching (HCM), achieves the required speed through grouping the input by comparing so-called knowledge representations. Within the groups, we perform complex similarity computations, relation conclusions, and detect semantic contradictions. The quality of our result is again evaluated on a large and heterogeneous dataset from the real Web. In summary, this work contributes a set of techniques for enhancing the current state of the Web of Data. All approaches have been tested on large and heterogeneous real-world input.
Business processes are fundamental to the operations of a company. Each product manufactured and every service provided is the result of a series of actions that constitute a business process. Business process management is an organizational principle that makes the processes of a company explicit and offers capabilities to implement procedures, control their execution, analyze their performance, and improve them. Therefore, business processes are documented as process models that capture these actions and their execution ordering, and make them accessible to stakeholders. As these models are an essential knowledge asset, they need to be managed effectively. In particular, the discovery and reuse of existing knowledge becomes challenging in the light of companies maintaining hundreds and thousands of process models. In practice, searching process models has been solved only superficially by means of free-text search of process names and their descriptions. Scientific contributions are limited in their scope, as they either present measures for process similarity or elaborate on query languages to search for particular aspects. However, they fall short in addressing efficient search, the presentation of search results, and the support to reuse discovered models. This thesis presents a novel search method, where a query is expressed by an exemplary business process model that describes the behavior of a possible answer. This method builds upon a formal framework that captures and compares the behavior of process models by the execution ordering of actions. The framework contributes a conceptual notion of behavioral distance that quantifies commonalities and differences of a pair of process models, and enables process model search. Based on behavioral distances, a set of measures is proposed that evaluate the quality of a particular search result to guide the user in assessing the returned matches. A projection of behavioral aspects to a process model enables highlighting relevant fragments that led to a match and facilitates its reuse. The thesis further elaborates on two search techniques that provide concrete behavioral distance functions as an instantiation of the formal framework. Querying enables search with a notion of behavioral inclusion with regard to the query. In contrast, similarity search obtains process models that are similar to a query, even if the query is not precisely matched. For both techniques, indexes are presented that enable efficient search. Methods to evaluate the quality and performance of process model search are introduced and applied to the techniques of this thesis. They show good results with regard to human assessment and scalability in a practical setting.