004 Datenverarbeitung; Informatik
Industry 4.0 and the Internet of Things are recent developments that have led to the creation of new kinds of manufacturing data. Linking this new kind of sensor data to traditional business information is crucial for enterprises to take advantage of the data’s full potential. In this paper, we present a demo that lets users experience this data integration, both vertically between technical and business contexts and horizontally along the value chain. The tool simulates a manufacturing company, continuously producing both business and sensor data, and supports issuing ad-hoc queries that answer specific questions related to the business. In order to adapt to different environments, users can configure sensor characteristics to their needs.
The development of new and better optimization and approximation methods for Job Shop Scheduling Problems (JSP) uses simulations to compare their performance. The test data required for this has an uncertain influence on the simulation results, because the feasible search space can be changed drastically by small variations of the initial problem model. Methods could benefit from this to varying degrees. This speaks in favor of defining standardized and reusable test data for JSP problem classes, which in turn requires that the test data can be described systematically so that problem-adequate data sets can be compiled. This article uses a literature review to examine the test data used for comparing methods. It also shows how and why the differences in test data have to be taken into account. From this, corresponding challenges are derived which the management of test data must face in the context of JSP research.
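To make the notion of JSP test data concrete, the following is a minimal sketch and is not taken from the article. It assumes a common textbook encoding in which each job is an ordered list of (machine, processing time) pairs and a candidate solution is an operation-based job sequence. The names `makespan` and `instance` are illustrative.

```python
from typing import Dict, List, Tuple

Operation = Tuple[int, int]   # (machine id, processing time); assumed encoding
Job = List[Operation]         # operations must run in the listed order


def makespan(jobs: List[Job], sequence: List[int]) -> int:
    """Decode an operation-based job sequence into a schedule and return its makespan.

    Each job id must appear in `sequence` once per operation of that job.
    """
    next_op = [0] * len(jobs)            # index of the next unscheduled operation per job
    job_ready = [0] * len(jobs)          # time at which each job's previous operation ends
    machine_ready: Dict[int, int] = {}   # time at which each machine becomes free
    for j in sequence:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


# Hypothetical two-job, two-machine test instance
instance = [
    [(0, 3), (1, 2)],   # job 0: machine 0 for 3 time units, then machine 1 for 2
    [(1, 2), (0, 4)],   # job 1: machine 1 for 2 time units, then machine 0 for 4
]

print(makespan(instance, [0, 1, 0, 1]))   # interleaved sequence
print(makespan(instance, [1, 1, 0, 0]))   # job 1 completely before job 0
```

Even on this tiny instance, the two sequences yield different makespans, which hints at why the choice and description of test instances matters when methods are compared.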
We introduce a type and effect system, for an imperative object calculus, which infers sharing possibly introduced by the evaluation of an expression, represented as an equivalence relation among its free variables. This direct representation of sharing effects at the syntactic level allows us to express in a natural way, and to generalize, widely-used notions in literature, notably uniqueness and borrowing. Moreover, the calculus is pure in the sense that reduction is defined on language terms only, since they directly encode store. The advantage of this non-standard execution model with respect to a behaviorally equivalent standard model using a global auxiliary structure is that reachability relations among references are partly encoded by scoping. (C) 2018 Elsevier B.V. All rights reserved.
plasp 3
(2019)
We describe the new version of the Planning Domain Definition Language (PDDL)-to-Answer Set Programming (ASP) translator plasp. First, it widens the range of accepted PDDL features. Second, it contains novel planning encodings, some inspired by Satisfiability Testing (SAT) planning and others exploiting ASP features such as well-foundedness. All of them are designed for handling multivalued fluents in order to capture both PDDL as well as SAS planning formats. Third, enabled by multishot ASP solving, it offers advanced planning algorithms also borrowed from SAT planning. As a result, plasp provides us with an ASP-based framework for studying a variety of planning techniques in a uniform setting. Finally, we demonstrate in an empirical analysis that these techniques have a significant impact on the performance of ASP planning.
Social Media, Quo Vadis?
(2020)
Over the past two decades, social media have become a crucial and omnipresent cultural and economic phenomenon, which has seen platforms come and go and advance technologically. In this study, we explore the further development of social media regarding interactive technologies, platform development, relationships to news media, the activities of institutional and organizational users, and effects of social media on individuals and society over the next five to ten years by conducting an international, two-stage Delphi study. Our results show that enhanced interaction on platforms, including virtual and augmented reality, somatosensory sense, and touch- and movement-based navigation are expected. AIs will interact with other social media users. Inactive user profiles will outnumber active ones. Platform providers will diversify into the WWW, e-commerce, edu-tech, fintechs, the automobile industry, and HR. They will change to a freemium business model and put more effort into combating cybercrime. Social media will become the predominant news distributor, but fake news will still be problematic. Firms will spend greater amounts of their budgets on social media advertising, and schools, politicians, and the medical sector will increase their social media engagement. Social media use will increasingly lead to psychological problems for individuals. Society will benefit from economic growth and new jobs, increased political interest, democratic progress, and education due to social media. However, censorship and the energy consumption of platform operators might rise.
Multi-sided platforms (MSP) strongly affect markets and play a crucial part within the digital and networked economy. Although empirical evidence indicates their occurrence in many industries, research has not investigated the game-changing impact of MSP on traditional markets to a sufficient extent. More specifically, we have little knowledge of how MSP affect value creation and customer interaction in entire markets, exploiting the potential of digital technologies to offer new value propositions. Our paper addresses this research gap and provides an initial systematic approach to analyze the impact of MSP on the insurance industry. For this purpose, we analyze the state of the art in research and practice in order to develop a reference model of the value network for the insurance industry. On this basis, we conduct a case-study analysis to discover and analyze roles which are occupied or even newly created by MSP. As a final step, we categorize MSP with regard to their relation to traditional insurance companies, resulting in a classification scheme with four MSP standard types: Competition, Coordination, Cooperation, Collaboration.
The usage of mobile devices is growing rapidly, with Android being the most prevalent mobile operating system. Thanks to the vast variety of mobile applications, users prefer smartphones over desktops for day-to-day tasks like Internet surfing. Consequently, smartphones store a plenitude of sensitive data. This data, together with the high value of smartphones, makes them an attractive target for device/data theft (thieves/malicious applications).
Unfortunately, state-of-the-art anti-theft solutions do not work if they do not have an active network connection, e.g., if the SIM card was removed from the device. In the majority of these cases, device owners permanently lose their smartphone together with their personal data, which is even worse.
Apart from that, malevolent applications perform malicious activities to steal sensitive information from smartphones. Recent research considered static program analysis to detect dangerous data leaks. These analyses work well for data leaks due to inter-component communication, but suffer from shortcomings for inter-app communication with respect to precision, soundness, and scalability.
This thesis focuses on enhancing users' privacy on Android against physical device loss/theft and (un)intentional data leaks. It presents three novel frameworks: (1) ThiefTrap, an anti-theft framework for Android, (2) IIFA, a modular inter-app intent information flow analysis of Android applications, and (3) PIAnalyzer, a precise approach for PendingIntent vulnerability analysis.
ThiefTrap is based on a novel concept of an anti-theft honeypot account that protects the owner's data while preventing a thief from resetting the device.
We implemented the proposed scheme and evaluated it through an empirical user study with 35 participants. In this study, the owner's data could be protected and recovered, and the anti-theft functionality could be performed unnoticed by the thief, in all cases.
IIFA proposes a novel approach for Android's inter-component/inter-app communication (ICC/IAC) analysis. Our main contribution is the first fully automatic, sound, and precise ICC/IAC information flow analysis that is scalable for realistic apps due to modularity, avoiding combinatorial explosion: Our approach determines communicating apps using short summaries rather than inlining intent calls between components and apps, which requires simultaneously analyzing all apps installed on a device.
We evaluate IIFA in terms of precision and recall, and demonstrate its scalability to a large corpus of real-world apps. IIFA reports 62 problematic ICC-/IAC-related information flows via two or more apps/components.
PIAnalyzer proposes a novel approach to analyze PendingIntent related vulnerabilities. PendingIntents are a powerful and universal feature of Android for inter-component communication. We empirically evaluate PIAnalyzer on a set of 1000 randomly selected applications and find 1358 insecure usages of PendingIntents, including 70 severe vulnerabilities.
In recent years, recording and distributing videos has become ever easier. As a result, the recording of lecture videos has grown considerably in relevance and popularity, which has led to large collections of lecture videos in the video lecture archives of universities. This growing volume of data, however, makes it increasingly difficult for students to find the relevant videos in a lecture archive. In addition, many learners have less and less time to devote to learning because of their daily work and family commitments. A further factor that makes learning on the Internet harder is the wide range of distractions offered by social networks and other online platforms. The aim of this thesis is therefore to show what e-learning can offer to support and motivate users in the learning process.
The main concept for supporting students is the precise retrieval of information in the ever-growing lecture video archives. To this end, the lectures are analyzed in advance and the text of the lecture slides is indexed using various methods. Students can then use the search function or the Lecture-Butler to find learning content that matches their current level of knowledge. The technologies that can be used for retrieval were evaluated successfully, both technically and through student surveys. To motivate students in lecture archives, various concepts that involve the student interactively in the learning process are examined and their implementation is evaluated.
Alongside lecture archives, MOOCs, which have become increasingly popular in recent years, exist in both private and professional continuing education. In general, however, MOOC completion rates are rather low, at 7% on average. Motivational solutions for MOOCs are therefore examined in the area of embedded systems, which are used in practical programming courses. In addition, courses covering the programming of embedded systems were evaluated. Availability was not a serious problem here for courses with up to 10,000 enrolled participants. In practice, the use of embedded systems in programming courses met with very great interest among the students.
In recent years, the ever-growing number of documents on the Web as well as in closed systems for private or business contexts has led to a considerable increase of valuable textual information about topics, events, and entities. It is a truism that the majority of information (i.e., business-relevant data) is only available in unstructured textual form. The text mining research field comprises various practice areas that have the common goal of harvesting high-quality information from textual data. This information helps address users' information needs.
In this thesis, we utilize the knowledge represented in user-generated content (UGC) originating from various social media services to improve text mining results. These social media platforms provide a plethora of information with varying focuses. In many cases, an essential feature of such platforms is to share relevant content with a peer group. Thus, the data exchanged in these communities tend to be focused on the interests of the user base. The popularity of social media services is growing continuously and the inherent knowledge is available to be utilized. We show that this knowledge can be used for three different tasks.
First, we demonstrate that when searching for persons with ambiguous names, the information from Wikipedia can be bootstrapped to group web search results according to the individuals occurring in the documents. We introduce two models and different means to handle persons missing in the UGC source. We show that the proposed approaches outperform traditional algorithms for search result clustering. Second, we discuss how the categorization of texts according to continuously changing community-generated folksonomies helps users to identify new information related to their interests. We specifically target temporal changes in the UGC and show how they influence the quality of different tag recommendation approaches. Finally, we introduce an algorithm to tackle the entity linking problem, a necessity for harvesting entity knowledge from large text collections. The goal is to link mentions within the documents to their real-world entities. A major focus lies on the efficient derivation of coherent links.
For each of the contributions, we provide a wide range of experiments on various text corpora as well as different sources of UGC.
The evaluation shows the added value that the usage of these sources provides and confirms the appropriateness of leveraging user-generated content to serve different information needs.
Solving problems combining task and motion planning requires searching across a symbolic search space and a geometric search space. Because of the semantic gap between symbolic and geometric representations, symbolic sequences of actions are not guaranteed to be geometrically feasible. This compels us to search in the combined search space, in which frequent backtracks between symbolic and geometric levels make the search inefficient. We address this problem by guiding symbolic search with rich information extracted from the geometric level through culprit detection mechanisms.
The planning and execution as well as the static and dynamic analysis of business processes in administration and government at the municipal, state, and federal levels by means of information and communication technologies have long occupied politicians and information technology strategists as well as the public. The resulting term, e-government, has subsequently been examined from a wide variety of technical, political, and semantic perspectives.
This thesis concentrates on two main topics:
> The first main topic deals with the design of a hierarchical architecture model for which seven hierarchical layers can be identified. These appear necessary, but also sufficient, to describe the general case. The background is provided by many years of process and administrative experience as head of the IT department of the municipal administration of Landshut, an independent city with around 69,000 inhabitants northeast of Munich. It is representative of many administrative procedures in the Federal Republic of Germany, yet as an object of analysis it remains manageable in its overall complexity and number of processes. Static and dynamic structures can therefore be extracted from the analysis of all core procedures and modeled abstractly. The emphasis is on representing the existing service procedures in a municipality. The transformation of the service request in a hierarchical system, the representation of the control and operation states in all layers, and the strategy for error detection and error recovery create a transparent basis for comprehensive restructuring and optimization. Modeling was carried out with FMC-eCS, a methodology developed at the Hasso-Plattner-Institut für Softwaresystemtechnik GmbH (HPI) in the Communication Systems group for modeling discrete-state systems while taking possible inconsistencies into account.
> The second main topic is devoted to the quantitative modeling and optimization of e-government service systems, carried out using the example of the citizens' office (Bürgerbüro) of the city of Landshut in the period from 2008 to 2015. This is based on continuous operational data collection with elaborate preprocessing to extract mathematically describable probability distributions. The duty roster developed from this was verified in permanent live operation with regard to the achievable optimizations.
Behavioural Models
(2016)
This textbook introduces the basis for modelling and analysing discrete dynamic systems, such as computer programmes, soft- and hardware systems, and business processes. The underlying concepts are introduced and concrete modelling techniques are described, such as finite automata, state machines, and Petri nets. The concepts are related to concrete application scenarios, among which business processes play a prominent role.
The book consists of three parts, the first of which addresses the foundations of behavioural modelling. After a general introduction to modelling, it introduces transition systems as a basic formalism for representing the behaviour of discrete dynamic systems. This section also discusses causality, a fundamental concept for modelling and reasoning about behaviour. In turn, Part II forms the heart of the book and is devoted to models of behaviour. It details both sequential and concurrent systems and introduces finite automata, state machines and several different types of Petri nets. One chapter is especially devoted to business process models, workflow patterns and BPMN, the industry standard for modelling business processes. Lastly, Part III investigates how the behaviour of systems can be analysed. To this end, it introduces readers to the concept of state spaces. Further chapters cover the comparison of behaviour and the formal analysis and verification of behavioural models.
The book was written for students of computer science and software engineering, as well as for programmers and system analysts interested in the behaviour of the systems they work on. It takes readers on a journey from the fundamentals of behavioural modelling to advanced techniques for modelling and analysing sequential and concurrent systems, and thus provides them a deep understanding of the concepts and techniques introduced and how they can be applied to concrete application scenarios.
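As a small illustration of the state-space idea mentioned for Part III, the following sketch (not taken from the book) represents a transition system as a mapping from each state to its outgoing (action, target state) pairs and computes the states reachable from an initial state by breadth-first search. The order-handling process and all names are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, List, Set, Tuple

Transitions = Dict[Hashable, List[Tuple[str, Hashable]]]


def reachable_states(initial: Hashable, transitions: Transitions) -> Set[Hashable]:
    """Explore the state space breadth-first and return every reachable state."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for _action, target in transitions.get(state, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical order-handling process modelled as a transition system
order_process: Transitions = {
    "received": [("check", "checked")],
    "checked": [("accept", "accepted"), ("reject", "rejected")],
    "accepted": [("ship", "shipped")],
    "archived": [("reopen", "received")],   # no transition leads into this state
}

print(sorted(reachable_states("received", order_process)))
```

A state that never appears in the result, such as "archived" here, points to a part of the model that can never be entered from the initial state, which is the kind of question state-space analysis is meant to answer.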
E-learning applications offer opportunities for the legally mandated inclusion of learners with disabilities. However, the synchronous, multimedia character of virtual classrooms and the large volume of information they present make it almost impossible for blind learners to participate on an equal footing in sessions held in them.
This thesis examines the accessibility of virtual classrooms for blind users in order to enable participation in synchronous, collaborative learning scenarios that is as equal as possible. As part of a product analysis, virtual classrooms are examined for their accessibility and existing barriers, and guidelines for the accessible design of virtual classrooms are defined. Subsequently, an alternative user interface concept for presenting and operating virtual classrooms on a two-dimensional tactile Braille display is developed so that blind learners can participate in synchronous courses as equally as possible. After a first evaluation with blind test participants, the user interface concept is implemented as a prototype for an open-source classroom. The final evaluation of the prototype shows the improvement in the accessibility of virtual classrooms for blind learners when a tactile flat-panel display is used, and confirms the effectiveness of the concepts developed in this thesis.
In this project I constructed a workflow that takes a DNA sequence as input and provides a phylogenetic tree, consisting of the input sequence and other sequences which were found during a database search. In this phylogenetic tree the sequences are arranged depending on similarities. In bioinformatics, constructing phylogenetic trees is often used to explore the evolutionary relationships of genes or organisms and to understand the mechanisms of evolution itself.
Spotlocator is a game in which players guess the locations where photos were taken. The photos of a defined area for each game are from panoramio.com. They are published at http://spotlocator.drupalgardens.com with an ID. Everyone can guess the photo spots by sending a special tweet via Twitter that contains the hashtag #spotlocator, the guessed coordinates and the ID of the photo. An evaluation is published for all tweets. The players are informed about the distance to the real photo spots and the positions are shown on a map.
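The abstract does not say how the distance between a guess and the real photo spot is computed. One plausible way, shown here purely as an assumption, is the haversine great-circle distance on a spherical Earth. The function name and the radius constant are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius; a spherical Earth is assumed


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical guess versus real photo spot (coordinates chosen for illustration)
guess = (52.40, 13.05)
actual = (52.41, 13.07)
print(f"{haversine_km(*guess, *actual):.2f} km off")
```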
Exploratory Data Analysis
(2014)
In bioinformatics, the term exploratory data analysis refers to different methods for getting an overview of large biological data sets. Hence, it helps to create a framework for further analysis and hypothesis testing. The workflow facilitates this first important step in the analysis of data created by high-throughput technologies. The results are different plots showing the structure of the measurements. The goal of the workflow is to automate the exploratory data analysis while preserving flexibility. The basic tool is the free software R.
The protein classification workflow described in this report enables users to get information about a novel protein sequence automatically. The information is derived by different bioinformatic analysis tools which calculate or predict features of a protein sequence. Also, databases are used to compare the novel sequence with known proteins.
Lessons Learned
(2014)
This chapter summarizes the experience and the lessons we learned concerning the application of the jABC as a framework for design and execution of scientific workflows. It reports experiences from the domain modeling (especially service integration) and workflow design phases and evaluates the resulting models statistically with respect to the SIB library and hierarchy levels.
The Course's SIB Libraries
(2014)
This chapter gives a detailed description of the service framework underlying all the example projects that form the foundation of this book. It describes the different SIB libraries that we made available for the course “Process modeling in the natural sciences” to provide the functionality that was required for the envisaged applications. The students used these SIB libraries to realize their projects.
A major part of the scientific experiments that are carried out today requires thorough computational support. While database and algorithm providers face the problem of bundling resources to create and sustain powerful computation nodes, the users have to deal with combining sets of (remote) services into specific data analysis and transformation processes. Today’s attention to “big data” amplifies the issues of size, heterogeneity, and process-level diversity/integration. In the last decade, especially workflow-based approaches to deal with these processes have enjoyed great popularity. This book concerns a particularly agile and model-driven approach to manage scientific workflows that is based on the XMDD paradigm. In this chapter we explain the scope and purpose of the book, briefly describe the concepts and technologies of the XMDD paradigm, explain the principal differences to related approaches, and outline the structure of the book.
We summarize here the main characteristics and features of the jABC framework, used in the case studies as a graphical tool for modeling scientific processes and workflows. As a comprehensive environment for service-oriented modeling and design according to the XMDD (eXtreme Model-Driven Design) paradigm, the jABC offers much more than the pure modeling capability. Associated technologies and plugins provide in fact means for a rich variety of supporting functionality, such as remote service integration, taxonomical service classification, model execution, model verification, model synthesis, and model compilation. We describe here in short both the essential jABC features and the service integration philosophy followed in the environment. In our work over the last years we have seen that this kind of service definition and provisioning platform has the potential to become a core technology in interdisciplinary service orchestration and technology transfer: Domain experts, like scientists not specially trained in computer science, directly define complex service orchestrations as process models and use efficient and complex domain-specific tools in a simple and intuitive way.
Recombination of free charge is a key process limiting the performance of solar cells. For low mobility materials, such as organic semiconductors, the kinetics of non-geminate recombination (NGR) is strongly linked to the motion of charges. As these materials possess significant disorder, thermalization of photogenerated carriers in the inhomogeneously broadened density of state distribution is an unavoidable process. Despite its general importance, knowledge about the kinetics of NGR in complete organic solar cells is rather limited. We employ time delayed collection field (TDCF) experiments to study the recombination of photogenerated charge in the high-performance polymer:fullerene blend PCDTBT:PCBM. NGR in the bulk of this amorphous blend is shown to be highly dispersive, with a continuous reduction of the recombination coefficient throughout the entire time scale, until all charge carriers have either been extracted or recombined. Rapid, contact-mediated recombination is identified as an additional loss channel, which, if not properly taken into account, would erroneously suggest a pronounced field dependence of charge generation. These findings are in stark contrast to the results of TDCF experiments on photovoltaic devices made from ordered blends, such as P3HT:PCBM, where non-dispersive recombination was proven to dominate the charge carrier dynamics under application relevant conditions.
Compared to their inorganic counterparts, organic semiconductors suffer from relatively low charge carrier mobilities. Therefore, expressions derived for inorganic solar cells to correlate characteristic performance parameters to material properties are prone to fail when applied to organic devices. This is especially true for the classical Shockley-equation commonly used to describe current-voltage (JV)-curves, as it assumes a high electrical conductivity of the charge transporting material. Here, an analytical expression for the JV-curves of organic solar cells is derived based on a previously published analytical model. This expression, bearing a similar functional dependence as the Shockley-equation, delivers a new figure of merit α to express the balance between free charge recombination and extraction in low mobility photoactive materials. This figure of merit is shown to determine critical device parameters such as the apparent series resistance and the fill factor.
Software-as-a-Service (SaaS) offers several advantages to both service providers and users. Service providers can benefit from the reduction of Total Cost of Ownership (TCO), better scalability, and better resource utilization. On the other hand, users can use the service anywhere and anytime, and minimize upfront investment by following the pay-as-you-go model. Despite the benefits of SaaS, users still have concerns about the security and privacy of their data. Due to the nature of SaaS and the Cloud in general, the data and the computation are beyond the users' control, and hence data security becomes a vital factor in this new paradigm. Furthermore, in multi-tenant SaaS applications, the tenants become more concerned about the confidentiality of their data since several tenants are co-located onto a shared infrastructure.
To address those concerns, we start protecting the data from the provisioning process by controlling how tenants are being placed in the infrastructure. We present a resource allocation algorithm designed to minimize the risk of co-resident tenants called SecPlace. It enables the SaaS provider to control the resource (i.e., database instance) allocation process while taking into account the security of tenants as a requirement.
Due to the design principles of the multi-tenancy model, tenants share resources to some degree on both the application and infrastructure levels. Thus, strong security isolation should be present. Therefore, we develop SignedQuery, a technique that prevents one tenant from accessing others' data. We use the Signing Concept to create a signature that is used to sign the tenant's request. The server can then verify the signature and recognize the requesting tenant, and hence ensure that the data to be accessed belongs to the legitimate tenant.
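The abstract does not specify which signature scheme SignedQuery uses. The following sketch only illustrates the general idea of signing a tenant's request and verifying it on the server, and it assumes a per-tenant shared secret with HMAC-SHA256. The function names, the message layout, and the key store are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def sign_request(tenant_key: bytes, tenant_id: str, query: str) -> str:
    """Client side: bind the query to the tenant by signing both together."""
    message = f"{tenant_id}\n{query}".encode("utf-8")
    return hmac.new(tenant_key, message, hashlib.sha256).hexdigest()


def verify_request(tenant_keys: Dict[str, bytes], tenant_id: str, query: str, signature: str) -> bool:
    """Server side: recompute the signature with the claimed tenant's key and compare."""
    key = tenant_keys.get(tenant_id)
    if key is None:
        return False
    expected = sign_request(key, tenant_id, query)
    return hmac.compare_digest(expected, signature)   # constant-time comparison


# Hypothetical key store with two tenants
keys = {"tenant-a": b"secret-a", "tenant-b": b"secret-b"}
query = "SELECT name FROM employees"
signature = sign_request(keys["tenant-a"], "tenant-a", query)

print(verify_request(keys, "tenant-a", query, signature))   # request by the legitimate tenant
print(verify_request(keys, "tenant-b", query, signature))   # same signature replayed as another tenant
```

In this sketch a request is accepted only if the signature matches both the claimed tenant and the exact query text, so a signature cannot be reused to reach another tenant's data.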
Finally, data confidentiality remains a critical concern because data in the Cloud is outside the users' premises, and hence beyond their control. Cryptography is increasingly proposed as a potential approach to address such a challenge. Therefore, we present SecureDB, a system designed to run SQL-based applications over an encrypted database. SecureDB captures the schema design and analyzes it to understand the internal structure of the data (i.e., relationships between the tables and their attributes). Moreover, we determine the appropriate partially homomorphic encryption scheme for each attribute, so that computation is possible even when the data is encrypted.
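The abstract does not name the encryption schemes that SecureDB selects. As an illustration of what "partially homomorphic" means, the following toy sketch uses the Paillier cryptosystem, which is an assumption made here and which supports addition on ciphertexts. The primes are far too small for real use and all function names are illustrative.

```python
import math
import secrets
from typing import Tuple


def keygen(p: int, q: int) -> Tuple[int, Tuple[int, int]]:
    """Return the public modulus n and the private pair (lambda, mu), using g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                                # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt a message 0 <= m < n under the public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1                # random blinding factor in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key: Tuple[int, int], c: int) -> int:
    lam, mu = private_key
    l_value = (pow(c, lam, n * n) - 1) // n             # L(x) = (x - 1) / n
    return (l_value * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts corresponds to adding the underlying plaintexts."""
    return (c1 * c2) % (n * n)


n, private_key = keygen(293, 433)                       # toy primes, insecure by design
total = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
print(decrypt(n, private_key, total))                   # sum computed without decrypting the inputs
```

A scheme of this kind would suit an attribute that is only ever summed, such as a numeric column used in aggregates. Other operations, such as equality or range comparisons, call for different schemes.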
To evaluate our work, we conduct extensive experiments with different settings. The main use case in our work is a popular open source HRM application, called OrangeHRM. The results show that our multi-layered approach is practical, provides enhanced security and isolation among tenants, and has moderate complexity in terms of processing encrypted data.
In-Memory Data Management
(2012)
After 50 years of successful development, business IT has reached a new turning point. Here the authors show for the first time how in-memory computing will change business processes in the future. Until now, enterprise data has been distributed across different databases for performance reasons: analytical data resides in data warehouses and is regularly synchronized with transactional systems. This separation makes flexible real-time reporting on current data impossible. But thanks to powerful multi-core CPUs, large main memories, cloud computing, and ever better mobile devices, companies are increasingly leaving this restrictive model behind. The authors present techniques that allow analytical and transactional processing in real time and thus open up new paths for business.
Through the use of next-generation sequencing (NGS) technology, many newly sequenced organisms are now available. Annotating their genes is one of the most challenging tasks in sequence biology. Here, we present an automated workflow to find homologous proteins, annotate sequences according to function, and create a three-dimensional model.
With the jABC it is possible to realize workflows for numerous questions in different fields. The goal of this project was to create a workflow for the identification of differentially expressed genes. This is of special interest in biology, because it offers better insight into cellular changes caused by exogenous stress, diseases, and so on. With the knowledge that can be derived from the differentially expressed genes in diseased tissues, it becomes possible to find new targets for treatment.
A workflow for visualizing server connections using the Google Maps API was built in the jABC. It makes use of three basic services: an XML-based IP address geolocation web service, a command line tool, and the Static Maps API. The result of the workflow is a URL leading to an image file of a map that shows the server connections between a client and a target host.
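The final step, turning geolocated hops into a map URL, might look roughly like the sketch below. It assumes the hops have already been resolved to coordinates by the geolocation service; the coordinates, the function name, and the styling values are hypothetical, while `size`, `path`, `markers`, and `key` are standard Static Maps API parameters.

```python
from urllib.parse import urlencode

STATIC_MAPS_ENDPOINT = "https://maps.googleapis.com/maps/api/staticmap"


def build_static_map_url(hops, size="640x400", api_key="YOUR_API_KEY"):
    """Build a Static Maps URL that draws a line through all hops.

    hops: list of (latitude, longitude) tuples ordered from client to target host.
    """
    points = [f"{lat:.4f},{lon:.4f}" for lat, lon in hops]
    params = [
        ("size", size),
        # One polyline connecting every hop in order.
        ("path", "color:0xff0000ff|weight:3|" + "|".join(points)),
        # Separate markers for the client (C) and the target host (T).
        ("markers", "label:C|" + points[0]),
        ("markers", "label:T|" + points[-1]),
        ("key", api_key),
    ]
    return STATIC_MAPS_ENDPOINT + "?" + urlencode(params)


# Made-up route: a client, one intermediate hop, and a target host.
route = [(52.40, 13.06), (50.11, 8.68), (37.39, -122.08)]
url = build_static_map_url(route)
```

Requesting the resulting URL with a valid API key returns the map image, so the workflow itself only has to hand the URL on.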
Geocoder accuracy ranking
(2014)
Finding an address on a map is sometimes tricky: the chosen map application may be unfamiliar with the enclosed region. There are several geocoders on the market, each with its own database and its own algorithms for computing the query. Consequently, the geocoding results differ in quality. Fortunately, the geocoders provide a rich set of metadata. The workflow described in this paper compares this metadata with the aim of finding out which geocoder offers the best-fitting coordinate for a given address.
Analyses of metagenomes in life sciences present new opportunities as well as challenges to the scientific community and call for advanced computational methods and workflows. The large amount of data collected from samples via next-generation sequencing (NGS) technologies renders manual approaches to sequence comparison and annotation unsuitable. Rather, fast and efficient computational pipelines are needed to provide comprehensive statistics and summaries and enable the researcher to choose appropriate tools for more specific analyses. The workflow presented here builds upon previous pipelines designed for automated clustering and annotation of raw sequence reads obtained from next-generation sequencing technologies such as 454 and Illumina. Employing specialized algorithms, the sequence reads are processed at three different levels. First, raw reads are clustered at a high similarity cutoff to yield clusters which can be exported as multifasta files for further analyses. Independently, open reading frames (ORFs) are predicted from raw reads and clustered at two strictness levels to yield sets of non-redundant sequences and ORF families. Furthermore, single ORFs are annotated by performing searches against the Pfam database.
This book presents an agile and model-driven approach to managing scientific workflows. The approach is based on the Extreme Model Driven Design (XMDD) paradigm and aims at simplifying and automating the complex data analysis processes carried out by scientists in their day-to-day work. Besides documenting the impact that workflow modeling might have on the work of natural scientists, this book serves three major purposes: 1. It acts as a primer for practitioners who are interested in learning how to think in terms of services and workflows when facing domain-specific scientific processes. 2. It provides interesting material for readers already familiar with this kind of tool, because it systematically introduces both the technologies used in each case study and the basic concepts behind them. 3. As the addressed thematic field becomes increasingly relevant for lectures in both computer science and experimental sciences, it also provides helpful material for teachers who plan similar courses.
Geometric generalization is a fundamental concept in the digital mapping process. An increasing amount of spatial data is provided on the web as well as a range of tools to process it. This jABC workflow is used for the automatic testing of web-based generalization services like mapshaper.org by executing its functionality, overlaying both datasets before and after the transformation and displaying them visually in a .tif file. Mostly Web Services and command line tools are used to build an environment where ESRI shapefiles can be uploaded, processed through a chosen generalization service and finally visualized in Irfanview.
In the geoinformatics field, remote sensing data is often used for analyzing the characteristics of the current investigation area. This includes DEMs, which are simple raster grids containing grey scales representing the respective elevation values. The project CREADED that is presented in this paper aims at making these monochrome raster images more significant and more intuitively interpretable. For this purpose, an executable interactive model for creating a colored and relief-shaded Digital Elevation Model (DEM) has been designed using the jABC framework. The process is based on standard jABC-SIBs and SIBs that provide specific GIS functions, which are available as Web services, command line tools and scripts.
This paper describes the implementation of a workflow model for service-oriented computing of potential areas for wind turbines in jABC. By implementing a re-executable model the manual effort of a multi-criteria site analysis can be reduced. The aim is to determine the shift of typical geoprocessing tools of geographic information systems (GIS) from the desktop to the web. The analysis is based on a vector data set and mainly uses web services of the “Center for Spatial Information Science and Systems” (CSISS). This paper discusses effort, benefits and problems associated with the use of the web services.
Location analyses are among the most common tasks when working with spatial data and geographic information systems. Automating the most frequently used procedures is therefore an important aspect of improving their usability. In this context, this project aims to design and implement a workflow that provides some basic tools for a location analysis. For the implementation with jABC, the workflow was applied to the problem of finding a suitable location for placing an artificial reef. For this analysis, three parameters (bathymetry, slope, and grain size of the ground material) were taken into account, processed, and visualized with the Generic Mapping Tools (GMT), which were integrated into the workflow as jETI-SIBs. The implemented workflow thereby showed that combining jABC with GMT resulted in a user-centric yet user-friendly tool with high-quality cartographic outputs.
Creation of topographic maps
(2014)
Location analyses are among the most common tasks when working with spatial data and geographic information systems. Automating the most frequently used procedures is therefore an important aspect of improving their usability. In this context, this project aims to design and implement a workflow that provides some basic tools for a location analysis. For the implementation with jABC, the workflow was applied to the problem of finding a suitable location for placing an artificial reef. For this analysis, three parameters (bathymetry, slope, and grain size of the ground material) were taken into account, processed, and visualized with the Generic Mapping Tools (GMT), which were integrated into the workflow as jETI-SIBs. The implemented workflow thereby showed that combining jABC with GMT resulted in a user-centric yet user-friendly tool with high-quality cartographic outputs.
GraffDok is an application that helps to keep an overview of graffiti sprayed across a city. At the time of writing it targets vandalism rather than beautiful photographic graffiti in an underpass. Given the hundreds of tags and scribbles on monuments, house walls, etc., it would be interesting not only to record them in writing but also to make them accessible electronically, including images.
GraffDok’s workflow is simple and only requires an EXIF-GPS-tagged photograph of a graffito. It automatically determines the location by reverse geocoding the given GPS coordinates with the Gisgraphy web service. While asking the user for some additional metadata, GraffDok analyses the image in parallel and tries to separate foreground from background before extracting the drawing lines and isolating them. The command line tool ImageMagick is used for this step as well as for accessing the EXIF data.
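Before the coordinates can be sent to a reverse geocoder, the EXIF GPS values have to be converted: EXIF stores latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference letter. The sketch below shows that conversion only; the function name and sample values are hypothetical, and the actual call to the geocoding service is left out because its request format is not described here.

```python
from fractions import Fraction


def exif_gps_to_decimal(dms, ref):
    """Convert EXIF GPS degrees/minutes/seconds to signed decimal degrees.

    dms: three (numerator, denominator) pairs, as EXIF stores rationals.
    ref: hemisphere letter, one of "N", "S", "E", "W".
    """
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Made-up EXIF values for a photo position.
latitude = exif_gps_to_decimal([(52, 1), (24, 1), (0, 1)], "N")
longitude = exif_gps_to_decimal([(13, 1), (3, 1), (3600, 100)], "E")
```

Using exact fractions until the final conversion avoids accumulating rounding errors from the rational EXIF fields.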
All metadata is written to CSV files, which remain easily accessible and can also be integrated into TeX files. The TeX files are converted to PDF at the end of the workflow and contain a table of all graffiti and a summary for each, including the generated characteristic graffiti pattern image.
Cloud-RAID
(2014)