Refine
Year of publication
- 2018 (89) (remove)
Document Type
- Other (52)
- Article (19)
- Doctoral Thesis (14)
- Monograph/Edited Volume (4)
Language
- English (89)
Is part of the Bibliography
- yes (89) (remove)
Keywords
- E-Learning (3)
- Security Metrics (3)
- Security Risk Assessment (3)
- real-time rendering (3)
- 3D printing (2)
- Angriffserkennung (2)
- Answer set programming (2)
- Big Data (2)
- Cloud-Security (2)
- Energy (2)
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (89) (remove)
We analyze the problem of response suggestion in a closed domain along a real-world scenario of a digital library. We present a text-processing pipeline to generate question-answer pairs from chat transcripts. On this limited amount of training data, we compare retrieval-based, conditioned-generation, and dedicated representation learning approaches for response suggestion. Our results show that retrieval-based methods that strive to find similar, known contexts are preferable over parametric approaches from the conditioned-generation family, when the training data is limited. We, however, identify a specific representation learning approach that is competitive to the retrieval-based approaches despite the training data limitation.
Microservice Architectures (MSA) structure applications as a collection of loosely coupled services that implement business capabilities. The key advantages of MSA include inherent support for continuous deployment of large complex applications, agility and enhanced productivity. However, studies indicate that most MSA are homogeneous, and introduce shared vulnerabilites, thus vulnerable to multi-step attacks, which are economics-of-scale incentives to attackers. In this paper, we address the issue of shared vulnerabilities in microservices with a novel solution based on the concept of Moving Target Defenses (MTD). Our mechanism works by performing risk analysis against microservices to detect and prioritize vulnerabilities. Thereafter, security risk-oriented software diversification is employed, guided by a defined diversification index. The diversification is performed at runtime, leveraging both model and template based automatic code generation techniques to automatically transform programming languages and container images of the microservices. Consequently, the microservices attack surfaces are altered thereby introducing uncertainty for attackers while reducing the attackability of the microservices. Our experiments demonstrate the efficiency of our solution, with an average success rate of over 70% attack surface randomization.
Resource constrained smart micro-grid architectures describe a class of smart micro-grid architectures that handle communications operations over a lossy network and depend on a distributed collection of power generation and storage units. Disadvantaged communities with no or intermittent access to national power networks can benefit from such a micro-grid model by using low cost communication devices to coordinate the power generation, consumption, and storage. Furthermore, this solution is both cost-effective and environmentally-friendly. One model for such micro-grids, is for users to agree to coordinate a power sharing scheme in which individual generator owners sell excess unused power to users wanting access to power. Since the micro-grid relies on distributed renewable energy generation sources which are variable and only partly predictable, coordinating micro-grid operations with distributed algorithms is necessity for grid stability. Grid stability is crucial in retaining user trust in the dependability of the micro-grid, and user participation in the power sharing scheme, because user withdrawals can cause the grid to breakdown which is undesirable. In this chapter, we present a distributed architecture for fair power distribution and billing on microgrids. The architecture is designed to operate efficiently over a lossy communication network, which is an advantage for disadvantaged communities. We build on the architecture to discuss grid coordination notably how tasks such as metering, power resource allocation, forecasting, and scheduling can be handled. All four tasks are managed by a feedback control loop that monitors the performance and behaviour of the micro-grid, and based on historical data makes decisions to ensure the smooth operation of the grid. Finally, since lossy networks are undependable, differentiating system failures from adversarial manipulations is an important consideration for grid stability. We therefore provide a characterisation of potential adversarial models and discuss possible mitigation measures.
3D point cloud technology facilitates the automated and highly detailed digital acquisition of real-world environments such as assets, sites, cities, and countries; the acquired 3D point clouds represent an essential category of geodata used in a variety of geoinformation applications and systems. In this paper, we present a web-based system for the interactive and collaborative exploration and inspection of arbitrary large 3D point clouds. Our approach is based on standard WebGL on the client side and is able to render 3D point clouds with billions of points. It uses spatial data structures and level-of-detail representations to manage the 3D point cloud data and to deploy out-of-core and web-based rendering concepts. By providing functionality for both, thin-client and thick-client applications, the system scales for client devices that are vastly different in computing capabilities. Different 3D point-based rendering techniques and post-processing effects are provided to enable task-specific and data-specific filtering and highlighting, e.g., based on per-point surface categories or temporal information. A set of interaction techniques allows users to collaboratively work with the data, e.g., by measuring distances and areas, by annotating, or by selecting and extracting data subsets. Additional value is provided by the system's ability to display additional, context-providing geodata alongside 3D point clouds and to integrate task-specific processing and analysis operations. We have evaluated the presented techniques and the prototype system with different data sets from aerial, mobile, and terrestrial acquisition campaigns with up to 120 billion points to show their practicality and feasibility.
Spatio-temporal data denotes a category of data that contains spatial as well as temporal components. For example, time-series of geo-data, thematic maps that change over time, or tracking data of moving entities can be interpreted as spatio-temporal data.
In today's automated world, an increasing number of data sources exist, which constantly generate spatio-temporal data. This includes for example traffic surveillance systems, which gather movement data about human or vehicle movements, remote-sensing systems, which frequently scan our surroundings and produce digital representations of cities and landscapes, as well as sensor networks in different domains, such as logistics, animal behavior study, or climate research.
For the analysis of spatio-temporal data, in addition to automatic statistical and data mining methods, exploratory analysis methods are employed, which are based on interactive visualization. These analysis methods let users explore a data set by interactively manipulating a visualization, thereby employing the human cognitive system and knowledge of the users to find patterns and gain insight into the data.
This thesis describes a software framework for the visualization of spatio-temporal data, which consists of GPU-based techniques to enable the interactive visualization and exploration of large spatio-temporal data sets. The developed techniques include data management, processing, and rendering, facilitating real-time processing and visualization of large geo-temporal data sets. It includes three main contributions:
- Concept and Implementation of a GPU-Based Visualization Pipeline.
The developed visualization methods are based on the concept of a GPU-based visualization pipeline, in which all steps -- processing, mapping, and rendering -- are implemented on the GPU. With this concept, spatio-temporal data is represented directly in GPU memory, using shader programs to process and filter the data, apply mappings to visual properties, and finally generate the geometric representations for a visualization during the rendering process. Data processing, filtering, and mapping are thereby executed in real-time, enabling dynamic control over the mapping and a visualization process which can be controlled interactively by a user.
- Attributed 3D Trajectory Visualization.
A visualization method has been developed for the interactive exploration of large numbers of 3D movement trajectories. The trajectories are visualized in a virtual geographic environment, supporting basic geometries such as lines, ribbons, spheres, or tubes. Interactive mapping can be applied to visualize the values of per-node or per-trajectory attributes, supporting shape, height, size, color, texturing, and animation as visual properties. Using the dynamic mapping system, several kind of visualization methods have been implemented, such as focus+context visualization of trajectories using interactive density maps, and space-time cube visualization to focus on the temporal aspects of individual movements.
- Geographic Network Visualization.
A method for the interactive exploration of geo-referenced networks has been developed, which enables the visualization of large numbers of nodes and edges in a geographic context. Several geographic environments are supported, such as a 3D globe, as well as 2D maps using different map projections, to enable the analysis of networks in different contexts and scales. Interactive filtering, mapping, and selection can be applied to analyze these geographic networks, and visualization methods for specific types of networks, such as coupled 3D networks or temporal networks have been implemented.
As a demonstration of the developed visualization concepts, interactive visualization tools for two distinct use cases have been developed. The first contains the visualization of attributed 3D movement trajectories of airplanes around an airport. It allows users to explore and analyze the trajectories of approaching and departing aircrafts, which have been recorded over the period of a month. By applying the interactive visualization methods for trajectory visualization and interactive density maps, analysts can derive insight from the data, such as common flight paths, regular and irregular patterns, or uncommon incidents such as missed approaches on the airport.
The second use case involves the visualization of climate networks, which are geographic networks in the climate research domain. They represent the dynamics of the climate system using a network structure that expresses statistical interrelationships between different regions. The interactive tool allows climate analysts to explore these large networks, analyzing the network's structure and relating it to the geographic background. Interactive filtering and selection enables them to find patterns in the climate data and identify e.g. clusters in the networks or flow patterns.
High-throughput RNA sequencing (RNAseq) produces large data sets containing expression levels of thousands of genes. The analysis of RNAseq data leads to a better understanding of gene functions and interactions, which eventually helps to study diseases like cancer and develop effective treatments. Large-scale RNAseq expression studies on cancer comprise samples from multiple cancer types and aim to identify their distinct molecular characteristics. Analyzing samples from different cancer types implies analyzing samples from different tissue origin. Such multi-tissue RNAseq data sets require a meaningful analysis that accounts for the inherent tissue-related bias: The identified characteristics must not originate from the differences in tissue types, but from the actual differences in cancer types. However, current analysis procedures do not incorporate that aspect. As a result, we propose to integrate a tissue-awareness into the analysis of multi-tissue RNAseq data. We introduce an extension for gene selection that provides a tissue-wise context for every gene and can be flexibly combined with any existing gene selection approach. We suggest to expand conventional evaluation by additional metrics that are sensitive to the tissue-related bias. Evaluations show that especially low complexity gene selection approaches profit from introducing tissue-awareness.
In recent years, the ever-growing amount of documents on the Web as well as in closed systems for private or business contexts led to a considerable increase of valuable textual information about topics, events, and entities. It is a truism that the majority of information (i.e., business-relevant data) is only available in unstructured textual form. The text mining research field comprises various practice areas that have the common goal of harvesting high-quality information from textual data. These information help addressing users' information needs.
In this thesis, we utilize the knowledge represented in user-generated content (UGC) originating from various social media services to improve text mining results. These social media platforms provide a plethora of information with varying focuses. In many cases, an essential feature of such platforms is to share relevant content with a peer group. Thus, the data exchanged in these communities tend to be focused on the interests of the user base. The popularity of social media services is growing continuously and the inherent knowledge is available to be utilized. We show that this knowledge can be used for three different tasks.
Initially, we demonstrate that when searching persons with ambiguous names, the information from Wikipedia can be bootstrapped to group web search results according to the individuals occurring in the documents. We introduce two models and different means to handle persons missing in the UGC source. We show that the proposed approaches outperform traditional algorithms for search result clustering. Secondly, we discuss how the categorization of texts according to continuously changing community-generated folksonomies helps users to identify new information related to their interests. We specifically target temporal changes in the UGC and show how they influence the quality of different tag recommendation approaches. Finally, we introduce an algorithm to attempt the entity linking problem, a necessity for harvesting entity knowledge from large text collections. The goal is the linkage of mentions within the documents with their real-world entities. A major focus lies on the efficient derivation of coherent links.
For each of the contributions, we provide a wide range of experiments on various text corpora as well as different sources of UGC.
The evaluation shows the added value that the usage of these sources provides and confirms the appropriateness of leveraging user-generated content to serve different information needs.
An energy consumption model for multiModal wireless sensor networks based on wake-up radio receivers
(2018)
Energy consumption is a major concern in Wireless Sensor Networks. A significant waste of energy occurs due to the idle listening and overhearing problems, which are typically avoided by turning off the radio, while no transmission is ongoing. The classical approach for allowing the reception of messages in such situations is to use a low-duty-cycle protocol, and to turn on the radio periodically, which reduces the idle listening problem, but requires timers and usually unnecessary wakeups. A better solution is to turn on the radio only on demand by using a Wake-up Radio Receiver (WuRx). In this paper, an energy model is presented to estimate the energy saving in various multi-hop network topologies under several use cases, when a WuRx is used instead of a classical low-duty-cycling protocol. The presented model also allows for estimating the benefit of various WuRx properties like using addressing or not.
An Information System Supporting the Eliciting of Expert Knowledge for Successful IT Projects
(2018)
In order to guarantee the success of an IT project, it is necessary for a company to possess expert knowledge. The difficulty arises when experts no longer work for the company and it then becomes necessary to use their knowledge, in order to realise an IT project. In this paper, the ExKnowIT information system which supports the eliciting of expert knowledge for successful IT projects, is presented and consists of the following modules: (1) the identification of experts for successful IT projects, (2) the eliciting of expert knowledge on completed IT projects, (3) the expert knowledge base on completed IT projects, (4) the Group Method for Data Handling (GMDH) algorithm, (5) new knowledge in support of decisions regarding the selection of a manager for a new IT project. The added value of our system is that these three approaches, namely, the elicitation of expert knowledge, the success of an IT project and the discovery of new knowledge, gleaned from the expert knowledge base, otherwise known as the decision model, complement each other.
ASEDS
(2018)
The Massive adoption of social media has provided new ways for individuals to express their opinion and emotion online. In 2016, Facebook introduced a new reactions feature that allows users to express their psychological emotions regarding published contents using so-called Facebook reactions. In this paper, a framework for predicting the distribution of Facebook post reactions is presented. For this purpose, we collected an enormous amount of Facebook posts associated with their reactions labels using the proposed scalable Facebook crawler. The training process utilizes 3 million labeled posts for more than 64,000 unique Facebook pages from diverse categories. The evaluation on standard benchmarks using the proposed features shows promising results compared to previous research. The final model is able to predict the reaction distribution on Facebook posts with a recall score of 0.90 for "Joy" emotion.