Refine
Has Fulltext
- no (84) (remove)
Document Type
- Other (84) (remove)
Language
- English (84)
Is part of the Bibliography
- yes (84)
Keywords
- E-Learning (4)
- Scrum (4)
- MOOC (3)
- Security Metrics (3)
- Security Risk Assessment (3)
- Teamwork (3)
- 3D printing (2)
- Android (2)
- Blockchain (2)
- Cloud-Security (2)
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (84) (remove)
When I started my PhD, I wanted to do something related to systems but I wasn't sure exactly what. I didn't consider data management systems initially, because I was unaware of the richness of the systems work that data management systems were build on. I thought the field was mainly about SQL. Luckily, that view changed quickly.
Web-based E-Learning uses Internet technologies and digital media to deliver education content to learners. Many universities in recent years apply their capacity in producing Massive Open Online Courses (MOOCs). They have been offering MOOCs with an expectation of rendering a comprehensive online apprenticeship. Typically, an online content delivery process requires an Internet connection. However, access to the broadband has never been a readily available resource in many regions. In Africa, poor and no networks are yet predominantly experienced by Internet users, frequently causing offline each moment a digital device disconnect from a network. As a result, a learning process is always disrupted, delayed and terminated in such regions. This paper raises the concern of E-Learning in poor and low bandwidths, in fact, it highlights the needs for an Offline-Enabled mode. The paper also explores technical approaches beamed to enhance the user experience inWeb-based E-Learning, particular in Africa.
In the course of patient treatments, psychotherapists aim to meet the challenges of being both a trusted, knowledgeable conversation partner and a diligent documentalist. We are developing the digital whiteboard system Tele-Board MED (TBM), which allows the therapist to take digital notes during the session together with the patient. This study investigates what therapists are experiencing when they document with TBM in patient sessions for the first time and whether this documentation saves them time when writing official clinical documents. As the core of this study, we conducted four anamnesis session dialogues with behavior psychotherapists and volunteers acting in the role of patients. Following a mixed-method approach, the data collection and analysis involved self-reported emotion samples, user experience curves and questionnaires. We found that even in the very first patient session with TBM, therapists come to feel comfortable, develop a positive feeling and can concentrate on the patient. Regarding administrative documentation tasks, we found with the TBM report generation feature the therapists save 60% of the time they normally spend on writing case reports to the health insurance.
The classification of vulnerabilities is a fundamental step to derive formal attributes that allow a deeper analysis. Therefore, it is required that this classification has to be performed timely and accurate. Since the current situation demands a manual interaction in the classification process, the timely processing becomes a serious issue. Thus, we propose an automated alternative to the manual classification, because the amount of identified vulnerabilities per day cannot be processed manually anymore. We implemented two different approaches that are able to automatically classify vulnerabilities based on the vulnerability description. We evaluated our approaches, which use Neural Networks and the Naive Bayes methods respectively, on the base of publicly known vulnerabilities.
Business process simulation is an important means for quantitative analysis of a business process and to compare different process alternatives. With the Business Process Model and Notation (BPMN) being the state-of-the-art language for the graphical representation of business processes, many existing process simulators support already the simulation of BPMN diagrams. However, they do not provide well-defined interfaces to integrate new concepts in the simulation environment. In this work, we present the design and architecture of a proof-of-concept implementation of an open and extensible BPMN process simulator. It also supports the simulation of multiple BPMN processes at a time and relies on the building blocks of the well-founded discrete event simulation. The extensibility is assured by a plug-in concept. Its feasibility is demonstrated by extensions supporting new BPMN concepts, such as the simulation of business rule activities referencing decision models and batch activities.
In university teaching today, it is common practice to record regular lectures and special events such as conferences and speeches. With these recordings, a large fundus of video teaching material can be created quickly and easily. Typically, lectures have a length of about one and a half hours and usually take place once or twice a week based on the credit hours. Depending on the number of lectures and other events recorded, the number of recordings available is increasing rapidly, which means that an appropriate form of provisioning is essential for the students. This is usually done in the form of lecture video platforms. In this work, we have investigated how lecture video platforms and the contained knowledge can be improved and accessed more easily by an increasing number of students. We came up with a multistep process we have applied to our own lecture video web portal that can be applied to other solutions as well.
Embedded smart home — remote lab MOOC with optional real hardware experience for over 4000 students
(2018)
MOOCs (Massive Open Online Courses) become more and more popular for learners of all ages to study further or to learn new subjects of interest. The purpose of this paper is to introduce a different MOOC course style. Typically, video content is shown teaching the student new information. After watching a video, self-test questions can be answered. Finally, the student answers weekly exams and final exams like the self test questions. Out of the points that have been scored for weekly and final exams a certificate can be issued. Our approach extends the possibility to receive points for the final score with practical programming exercises on real hardware. It allows the student to do embedded programming by communicating over GPIO pins to control LEDs and measure sensor values. Additionally, they can visualize values on an embedded display using web technologies, which are an essential part of embedded and smart home devices to communicate with common APIs. Students have the opportunity to solve all tasks within the online remote lab and at home on the same kind of hardware. The evaluation of this MOOCs indicates the interesting design for students to learn an engineering technique with new technology approaches in an appropriate, modern, supporting and motivating way of teaching.
When students watch learning videos online, they usually need to watch several hours of video content. In the end, not every minute of a video is relevant for the exam. Additionally, students need to add notes to clarify issues of a lecture. There are several possibilities to enhance the metadata of a video, e.g. a typical way to add user-specific information to an online video is a comment functionality, which allows users to share their thoughts and questions with the public. In contrast to common video material which can be found online, lecture videos are used for exam preparation. Due to this difference, the idea comes up to annotate lecture videos with markers and personal notes for a better understanding of the taught content. Especially, students learning for an exam use their notes to refresh their memories. To ease this learning method with lecture videos, we introduce the annotation feature in our video lecture archive. This functionality supports the students with keeping track of their thoughts by providing an intuitive interface to easily add, modify or remove their ideas. This annotation function is integrated in the video player. Hence, scrolling to a separate annotation area on the website is not necessary. Furthermore, the annotated notes can be exported together with the slide content to a PDF file, which can then be printed easily. Lecture video annotations support and motivate students to learn and watch videos from an E-Learning video archive.
An efficient Design Space Exploration (DSE) is imperative for the design of modern, highly complex embedded systems in order to steer the development towards optimal design points. The early evaluation of design decisions at system-level abstraction layer helps to find promising regions for subsequent development steps in lower abstraction levels by diminishing the complexity of the search problem. In recent works, symbolic techniques, especially Answer Set Programming (ASP) modulo Theories (ASPmT), have been shown to find feasible solutions of highly complex system-level synthesis problems with non-linear constraints very efficiently. In this paper, we present a novel approach to a holistic system-level DSE based on ASPmT. To this end, we include additional background theories that concurrently guarantee compliance with hard constraints and perform the simultaneous optimization of several design objectives. We implement and compare our approach with a state-of-the-art preference handling framework for ASP. Experimental results indicate that our proposed method produces better solutions with respect to both diversity and convergence to the true Pareto front.
Operational decisions in business processes can be modeled by using the Decision Model and Notation (DMN). The complementary use of DMN for decision modeling and of the Business Process Model and Notation (BPMN) for process design realizes the separation of concerns principle. For supporting separation of concerns during the design phase, it is crucial to understand which aspects of decision-making enclosed in a process model should be captured by a dedicated decision model. Whereas existing work focuses on the extraction of decision models from process control flow, the connection of process-related data and decision models is still unexplored. In this paper, we investigate how process-related data used for making decisions can be represented in process models and we distinguish a set of BPMN patterns capturing such information. Then, we provide a formal mapping of the identified BPMN patterns to corresponding DMN models and apply our approach to a real-world healthcare process.
Modern server systems with large NUMA architectures necessitate (i) data being distributed over the available computing nodes and (ii) NUMA-aware query processing to enable effective parallel processing in database systems. As these architectures incur significant latency and throughout penalties for accessing non-local data, queries should be executed as close as possible to the data. To further increase both performance and efficiency, data that is not relevant for the query result should be skipped as early as possible. One way to achieve this goal is horizontal partitioning to improve static partition pruning. As part of our ongoing work on workload-driven partitioning, we have implemented a recent approach called aggressive data skipping and extended it to handle both analytical as well as transactional access patterns. In this paper, we evaluate this approach with the workload and data of a production enterprise system of a Global 2000 company. The results show that over 80% of all tuples can be skipped in average while the resulting partitioning schemata are surprisingly stable over time.
Logical modeling has been widely used to understand and expand the knowledge about protein interactions among different pathways. Realizing this, the caspo-ts system has been proposed recently to learn logical models from time series data. It uses Answer Set Programming to enumerate Boolean Networks (BNs) given prior knowledge networks and phosphoproteomic time series data. In the resulting sequence of solutions, similar BNs are typically clustered together. This can be problematic for large scale problems where we cannot explore the whole solution space in reasonable time. Our approach extends the caspo-ts system to cope with the important use case of finding diverse solutions of a problem with a large number of solutions. We first present the algorithm for finding diverse solutions and then we demonstrate the results of the proposed approach on two different benchmark scenarios in systems biology: (1) an artificial dataset to model TCR signaling and (2) the HPN-DREAM challenge dataset to model breast cancer cell lines.
Metamaterial Devices
(2018)
In our hands-on demonstration, we show several objects, the functionality of which is defined by the objects' internal micro-structure. Such metamaterial machines can (1) be mechanisms based on their microstructures, (2) employ simple mechanical computation, or (3) change their outside to interact with their environment. They are 3D printed from one piece and we support their creating by providing interactive software tools.
Cloud storage brokerage is an abstraction aimed at providing value-added services. However, Cloud Service Brokers are challenged by several security issues including enlarged attack surfaces due to integration of disparate components and API interoperability issues. Therefore, appropriate security risk assessment methods are required to identify and evaluate these security issues, and examine the efficiency of countermeasures. A possible approach for satisfying these requirements is employment of threat modeling concepts, which have been successfully applied in traditional paradigms. In this work, we employ threat models including attack trees, attack graphs and Data Flow Diagrams against a Cloud Service Broker (CloudRAID) and analyze these security threats and risks. Furthermore, we propose an innovative technique for combining Common Vulnerability Scoring System (CVSS) and Common Configuration Scoring System (CCSS) base scores in probabilistic attack graphs to cater for configuration-based vulnerabilities which are typically leveraged for attacking cloud storage systems. This approach is necessary since existing schemes do not provide sufficient security metrics, which are imperatives for comprehensive risk assessments. We demonstrate the efficiency of our proposal by devising CCSS base scores for two common attacks against cloud storage: Cloud Storage Enumeration Attack and Cloud Storage Exploitation Attack. These metrics are then used in Attack Graph Metric-based risk assessment. Our experimental evaluation shows that our approach caters for the aforementioned gaps and provides efficient security hardening options. Therefore, our proposals can be employed to improve cloud security.
The problem of constructing and maintaining a tree topology in a distributed manner is a challenging task in WSNs. This is because the nodes have limited computational and memory resources and the network changes over time. We propose the Dynamic Gallager-Humblet-Spira (D-GHS) algorithm that builds and maintains a minimum spanning tree. To do so, we divide D-GHS into four phases, namely neighbor discovery, tree construction, data collection, and tree maintenance. In the neighbor discovery phase, the nodes collect information about their neighbors and the link quality. In the tree construction, D-GHS finds the minimum spanning tree by executing the Gallager-Humblet-Spira algorithm. In the data collection phase, the sink roots the minimum spanning tree at itself, and each node sends data packets. In the tree maintenance phase, the nodes repair the tree when communication failures occur. The emulation results show that D-GHS reduces the number of control messages and the energy consumption, at the cost of a slight increase in memory size and convergence time.
An energy consumption model for multiModal wireless sensor networks based on wake-up radio receivers
(2018)
Energy consumption is a major concern in Wireless Sensor Networks. A significant waste of energy occurs due to the idle listening and overhearing problems, which are typically avoided by turning off the radio, while no transmission is ongoing. The classical approach for allowing the reception of messages in such situations is to use a low-duty-cycle protocol, and to turn on the radio periodically, which reduces the idle listening problem, but requires timers and usually unnecessary wakeups. A better solution is to turn on the radio only on demand by using a Wake-up Radio Receiver (WuRx). In this paper, an energy model is presented to estimate the energy saving in various multi-hop network topologies under several use cases, when a WuRx is used instead of a classical low-duty-cycling protocol. The presented model also allows for estimating the benefit of various WuRx properties like using addressing or not.
Scrum2kanban
(2018)
Using university capstone courses to teach agile software development methodologies has become commonplace, as agile methods have gained support in professional software development. This usually means students are introduced to and work with the currently most popular agile methodology: Scrum. However, as the agile methods employed in the industry change and are adapted to different contexts, university courses must follow suit. A prime example of this is the Kanban method, which has recently gathered attention in the industry. In this paper, we describe a capstone course design, which adds the hands-on learning of the lean principles advocated by Kanban into a capstone project run with Scrum. This both ensures that students are aware of recent process frameworks and ideas as well as gain a more thorough overview of how agile methods can be employed in practice. We describe the details of the course and analyze the participating students' perceptions as well as our observations. We analyze the development artifacts, created by students during the course in respect to the two different development methodologies. We further present a summary of the lessons learned as well as recommendations for future similar courses. The survey conducted at the end of the course revealed an overwhelmingly positive attitude of students towards the integration of Kanban into the course.
802.15.4 security protects against the replay, injection, and eavesdropping of 802.15.4 frames. A core concept of 802.15.4 security is the use of frame counters for both nonce generation and anti-replay protection. While being functional, frame counters (i) cause an increased energy consumption as they incur a per-frame overhead of 4 bytes and (ii) only provide sequential freshness. The Last Bits (LB) optimization does reduce the per-frame overhead of frame counters, yet at the cost of an increased RAM consumption and occasional energy-and time-consuming resynchronization actions. Alternatively, the timeslotted channel hopping (TSCH) media access control (MAC) protocol of 802.15.4 avoids the drawbacks of frame counters by replacing them with timeslot indices, but findings of Yang et al. question the security of TSCH in general. In this paper, we assume the use of ContikiMAC, which is a popular asynchronous MAC protocol for 802.15.4 networks. Under this assumption, we propose an Intra-Layer Optimization for 802.15.4 Security (ILOS), which intertwines 802.15.4 security and ContikiMAC. In effect, ILOS reduces the security-related per-frame overhead even more than the LB optimization, as well as achieves strong freshness. Furthermore, unlike the LB optimization, ILOS neither incurs an increased RAM consumption nor requires resynchronization actions. Beyond that, ILOS integrates with and advances other security supplements to ContikiMAC. We implemented ILOS using OpenMotes and the Contiki operating system.
CurEx
(2018)
The integration of diverse structured and unstructured information sources into a unified, domain-specific knowledge base is an important task in many areas. A well-maintained knowledge base enables data analysis in complex scenarios, such as risk analysis in the financial sector or investigating large data leaks, such as the Paradise or Panama papers. Both the creation of such knowledge bases, as well as their continuous maintenance and curation involves many complex tasks and considerable manual effort. With CurEx, we present a modular system that allows structured and unstructured data sources to be integrated into a domain-specific knowledge base. In particular, we (i) enable the incremental improvement of each individual integration component; (ii) enable the selective generation of multiple knowledge graphs from the information contained in the knowledge base; and (iii) provide two distinct user interfaces tailored to the needs of data engineers and end-users respectively. The former has curation capabilities and controls the integration process, whereas the latter focuses on the exploration of the generated knowledge graph.
Beacon in the Dark
(2018)
The large amount of heterogeneous data in these email corpora renders experts' investigations by hand infeasible. Auditors or journalists, e.g., who are looking for irregular or inappropriate content or suspicious patterns, are in desperate need for computer-aided exploration tools to support their investigations.
We present our Beacon system for the exploration of such corpora at different levels of detail. A distributed processing pipeline combines text mining methods and social network analysis to augment the already semi-structured nature of emails. The user interface ties into the resulting cleaned and enriched dataset. For the interface design we identify three objectives expert users have: gain an initial overview of the data to identify leads to investigate, understand the context of the information at hand, and have meaningful filters to iteratively focus onto a subset of emails. To this end we make use of interactive visualisations based on rearranged and aggregated extracted information to reveal salient patterns.
The detection of all inclusion dependencies (INDs) in an unknown dataset is at the core of any data profiling effort. Apart from the discovery of foreign key relationships, INDs can help perform data integration, integrity checking, schema (re-)design, and query optimization. With the advent of Big Data, the demand increases for efficient INDs discovery algorithms that can scale with the input data size. To this end, we propose S-INDD++ as a scalable system for detecting unary INDs in large datasets. S-INDD++ applies a new stepwise partitioning technique that helps discard a large number of attributes in early phases of the detection by processing the first partitions of smaller sizes. S-INDD++ also extends the concept of the attribute clustering to decide which attributes to be discarded based on the clustering result of each partition. Moreover, in contrast to the state-of-the-art, S-INDD++ does not require the partition to fit into the main memory-which is a highly appreciable property in the face of the ever growing datasets. We conducted an exhaustive evaluation of S-INDD++ by applying it to large datasets with thousands attributes and more than 266 million tuples. The results show the high superiority of S-INDD++ over the state-of-the-art. S-INDD++ reduced up to 50 % of the runtime in comparison with BINDER, and up to 98 % in comparison with S-INDD.
One particular challenge in the Internet of Things is the management of many heterogeneous things. The things are typically constrained devices with limited memory, power, network and processing capacity. Configuring every device manually is a tedious task. We propose an interoperable way to configure an IoT network automatically using existing standards. The proposed NETCONF-MQTT bridge intermediates between the constrained devices (speaking MQTT) and the network management standard NETCONF. The NETCONF-MQTT bridge generates dynamically YANG data models from the semantic description of the device capabilities based on the oneM2M ontology. We evaluate the approach for two use cases, i.e. describing an actuator and a sensor scenario.
Live migration is an important feature in modern software-defined datacenters and cloud computing environments. Dynamic resource management, load balance, power saving and fault tolerance are all dependent on the live migration feature. Despite the importance of live migration, the cost of live migration cannot be ignored and may result in service availability degradation. Live migration cost includes the migration time, downtime, CPU overhead, network and power consumption. There are many research articles that discuss the problem of live migration cost with different scopes like analyzing the cost and relate it to the parameters that control it, proposing new migration algorithms that minimize the cost and also predicting the migration cost. For the best of our knowledge, most of the papers that discuss the migration cost problem focus on open source hypervisors. For the research articles focus on VMware environments, none of the published articles proposed migration time, network overhead and power consumption modeling for single and multiple VMs live migration. In this paper, we propose empirical models for the live migration time, network overhead and power consumption for single and multiple VMs migration. The proposed models are obtained using a VMware based testbed.
For the last ten years, almost every theoretical result concerning the expected run time of a randomized search heuristic used drift theory, making it the arguably most important tool in this domain. Its success is due to its ease of use and its powerful result: drift theory allows the user to derive bounds on the expected first-hitting time of a random process by bounding expected local changes of the process - the drift. This is usually far easier than bounding the expected first-hitting time directly. Due to the widespread use of drift theory, it is of utmost importance to have the best drift theorems possible. We improve the fundamental additive, multiplicative, and variable drift theorems by stating them in a form as general as possible and providing examples of why the restrictions we keep are still necessary. Our additive drift theorem for upper bounds only requires the process to be nonnegative, that is, we remove unnecessary restrictions like a finite, discrete, or bounded search space. As corollaries, the same is true for our upper bounds in the case of variable and multiplicative drift.
One of the most important aspects of a randomized algorithm is bounding its expected run time on various problems. Formally speaking, this means bounding the expected first-hitting time of a random process. The two arguably most popular tools to do so are the fitness level method and drift theory. The fitness level method considers arbitrary transition probabilities but only allows the process to move toward the goal. On the other hand, drift theory allows the process to move into any direction as long as it move closer to the goal in expectation; however, this tendency has to be monotone and, thus, the transition probabilities cannot be arbitrary. We provide a result that combines the benefit of these two approaches: our result gives a lower and an upper bound for the expected first-hitting time of a random process over {0,..., n} that is allowed to move forward and backward by 1 and can use arbitrary transition probabilities. In case that the transition probabilities are known, our bounds coincide and yield the exact value of the expected first-hitting time. Further, we also state the stationary distribution as well as the mixing time of a special case of our scenario.
For theoretical analyses there are two specifics distinguishing GP from many other areas of evolutionary computation. First, the variable size representations, in particular yielding a possible bloat (i.e. the growth of individuals with redundant parts). Second, the role and realization of crossover, which is particularly central in GP due to the tree-based representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover had a surprisingly little share in this work. We analyze a simple crossover operator in combination with local search, where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); the resulting algorithm is denoted Concatenation Crossover GP. For this purpose three variants of the wellstudied Majority test function with large plateaus are considered. We show that the Concatenation Crossover GP can efficiently optimize these test functions, while local search cannot be efficient for all three variants independent of employing bloat control.
High-throughput RNA sequencing (RNAseq) produces large data sets containing expression levels of thousands of genes. The analysis of RNAseq data leads to a better understanding of gene functions and interactions, which eventually helps to study diseases like cancer and develop effective treatments. Large-scale RNAseq expression studies on cancer comprise samples from multiple cancer types and aim to identify their distinct molecular characteristics. Analyzing samples from different cancer types implies analyzing samples from different tissue origin. Such multi-tissue RNAseq data sets require a meaningful analysis that accounts for the inherent tissue-related bias: The identified characteristics must not originate from the differences in tissue types, but from the actual differences in cancer types. However, current analysis procedures do not incorporate that aspect. As a result, we propose to integrate a tissue-awareness into the analysis of multi-tissue RNAseq data. We introduce an extension for gene selection that provides a tissue-wise context for every gene and can be flexibly combined with any existing gene selection approach. We suggest to expand conventional evaluation by additional metrics that are sensitive to the tissue-related bias. Evaluations show that especially low complexity gene selection approaches profit from introducing tissue-awareness.
User-generated content on social media platforms is a rich source of latent information about individual variables. Crawling and analyzing this content provides a new approach for enterprises to personalize services and put forward product recommendations. In the past few years, brands made a gradual appearance on social media platforms for advertisement, customers support and public relation purposes and by now it became a necessity throughout all branches. This online identity can be represented as a brand personality that reflects how a brand is perceived by its customers. We exploited recent research in text analysis and personality detection to build an automatic brand personality prediction model on top of the (Five-Factor Model) and (Linguistic Inquiry and Word Count) features extracted from publicly available benchmarks. The proposed model reported significant accuracy in predicting specific personality traits form brands. For evaluating our prediction results on actual brands, we crawled the Facebook API for 100k posts from the most valuable brands' pages in the USA and we visualize exemplars of comparison results and present suggestions for future directions.
This paper investigates the applicability of CMOS decoupling cells for mitigating the Single Event Transient (SET) effects in standard combinational gates. The concept is based on the insertion of two decoupling cells between the gate's output and the power/ground terminals. To verify the proposed hardening approach, extensive SPICE simulations have been performed with standard combinational cells designed in IHP's 130 nm bulk CMOS technology. Obtained simulation results have shown that the insertion of decoupling cells results in the increase of the gate's critical charge, thus reducing the gate's soft error rate (SER). Moreover, the decoupling cells facilitate the suppression of SET pulses propagating through the gate. It has been shown that the decoupling cells may be a competitive alternative to gate upsizing and gate duplication for hardening the gates with lower critical charge and multiple (3 or 4) inputs, as well as for filtering the short SET pulses induced by low-LET particles.
Studies indicate that reliable access to power is an important enabler for economic growth. To this end, modern energy management systems have seen a shift from reliance on time-consuming manual procedures , to highly automated management , with current energy provisioning systems being run as cyber-physical systems . Operating energy grids as a cyber-physical system offers the advantage of increased reliability and dependability , but also raises issues of security and privacy. In this chapter, we provide an overview of the contents of this book showing the interrelation between the topics of the chapters in terms of smart energy provisioning. We begin by discussing the concept of smart-grids in general, proceeding to narrow our focus to smart micro-grids in particular. Lossy networks also provide an interesting framework for enabling the implementation of smart micro-grids in remote/rural areas, where deploying standard smart grids is economically and structurally infeasible. To this end, we consider an architectural design for a smart micro-grid suited to low-processing capable devices. We model malicious behaviour, and propose mitigation measures based properties to distinguish normal from malicious behaviour .
Monitoring is a key prerequisite for self-adaptive software and many other forms of operating software. Monitoring relevant lower level phenomena like the occurrences of exceptions and diagnosis data requires to carefully examine which detailed information is really necessary and feasible to monitor. Adaptive monitoring permits observing a greater variety of details with less overhead, if most of the time the MAPE-K loop can operate using only a small subset of all those details. However, engineering such an adaptive monitoring is a major engineering effort on its own that further complicates the development of self-adaptive software. The proposed approach overcomes the outlined problems by providing generic adaptive monitoring via runtime models. It reduces the effort to introduce and apply adaptive monitoring by avoiding additional development effort for controlling the monitoring adaptation. Although the generic approach is independent from the monitoring purpose, it still allows for substantial savings regarding the monitoring resource consumption as demonstrated by an example.
Modern routing algorithms reduce query time by depending heavily on preprocessed data. The recently developed Navigation Data Standard (NDS) enforces a separation between algorithms and map data, rendering preprocessing inapplicable. Furthermore, map data is partitioned into tiles with respect to their geographic coordinates. With the limited memory found in portable devices, the number of tiles loaded becomes the major factor for run time. We study routing under these restrictions and present new algorithms as well as empirical evaluations. Our results show that, on average, the most efficient algorithm presented uses more than 20 times fewer tile loads than a normal A*.
Minimising Information Loss on Anonymised High Dimensional Data with Greedy In-Memory Processing
(2018)
Minimising information loss on anonymised high dimensional data is important for data utility. Syntactic data anonymisation algorithms address this issue by generating datasets that are neither use-case specific nor dependent on runtime specifications. This results in anonymised datasets that can be re-used in different scenarios which is performance efficient. However, syntactic data anonymisation algorithms incur high information loss on high dimensional data, making the data unusable for analytics. In this paper, we propose an optimised exact quasi-identifier identification scheme, based on the notion of k-anonymity, to generate anonymised high dimensional datasets efficiently, and with low information loss. The optimised exact quasi-identifier identification scheme works by identifying and eliminating maximal partial unique column combination (mpUCC) attributes that endanger anonymity. By using in-memory processing to handle the attribute selection procedure, we significantly reduce the processing time required. We evaluated the effectiveness of our proposed approach with an enriched dataset drawn from multiple real-world data sources, and augmented with synthetic values generated in close alignment with the real-world data distributions. Our results indicate that in-memory processing drops attribute selection time for the mpUCC candidates from 400s to 100s, while significantly reducing information loss. In addition, we achieve a time complexity speed-up of O(3(n/3)) approximate to O(1.4422(n)).
We analyze the problem of response suggestion in a closed domain along a real-world scenario of a digital library. We present a text-processing pipeline to generate question-answer pairs from chat transcripts. On this limited amount of training data, we compare retrieval-based, conditioned-generation, and dedicated representation learning approaches for response suggestion. Our results show that retrieval-based methods that strive to find similar, known contexts are preferable over parametric approaches from the conditioned-generation family, when the training data is limited. We, however, identify a specific representation learning approach that is competitive to the retrieval-based approaches despite the training data limitation.
PlAnalyzer
(2018)
In this work we propose PIAnalyzer, a novel approach to analyze PendingIntent related vulnerabilities. We empirically evaluate PIAnalyzer on a set of 1000 randomly selected applications from the Google Play Store and find 1358 insecure usages of Pendinglntents, including 70 severe vulnerabilities. We manually inspected ten reported vulnerabilities out of which nine correctly reported vulnerabilities, indicating a high precision. The evaluation shows that PIAnalyzer is efficient with an average execution time of 13 seconds per application.
Currently we are witnessing profound changes in the geospatial domain. Driven by recent ICT developments, such as web services, serviceoriented computing or open-source software, an explosion of geodata and geospatial applications or rapidly growing communities of non-specialist users, the crucial issue is the provision and integration of geospatial intelligence in these rapidly changing, heterogeneous developments. This paper introduces the concept of Servicification into geospatial data processing. Its core idea is the provision of expertise through a flexible number of web-based software service modules. Selection and linkage of these services to user profiles, application tasks, data resources, or additional software allow for the compilation of flexible, time-sensitive geospatial data handling processes. Encapsulated in a string of discrete services, the approach presented here aims to provide non-specialist users with geospatial expertise required for the effective, professional solution of a defined application problem. Providing users with geospatial intelligence in the form of web-based, modular services, is a completely different approach to geospatial data processing. This novel concept puts geospatial intelligence, made available through services encapsulating rule bases and algorithms, in the centre and at the disposal of the users, regardless of their expertise.
Recently blockchain technology has been introduced to execute interacting business processes in a secure and transparent way. While the foundations for process enactment on blockchain have been researched, the execution of decisions on blockchain has not been addressed yet. In this paper we argue that decisions are an essential aspect of interacting business processes, and, therefore, also need to be executed on blockchain. The immutable representation of decision logic can be used by the interacting processes, so that decision taking will be more secure, more transparent, and better auditable. The approach is based on a mapping of the DMN language S-FEEL to Solidity code to be run on the Ethereum blockchain. The work is evaluated by a proof-of-concept prototype and an empirical cost evaluation.
OpenLL
(2018)
Today's rendering APIs lack robust functionality and capabilities for dynamic, real-time text rendering and labeling, which represent key requirements for 3D application design in many fields. As a consequence, most rendering systems are barely or not at all equipped with respective capabilities. This paper drafts the unified text rendering and labeling API OpenLL intended to complement common rendering APIs, frameworks, and transmission formats. For it, various uses of static and dynamic placement of labels are showcased and a text interaction technique is presented. Furthermore, API design constraints with respect to state-of-the-art text rendering techniques are discussed. This contribution is intended to initiate a community-driven specification of a free and open label library.
The emergence of cloud computing allows users to easily host their Virtual Machines with no up-front investment and the guarantee of always available anytime anywhere. But with the Virtual Machine (VM) is hosted outside of user's premise, the user loses the physical control of the VM as it could be running on untrusted host machines in the cloud. Malicious host administrator could launch live memory dumping, Spectre, or Meltdown attacks in order to extract sensitive information from the VM's memory, e.g. passwords or cryptographic keys of applications running in the VM. In this paper, inspired by the moving target defense (MTD) scheme, we propose a novel approach to increase the security of application's sensitive data in the VM by continuously moving the sensitive data among several memory allocations (blocks) in Random Access Memory (RAM). A movement function is added into the application source code in order for the function to be running concurrently with the application's main function. Our approach could reduce the possibility of VM's sensitive data in the memory to be leaked into memory dump file by 2 5% and secure the sensitive data from Spectre and Meltdown attacks. Our approach's overhead depends on the number and the size of the sensitive data.
Comparative text mining extends from genre analysis and political bias detection to the revelation of cultural and geographic differences, through to the search for prior art across patents and scientific papers. These applications use cross-collection topic modeling for the exploration, clustering, and comparison of large sets of documents, such as digital libraries. However, topic modeling on documents from different collections is challenging because of domain-specific vocabulary. We present a cross-collection topic model combined with automatic domain term extraction and phrase segmentation. This model distinguishes collection-specific and collection-independent words based on information entropy and reveals commonalities and differences of multiple text collections. We evaluate our model on patents, scientific papers, newspaper articles, forum posts, and Wikipedia articles. In comparison to state-of-the-art cross-collection topic modeling, our model achieves up to 13% higher topic coherence, up to 4% lower perplexity, and up to 31% higher document classification accuracy. More importantly, our approach is the first topic model that ensures disjunct general and specific word distributions, resulting in clear-cut topic representations.
An Information System Supporting the Eliciting of Expert Knowledge for Successful IT Projects
(2018)
In order to guarantee the success of an IT project, it is necessary for a company to possess expert knowledge. The difficulty arises when experts no longer work for the company and it then becomes necessary to use their knowledge, in order to realise an IT project. In this paper, the ExKnowIT information system which supports the eliciting of expert knowledge for successful IT projects, is presented and consists of the following modules: (1) the identification of experts for successful IT projects, (2) the eliciting of expert knowledge on completed IT projects, (3) the expert knowledge base on completed IT projects, (4) the Group Method for Data Handling (GMDH) algorithm, (5) new knowledge in support of decisions regarding the selection of a manager for a new IT project. The added value of our system is that these three approaches, namely, the elicitation of expert knowledge, the success of an IT project and the discovery of new knowledge, gleaned from the expert knowledge base, otherwise known as the decision model, complement each other.
The relentless improvement of silicon photonics is making optical interconnects and networks appealing for use in miniaturized systems, where electrical interconnects cannot keep up with the growing levels of core integration due to bandwidth density and power efficiency limitations. At the same time, solutions such as 3D stacking or 2.5D integration open the door to a fully dedicated process optimization for the photonic die. However, an architecture-level integration challenge arises between the electronic network and the optical one in such tightly-integrated parallel systems. It consists of adapting signaling rates, matching the different levels of communication parallelism, handling cross-domain flow control, addressing re-synchronization concerns, and avoiding protocol-dependent deadlock. The associated energy and performance overhead may offset the inherent benefits of the emerging technology itself. This paper explores a hybrid CMOS-ECL bridge architecture between 3D-stacked technology-heterogeneous networks-on-chip (NoCs). The different ways of overcoming the serialization challenge (i.e., through an improvement of the signaling rate and/or through space-/wavelength division multiplexing options) give rise to a configuration space that the paper explores, in search for the most energy-efficient configuration for high-performance.
Mobile expressive rendering gained increasing popularity among users seeking casual creativity by image stylization and supports the development of mobile artists as a new user group. In particular, neural style transfer has advanced as a core technology to emulate characteristics of manifold artistic styles. However, when it comes to creative expression, the technology still faces inherent limitations in providing low-level controls for localized image stylization. This work enhances state-of-the-art neural style transfer techniques by a generalized user interface with interactive tools to facilitate a creative and localized editing process. Thereby, we first propose a problem characterization representing trade-offs between visual quality, run-time performance, and user control. We then present MaeSTrO, a mobile app for orchestration of neural style transfer techniques using iterative, multi-style generative and adaptive neural networks that can be locally controlled by on-screen painting metaphors. At this, first user tests indicate different levels of satisfaction for the implemented techniques and interaction design.
Microservice Architectures (MSA) structure applications as a collection of loosely coupled services that implement business capabilities. The key advantages of MSA include inherent support for continuous deployment of large complex applications, agility and enhanced productivity. However, studies indicate that most MSA are homogeneous, and introduce shared vulnerabilites, thus vulnerable to multi-step attacks, which are economics-of-scale incentives to attackers. In this paper, we address the issue of shared vulnerabilities in microservices with a novel solution based on the concept of Moving Target Defenses (MTD). Our mechanism works by performing risk analysis against microservices to detect and prioritize vulnerabilities. Thereafter, security risk-oriented software diversification is employed, guided by a defined diversification index. The diversification is performed at runtime, leveraging both model and template based automatic code generation techniques to automatically transform programming languages and container images of the microservices. Consequently, the microservices attack surfaces are altered thereby introducing uncertainty for attackers while reducing the attackability of the microservices. Our experiments demonstrate the efficiency of our solution, with an average success rate of over 70% attack surface randomization.
Unified logging system for monitoring multiple cloud storage providers in cloud storage broker
(2018)
With the increasing demand for personal and enterprise data storage service, Cloud Storage Broker (CSB) provides cloud storage service using multiple Cloud Service Providers (CSPs) with guaranteed Quality of Service (QoS), such as data availability and security. However monitoring cloud storage usage in multiple CSPs has become a challenge for CSB due to lack of standardized logging format for cloud services that causes each CSP to implement its own format. In this paper we propose a unified logging system that can be used by CSB to monitor cloud storage usage across multiple CSPs. We gather cloud storage log files from three different CSPs and normalise these into our proposed log format that can be used for further analysis process. We show that our work enables a coherent view suitable for data navigation, monitoring, and analytics.
CSBAuditor
(2018)
Cloud Storage Brokers (CSB) provide seamless and concurrent access to multiple Cloud Storage Services (CSS) while abstracting cloud complexities from end-users. However, this multi-cloud strategy faces several security challenges including enlarged attack surfaces, malicious insider threats, security complexities due to integration of disparate components and API interoperability issues. Novel security approaches are imperative to tackle these security issues. Therefore, this paper proposes CSBAuditor, a novel cloud security system that continuously audits CSB resources, to detect malicious activities and unauthorized changes e.g. bucket policy misconfigurations, and remediates these anomalies. The cloud state is maintained via a continuous snapshotting mechanism thereby ensuring fault tolerance. We adopt the principles of chaos engineering by integrating Broker Monkey, a component that continuously injects failure into our reference CSB system, Cloud RAID. Hence, CSBAuditor is continuously tested for efficiency i.e. its ability to detect the changes injected by Broker Monkey. CSBAuditor employs security metrics for risk analysis by computing severity scores for detected vulnerabilities using the Common Configuration Scoring System, thereby overcoming the limitation of insufficient security metrics in existing cloud auditing schemes. CSBAuditor has been tested using various strategies including chaos engineering failure injection strategies. Our experimental evaluation validates the efficiency of our approach against the aforementioned security issues with a detection and recovery rate of over 96 %.
ASEDS
(2018)
The Massive adoption of social media has provided new ways for individuals to express their opinion and emotion online. In 2016, Facebook introduced a new reactions feature that allows users to express their psychological emotions regarding published contents using so-called Facebook reactions. In this paper, a framework for predicting the distribution of Facebook post reactions is presented. For this purpose, we collected an enormous amount of Facebook posts associated with their reactions labels using the proposed scalable Facebook crawler. The training process utilizes 3 million labeled posts for more than 64,000 unique Facebook pages from diverse categories. The evaluation on standard benchmarks using the proposed features shows promising results compared to previous research. The final model is able to predict the reaction distribution on Facebook posts with a recall score of 0.90 for "Joy" emotion.
Leveraging spatio-temporal soccer data to define a graphical query language for game recordings
(2019)
For professional soccer clubs, performance and video analysis are an integral part of the preparation and post-processing of games. Coaches, scouts, and video analysts extract information about strengths and weaknesses of their team as well as opponents by manually analyzing video recordings of past games. Since video recordings are an unstructured data source, it is a complex and time-intensive task to find specific game situations and identify similar patterns. In this paper, we present a novel approach to detect patterns and situations (e.g., playmaking and ball passing of midfielders) based on trajectory data. The application uses the metaphor of a tactic board to offer a graphical query language. With this interactive tactic board, the user can model a game situation or mark a specific situation in the video recording for which all matching occurrences in various games are immediately displayed, and the user can directly jump to the corresponding game scene. Through the additional visualization of key performance indicators (e.g.,the physical load of the players), the user can get a better overall assessment of situations. With the capabilities to find specific game situations and complex patterns in video recordings, the interactive tactic board serves as a useful tool to improve the video analysis process of professional sports teams.
Rapid advances in location-acquisition technologies have led to large amounts of trajectory data. This data is the foundation for a broad spectrum of services driven and improved by trajectory data mining. However, for hybrid transactional and analytical workloads, the storing and processing of rapidly accumulated trajectory data is a non-trivial task. In this paper, we present a detailed survey about state-of-the-art trajectory data management systems. To determine the relevant aspects and requirements for such systems, we developed a trajectory data mining framework, which summarizes the different steps in the trajectory data mining process. Based on the derived requirements, we analyze different concepts to store, compress, index, and process spatio-temporal data. There are various trajectory management systems, which are optimized for scalability, data footprint reduction, elasticity, or query performance. To get a comprehensive overview, we describe and compare different exciting systems. Additionally, the observed similarities in the general structure of different systems are consolidated in a general blueprint of trajectory management systems.
What Stays in Mind?
(2018)