TY - JOUR
A1 - Peng, Junjie
A1 - Liu, Danxu
A1 - Wang, Yingtao
A1 - Zeng, Ying
A1 - Cheng, Feng
A1 - Zhang, Wenqiang
T1 - Weight-based strategy for an I/O-intensive application at a cloud data center
JF - Concurrency and computation : practice & experience
N2 - Applications with different characteristics in the cloud may have different resource preferences. However, traditional resource allocation and scheduling strategies rarely take the characteristics of applications into account. I/O-intensive applications are a typical application type, and frequent I/O accesses, especially random disk accesses to small files, may lead to inefficient use of resources and reduce the quality of service (QoS) of applications. A weight allocation strategy is therefore proposed based on the available resources that a physical server can provide as well as the characteristics of the applications. Using the weight obtained, a resource allocation and scheduling strategy is presented based on the specific application characteristics in the data center. Extensive experiments show that the strategy is correct and can guarantee a high concurrency of I/O per second (IOPS) in a cloud data center with high QoS. Additionally, the strategy can efficiently improve the utilization of the disk and resources of the data center without affecting the service quality of applications.
KW - IOPS
KW - process scheduling
KW - random I/O
KW - small files
KW - weight
Y1 - 2018
U6 - https://doi.org/10.1002/cpe.4648
SN - 1532-0626
SN - 1532-0634
VL - 30
IS - 19
PB - Wiley
CY - Hoboken
ER -
TY - JOUR
A1 - Schaub, Torsten H.
A1 - Woltran, Stefan
T1 - Answer set programming unleashed!
JF - Künstliche Intelligenz
N2 - Answer Set Programming is enjoying increasing popularity for problem solving in various domains. While its modeling language allows us to express many complex problems in an easy way, its solving technology enables their effective resolution. In what follows, we detail some of the key factors of its success. Answer Set Programming [ASP; Brewka et al. Commun ACM 54(12):92–103, (2011)] is seeing a rapid proliferation in academia and industry due to its easy and flexible way to model and solve knowledge-intense combinatorial (optimization) problems. To this end, ASP offers a high-level modeling language paired with high-performance solving technology. As a result, ASP systems provide out-of-the-box, general-purpose search engines that allow for enumerating (optimal) solutions. They are represented as answer sets, each being a set of atoms representing a solution. The declarative approach of ASP allows a user to concentrate on a problem’s specification rather than the computational means to solve it. This makes ASP a prime candidate for rapid prototyping and an attractive tool for teaching key AI techniques since complex problems can be expressed in a succinct and elaboration-tolerant way. This is eased by the tuning of ASP’s modeling language to knowledge representation and reasoning (KRR). The resulting impact is nicely reflected by a growing range of successful applications of ASP [Erdem et al. AI Mag 37(3):53–68, 2016; Falkner et al. Industrial applications of answer set programming. Künstliche Intelligenz (2018)].
Y1 - 2018
U6 - https://doi.org/10.1007/s13218-018-0550-z
SN - 0933-1875
SN - 1610-1987
VL - 32
IS - 2-3
SP - 105
EP - 108
PB - Springer
CY - Heidelberg
ER -
TY - GEN
A1 - Schaub, Torsten H.
A1 - Woltran, Stefan
T1 - Special issue on answer set programming
T2 - Künstliche Intelligenz
Y1 - 2018
U6 - https://doi.org/10.1007/s13218-018-0554-8
SN - 0933-1875
SN - 1610-1987
VL - 32
IS - 2-3
SP - 101
EP - 103
PB - Springer
CY - Heidelberg
ER -
TY - JOUR
A1 - Schäfer, Robin
A1 - Stede, Manfred
T1 - Argument mining on twitter
BT - a survey
JF - Information technology : it ; Methoden und innovative Anwendungen der Informatik und Informationstechnik ; Organ der Fachbereiche 3 und 4 der GI e.V. und des Fachbereichs 6 der ITG
N2 - In the last decade, the field of argument mining has grown notably. However, only relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide the, to our knowledge, first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the structure of arguments in the context of tweet corpus annotation, and we review current progress in the task of detecting argument components and their relations in tweets. We also survey the intersection of argument mining and stance detection, before we conclude with an outlook.
KW - Argument Mining
KW - Twitter
KW - Stance Detection
Y1 - 2021
U6 - https://doi.org/10.1515/itit-2020-0053
SN - 1611-2776
SN - 2196-7032
VL - 63
IS - 1
SP - 45
EP - 58
PB - De Gruyter
CY - Berlin
ER -
TY - JOUR
A1 - Ayzel, Georgy
A1 - Heistermann, Maik
T1 - The effect of calibration data length on the performance of a conceptual hydrological model versus LSTM and GRU
BT - a case study for six basins from the CAMELS dataset
JF - Computers & geosciences : an international journal devoted to the publication of papers on all aspects of geocomputation and to the distribution of computer programs and test data sets ; an official journal of the International Association for Mathematical Geology
N2 - We systematically explore the effect of calibration data length on the performance of a conceptual hydrological model, GR4H, in comparison to two Artificial Neural Network (ANN) architectures: Long Short-Term Memory Networks (LSTM) and Gated Recurrent Units (GRU), which have just recently been introduced to the field of hydrology. We implemented a case study for six river basins across the contiguous United States, with 25 years of meteorological and discharge data. Nine years were reserved for independent validation; two years were used as a warm-up period, one year for each of the calibration and validation periods, respectively; from the remaining 14 years, we sampled increasing amounts of data for model calibration, and found pronounced differences in model performance. While GR4H required less data to converge, LSTM and GRU caught up at a remarkable rate, considering their number of parameters. Also, LSTM and GRU exhibited higher calibration instability in comparison to GR4H. These findings confirm the potential of modern deep-learning architectures in rainfall-runoff modelling, but also highlight the noticeable differences between them in regard to the effect of calibration data length.
KW - Artificial neural networks
KW - Calibration
KW - Deep learning
KW - Rainfall-runoff modelling
Y1 - 2021
U6 - https://doi.org/10.1016/j.cageo.2021.104708
SN - 0098-3004
SN - 1873-7803
VL - 149
PB - Elsevier
CY - Amsterdam
ER -
TY - JOUR
A1 - Kossmann, Jan
A1 - Halfpap, Stefan
A1 - Jankrift, Marcel
A1 - Schlosser, Rainer
T1 - Magic mirror in my hand, which is the best in the land?
BT - an experimental evaluation of index selection algorithms
JF - Proceedings of the VLDB Endowment
N2 - Indexes are essential for the efficient processing of database workloads. Proposed solutions for the relevant and challenging index selection problem range from metadata-based simple heuristics, over sophisticated multi-step algorithms, to approaches that yield optimal results. The main challenges are (i) to accurately determine the effect of an index on the workload cost while considering the interaction of indexes and (ii) a large number of possible combinations resulting from workloads containing many queries and massive schemata with possibly thousands of attributes.
In this work, we describe and analyze eight index selection algorithms that are based on different concepts and compare them along different dimensions, such as solution quality, runtime, multi-column support, solution granularity, and complexity. In particular, we analyze the solutions of the algorithms for the challenging analytical Join Order, TPC-H, and TPC-DS benchmarks. Afterward, we assess strengths and weaknesses, infer insights for index selection in general and each approach individually, before we give recommendations on when to use which approach.
Y1 - 2020
U6 - https://doi.org/10.14778/3407790.3407832
SN - 2150-8097
VL - 13
IS - 11
SP - 2382
EP - 2395
PB - Association for Computing Machinery
CY - New York
ER -
TY - JOUR
A1 - Kaya, Adem
A1 - Freitag, Melina A.
T1 - Conditioning analysis for discrete Helmholtz problems
JF - Computers and mathematics with applications : an international journal
N2 - In this paper, we examine conditioning of the discretization of the Helmholtz problem. Although the discrete Helmholtz problem has been studied from different perspectives, to the best of our knowledge, there is no conditioning analysis for it. We aim to fill this gap in the literature. We propose a novel method in 1D to observe the near-zero eigenvalues of a symmetric indefinite matrix. Standard classification of ill-conditioning based on the matrix condition number is not true for the discrete Helmholtz problem. We relate the ill-conditioning of the discretization of the Helmholtz problem with the condition number of the matrix. We carry out analytical conditioning analysis in 1D and extend our observations to 2D with numerical observations. We examine several discretizations. We find different regions in which the condition number of the problem shows different characteristics. We also explain the general behavior of the solutions in these regions.
KW - Helmholtz problem
KW - Condition number
KW - Ill-conditioning
KW - Indefinite matrices
Y1 - 2022
U6 - https://doi.org/10.1016/j.camwa.2022.05.016
SN - 0898-1221
SN - 1873-7668
VL - 118
SP - 171
EP - 182
PB - Elsevier Science
CY - Amsterdam
ER -
TY - JOUR
A1 - Mattis, Toni
A1 - Beckmann, Tom
A1 - Rein, Patrick
A1 - Hirschfeld, Robert
T1 - First-class concepts
BT - Reified architectural knowledge beyond dominant decompositions
JF - Journal of object technology : JOT / ETH Zürich, Department of Computer Science
N2 - Ideally, programs are partitioned into independently maintainable and understandable modules. As a system grows, its architecture gradually loses the capability to accommodate new concepts in a modular way. While refactoring is expensive and not always possible, and the programming language might lack dedicated primary language constructs to express certain cross-cutting concerns, programmers are still able to explain and delineate convoluted concepts through secondary means: code comments, use of whitespace and arrangement of code, documentation, or communicating tacit knowledge.
Secondary constructs are easy to change and provide high flexibility in communicating cross-cutting concerns and other concepts among programmers. However, such secondary constructs usually have no reified representation that can be explored and manipulated as first-class entities through the programming environment.
In this exploratory work, we discuss novel ways to express a wide range of concepts, including cross-cutting concerns, patterns, and lifecycle artifacts independently of the dominant decomposition imposed by an existing architecture. We propose the representation of concepts as first-class objects inside the programming environment that retain the capability to change as easily as code comments. We explore new tools that allow programmers to view, navigate, and change programs based on conceptual perspectives. In a small case study, we demonstrate how such views can be created and how the programming experience changes from draining programmers' attention by stretching it across multiple modules toward focusing it on cohesively presented concepts. Our designs are geared toward facilitating multiple secondary perspectives on a system to co-exist in symbiosis with the original architecture, hence making it easier to explore, understand, and explain complex contexts and narratives that are hard or impossible to express using primary modularity constructs.
KW - software engineering
KW - modularity
KW - exploratory programming
KW - program comprehension
KW - remodularization
KW - architecture recovery
Y1 - 2022
U6 - https://doi.org/10.5381/jot.2022.21.2.a6
SN - 1660-1769
VL - 21
IS - 2
SP - 1
EP - 15
PB - ETH Zürich, Department of Computer Science
CY - Zürich
ER -
TY - JOUR
A1 - Koumarelas, Ioannis
A1 - Jiang, Lan
A1 - Naumann, Felix
T1 - Data preparation for duplicate detection
JF - Journal of data and information quality : (JDIQ)
N2 - Data errors represent a major issue in most application workflows. Before any important task can take place, a certain data quality has to be guaranteed by eliminating a number of different errors that may appear in data. Typically, most of these errors are fixed with data preparation methods, such as whitespace removal. However, the particular error of duplicate records, where multiple records refer to the same entity, is usually eliminated independently with specialized techniques. Our work is the first to bring these two areas together by applying data preparation operations under a systematic approach prior to performing duplicate detection.
Our process workflow can be summarized as follows: It begins with the user providing as input a sample of the gold standard, the actual dataset, and optionally some constraints to domain-specific data preparations, such as address normalization. The preparation selection operates in two consecutive phases. First, to vastly reduce the search space of ineffective data preparations, decisions are made based on the improvement or worsening of pair similarities. Second, using the remaining data preparations an iterative leave-one-out classification process removes preparations one by one and determines the redundant preparations based on the achieved area under the precision-recall curve (AUC-PR). Using this workflow, we manage to improve the results of duplicate detection by up to 19% in AUC-PR.
KW - data preparation
KW - data wrangling
KW - record linkage
KW - duplicate detection
KW - similarity measures
Y1 - 2020
U6 - https://doi.org/10.1145/3377878
SN - 1936-1955
SN - 1936-1963
VL - 12
IS - 3
PB - Association for Computing Machinery
CY - New York
ER -
TY - JOUR
A1 - Kossmann, Jan
A1 - Schlosser, Rainer
T1 - Self-driving database systems
BT - a conceptual approach
JF - Distributed and parallel databases
N2 - Challenges for self-driving database systems, which tune their physical design and configuration autonomously, are manifold: Such systems have to anticipate future workloads, find robust configurations efficiently, and incorporate knowledge gained by previous actions into later decisions. We present a component-based framework for self-driving database systems that enables database integration and development of self-managing functionality with low overhead by relying on separation of concerns. By keeping the components of the framework reusable and exchangeable, experiments are simplified, which promotes further research in that area. Moreover, to optimize multiple mutually dependent features, e.g., index selection and compression configurations, we propose a linear programming (LP) based algorithm to derive an efficient tuning order automatically. Afterwards, we demonstrate the applicability and scalability of our approach with reproducible examples.
KW - database systems
KW - self-driving
KW - recursive tuning
KW - workload prediction
KW - robustness
Y1 - 2020
U6 - https://doi.org/10.1007/s10619-020-07288-w
SN - 0926-8782
SN - 1573-7578
VL - 38
IS - 4
SP - 795
EP - 817
PB - Springer
CY - Dordrecht
ER -
TY - THES
A1 - Hosp, Sven
T1 - Modifizierte Cross-Parity Codes zur schnellen Mehrbit-Fehlerkorrektur
Y1 - 2015
ER -
TY - JOUR
A1 - Schneider, Johannes
A1 - Wenig, Phillip
A1 - Papenbrock, Thorsten
T1 - Distributed detection of sequential anomalies in univariate time series
JF - The VLDB journal : the international journal on very large data bases
N2 - The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies as fast as possible on potentially very large time series. In other words, the detection needs to be effective, efficient and scalable w.r.t. the input size. Series2Graph is an effective solution based on graph embeddings that is robust against re-occurring anomalies, can discover sequential anomalies of arbitrary length, and works without training data. Yet, Series2Graph is not scalable due to its single-threaded approach; it cannot, in particular, process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, DADS for short, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, intermediate state and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than Series2Graph, scales almost linearly with the number of processors in the cluster and can process much larger input sequences due to its scale-out property.
KW - Distributed programming
KW - Sequential anomaly
KW - Actor model
KW - Data mining
KW - Time series
Y1 - 2021
U6 - https://doi.org/10.1007/s00778-021-00657-6
SN - 1066-8888
SN - 0949-877X
VL - 30
IS - 4
SP - 579
EP - 602
PB - Springer
CY - Berlin
ER -
TY - JOUR
A1 - Kleemann, Steven
T1 - Cyber warfare and the "humanization" of international humanitarian law
JF - International journal of cyber warfare and terrorism
N2 - Cyber warfare is a timely and relevant issue and one of the most controversial in international humanitarian law (IHL). The aim of IHL is to set rules and limits in terms of means and methods of warfare. In this context, a key question arises: Does digital warfare have rules or limits, and if so, how are these applicable? Traditional principles, developed over a long period, are facing a new dimension of challenges due to the rise of cyber warfare. This paper argues that to overcome this new issue, it is critical that new humanity-oriented approaches are developed with regard to cyber warfare. The challenge is to establish a legal regime for cyber-attacks, successfully addressing human rights norms and standards. While clarifying this from a legal perspective, the authors can redesign the sensitive equilibrium between humanity and military necessity, weighing the humanitarian aims of IHL and the protection of civilians (in combination with international human rights law and other relevant legal regimes) in a different manner than before.
KW - cyber-attack
KW - cyberwar
KW - IHL
KW - IHRL
KW - international human rights
KW - international humanitarian law
KW - law and technology
KW - new technologies
Y1 - 2021
SN - 978-1-7998-6177-5
U6 - https://doi.org/10.4018/IJCWT.2021040101
SN - 1947-3435
SN - 1947-3443
VL - 11
IS - 2
SP - 1
EP - 11
PB - IGI Global
CY - Hershey
ER -
TY - THES
A1 - Köhlmann, Wiebke
T1 - Zugänglichkeit virtueller Klassenzimmer für Blinde
N2 - E-learning applications offer opportunities for the legally mandated inclusion of learners with disabilities. However, equal participation of blind learners in sessions held in virtual classrooms is hardly possible because of the synchronous, multimedia character and the high volume of information of these solutions.
This thesis investigates the accessibility of virtual classrooms for blind users in order to enable participation in synchronous, collaborative learning scenarios that is as equal as possible. As part of a product analysis, virtual classrooms are examined for their accessibility and existing barriers, and guidelines for the accessible design of virtual classrooms are defined. Subsequently, an alternative interaction concept for presenting and operating virtual classrooms on a two-dimensional tactile Braille display is developed, so that blind learners can participate in synchronous courses as equally as possible. After a first evaluation with blind participants, the interaction concept is implemented as a prototype for an open-source classroom. The concluding evaluation of the prototype shows the improvement in the accessibility of virtual classrooms for blind learners using a tactile planar display and confirms the effectiveness of the concepts developed in this thesis.
Y1 - 2016
SN - 978-3-8325-4273-3
PB - Logos
CY - Berlin
ER -
TY - JOUR
A1 - Neher, Dieter
A1 - Kniepert, Juliane
A1 - Elimelech, Arik
A1 - Koster, L. Jan Anton
T1 - A New Figure of Merit for Organic Solar Cells with Transport-limited Photocurrents
JF - Scientific reports
N2 - Compared to their inorganic counterparts, organic semiconductors suffer from relatively low charge carrier mobilities. Therefore, expressions derived for inorganic solar cells to correlate characteristic performance parameters to material properties are prone to fail when applied to organic devices. This is especially true for the classical Shockley-equation commonly used to describe current-voltage (JV)-curves, as it assumes a high electrical conductivity of the charge transporting material. Here, an analytical expression for the JV-curves of organic solar cells is derived based on a previously published analytical model. This expression, bearing a similar functional dependence as the Shockley-equation, delivers a new figure of merit α to express the balance between free charge recombination and extraction in low mobility photoactive materials. This figure of merit is shown to determine critical device parameters such as the apparent series resistance and the fill factor.
KW - Electronic and spintronic devices
KW - Semiconductors
Y1 - 2016
U6 - https://doi.org/10.1038/srep24861
SN - 2045-2322
VL - 6
PB - Nature Publishing Group
CY - London
ER -
TY - BOOK
ED - Lamprecht, Anna-Lena
ED - Margaria, Tiziana
T1 - Process design for natural scientists
BT - an agile model-driven approach
T3 - Communications in computer and information science ; 500
N2 - This book presents an agile and model-driven approach to manage scientific workflows. The approach is based on the Extreme Model Driven Design (XMDD) paradigm and aims at simplifying and automating the complex data analysis processes carried out by scientists in their day-to-day work. Besides documenting the impact the workflow modeling might have on the work of natural scientists, this book serves three major purposes: 1. It acts as a primer for practitioners who are interested in learning how to think in terms of services and workflows when facing domain-specific scientific processes. 2. It provides interesting material for readers already familiar with this kind of tool, because it systematically introduces both the technologies used in each case study and the basic concepts behind them. 3. As the addressed thematic field becomes increasingly relevant for lectures in both computer science and experimental sciences, it also provides helpful material for teachers who plan similar courses.
Y1 - 2014
SN - 978-3-662-45005-5
PB - Springer
CY - Wiesbaden
ER -
TY - THES
A1 - Wust, Johannes
T1 - Mixed workload management for in-memory databases
BT - executing mixed workloads of enterprise applications with TAMEX
Y1 - 2015
ER -
TY - JOUR
A1 - Teske, Daniel
T1 - Geocoder accuracy ranking
JF - Process design for natural scientists: an agile model-driven approach
N2 - Finding an address on a map is sometimes tricky: the chosen map application may be unfamiliar with the enclosed region. There are several geocoders on the market, which have different databases and algorithms to compute the query. Consequently, the geocoding results differ in their quality. Fortunately, the geocoders provide a rich set of metadata. The workflow described in this paper compares this metadata with the aim of finding out which geocoder offers the best-fitting coordinate for a given address.
Y1 - 2014
SN - 978-3-662-45005-5
SN - 1865-0929
IS - 500
SP - 161
EP - 174
PB - Springer
CY - Berlin
ER -
TY - JOUR
A1 - Sens, Henriette
T1 - Web-Based map generalization tools put to the test: a jABC workflow
JF - Process design for natural scientists: an agile model-driven approach
N2 - Geometric generalization is a fundamental concept in the digital mapping process. An increasing amount of spatial data is provided on the web as well as a range of tools to process it. This jABC workflow is used for the automatic testing of web-based generalization services like mapshaper.org by executing its functionality, overlaying both datasets before and after the transformation and displaying them visually in a .tif file. Mostly Web Services and command line tools are used to build an environment where ESRI shapefiles can be uploaded, processed through a chosen generalization service and finally visualized in Irfanview.
Y1 - 2014
SN - 978-3-662-45005-5
SN - 1865-0929
IS - 500
SP - 175
EP - 185
PB - Springer
CY - Berlin
ER -
TY - JOUR
A1 - Noack, Franziska
T1 - CREADED: Colored-Relief application for digital elevation data
JF - Process design for natural scientists: an agile model-driven approach
N2 - In the geoinformatics field, remote sensing data is often used for analyzing the characteristics of the current investigation area. This includes DEMs, which are simple raster grids containing grey scales representing the respective elevation values. The project CREADED that is presented in this paper aims at making these monochrome raster images more significant and more intuitively interpretable. For this purpose, an executable interactive model for creating a colored and relief-shaded Digital Elevation Model (DEM) has been designed using the jABC framework. The process is based on standard jABC-SIBs and SIBs that provide specific GIS functions, which are available as Web services, command line tools and scripts.
Y1 - 2014
SN - 978-3-662-45005-5
SN - 1865-0929
IS - 500
SP - 186
EP - 199
PB - Springer
CY - Berlin
ER -