TY - JOUR A1 - Peng, Junjie A1 - Liu, Danxu A1 - Wang, Yingtao A1 - Zeng, Ying A1 - Cheng, Feng A1 - Zhang, Wenqiang T1 - Weight-based strategy for an I/O-intensive application at a cloud data center JF - Concurrency and computation : practice & experience N2 - Applications with different characteristics in the cloud may have different resources preferences. However, traditional resource allocation and scheduling strategies rarely take into account the characteristics of applications. Considering that an I/O-intensive application is a typical type of application and that frequent I/O accesses, especially small files randomly accessing the disk, may lead to an inefficient use of resources and reduce the quality of service (QoS) of applications, a weight allocation strategy is proposed based on the available resources that a physical server can provide as well as the characteristics of the applications. Using the weight obtained, a resource allocation and scheduling strategy is presented based on the specific application characteristics in the data center. Extensive experiments show that the strategy is correct and can guarantee a high concurrency of I/O per second (IOPS) in a cloud data center with high QoS. Additionally, the strategy can efficiently improve the utilization of the disk and resources of the data center without affecting the service quality of applications. KW - IOPS KW - process scheduling KW - random I KW - O KW - small files KW - weight Y1 - 2018 U6 - https://doi.org/10.1002/cpe.4648 SN - 1532-0626 SN - 1532-0634 VL - 30 IS - 19 PB - Wiley CY - Hoboken ER - TY - JOUR A1 - Schaub, Torsten H. A1 - Woltran, Stefan T1 - Answer set programming unleashed! JF - Künstliche Intelligenz N2 - Answer Set Programming faces an increasing popularity for problem solving in various domains. While its modeling language allows us to express many complex problems in an easy way, its solving technology enables their effective resolution. In what follows, we detail some of the key factors of its success. Answer Set Programming [ASP; Brewka et al. Commun ACM 54(12):92–103, (2011)] is seeing a rapid proliferation in academia and industry due to its easy and flexible way to model and solve knowledge-intense combinatorial (optimization) problems. To this end, ASP offers a high-level modeling language paired with high-performance solving technology. As a result, ASP systems provide out-off-the-box, general-purpose search engines that allow for enumerating (optimal) solutions. They are represented as answer sets, each being a set of atoms representing a solution. The declarative approach of ASP allows a user to concentrate on a problem’s specification rather than the computational means to solve it. This makes ASP a prime candidate for rapid prototyping and an attractive tool for teaching key AI techniques since complex problems can be expressed in a succinct and elaboration tolerant way. This is eased by the tuning of ASP’s modeling language to knowledge representation and reasoning (KRR). The resulting impact is nicely reflected by a growing range of successful applications of ASP [Erdem et al. AI Mag 37(3):53–68, 2016; Falkner et al. Industrial applications of answer set programming. K++nstliche Intelligenz (2018)] Y1 - 2018 U6 - https://doi.org/10.1007/s13218-018-0550-z SN - 0933-1875 SN - 1610-1987 VL - 32 IS - 2-3 SP - 105 EP - 108 PB - Springer CY - Heidelberg ER - TY - GEN A1 - Schaub, Torsten H. A1 - Woltran, Stefan T1 - Special issue on answer set programming T2 - Künstliche Intelligenz Y1 - 2018 U6 - https://doi.org/10.1007/s13218-018-0554-8 SN - 0933-1875 SN - 1610-1987 VL - 32 IS - 2-3 SP - 101 EP - 103 PB - Springer CY - Heidelberg ER - TY - JOUR A1 - Schäfer, Robin A1 - Stede, Manfred T1 - Argument mining on twitter BT - a survey JF - Information technology : it ; Methoden und innovative Anwendungen der Informatik und Informationstechnik ; Organ der Fachbereiche 3 und 4 der GI e.V. und des Fachbereichs 6 der ITG N2 - In the last decade, the field of argument mining has grown notably. However, only relatively few studies have investigated argumentation in social media and specifically on Twitter. Here, we provide the, to our knowledge, first critical in-depth survey of the state of the art in tweet-based argument mining. We discuss approaches to modelling the structure of arguments in the context of tweet corpus annotation, and we review current progress in the task of detecting argument components and their relations in tweets. We also survey the intersection of argument mining and stance detection, before we conclude with an outlook. KW - Argument Mining KW - Twitter KW - Stance Detection Y1 - 2021 U6 - https://doi.org/10.1515/itit-2020-0053 SN - 1611-2776 SN - 2196-7032 VL - 63 IS - 1 SP - 45 EP - 58 PB - De Gruyter CY - Berlin ER - TY - JOUR A1 - Ayzel, Georgy A1 - Heistermann, Maik T1 - The effect of calibration data length on the performance of a conceptual hydrological model versus LSTM and GRU BT - a case study for six basins from the CAMELS dataset JF - Computers & geosciences : an international journal devoted to the publication of papers on all aspects of geocomputation and to the distribution of computer programs and test data sets ; an official journal of the International Association for Mathematical Geology N2 - We systematically explore the effect of calibration data length on the performance of a conceptual hydrological model, GR4H, in comparison to two Artificial Neural Network (ANN) architectures: Long Short-Term Memory Networks (LSTM) and Gated Recurrent Units (GRU), which have just recently been introduced to the field of hydrology. We implemented a case study for six river basins across the contiguous United States, with 25 years of meteorological and discharge data. Nine years were reserved for independent validation; two years were used as a warm-up period, one year for each of the calibration and validation periods, respectively; from the remaining 14 years, we sampled increasing amounts of data for model calibration, and found pronounced differences in model performance. While GR4H required less data to converge, LSTM and GRU caught up at a remarkable rate, considering their number of parameters. Also, LSTM and GRU exhibited the higher calibration instability in comparison to GR4H. These findings confirm the potential of modern deep-learning architectures in rainfall runoff modelling, but also highlight the noticeable differences between them in regard to the effect of calibration data length. KW - Artificial neural networks KW - Calibration KW - Deep learning KW - Rainfall-runoff KW - modelling Y1 - 2021 U6 - https://doi.org/10.1016/j.cageo.2021.104708 SN - 0098-3004 SN - 1873-7803 VL - 149 PB - Elsevier CY - Amsterdam ER - TY - JOUR A1 - Kossmann, Jan A1 - Halfpap, Stefan A1 - Jankrift, Marcel A1 - Schlosser, Rainer T1 - Magic mirror in my hand, which is the best in the land? BT - an experimental evaluation of index selection algorithms JF - Proceedings of the VLDB Endowment N2 - Indexes are essential for the efficient processing of database workloads. Proposed solutions for the relevant and challenging index selection problem range from metadata-based simple heuristics, over sophisticated multi-step algorithms, to approaches that yield optimal results. The main challenges are (i) to accurately determine the effect of an index on the workload cost while considering the interaction of indexes and (ii) a large number of possible combinations resulting from workloads containing many queries and massive schemata with possibly thousands of attributes.
In this work, we describe and analyze eight index selection algorithms that are based on different concepts and compare them along different dimensions, such as solution quality, runtime, multi-column support, solution granularity, and complexity. In particular, we analyze the solutions of the algorithms for the challenging analytical Join Order, TPC-H, and TPC-DS benchmarks. Afterward, we assess strengths and weaknesses, infer insights for index selection in general and each approach individually, before we give recommendations on when to use which approach. Y1 - 2020 U6 - https://doi.org/10.14778/3407790.3407832 SN - 2150-8097 VL - 13 IS - 11 SP - 2382 EP - 2395 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Kaya, Adem A1 - Freitag, Melina A. T1 - Conditioning analysis for discrete Helmholtz problems JF - Computers and mathematics with applications : an international journal N2 - In this paper, we examine conditioning of the discretization of the Helmholtz problem. Although the discrete Helmholtz problem has been studied from different perspectives, to the best of our knowledge, there is no conditioning analysis for it. We aim to fill this gap in the literature. We propose a novel method in 1D to observe the near-zero eigenvalues of a symmetric indefinite matrix. Standard classification of ill-conditioning based on the matrix condition number is not true for the discrete Helmholtz problem. We relate the ill-conditioning of the discretization of the Helmholtz problem with the condition number of the matrix. We carry out analytical conditioning analysis in 1D and extend our observations to 2D with numerical observations. We examine several discretizations. We find different regions in which the condition number of the problem shows different characteristics. We also explain the general behavior of the solutions in these regions. KW - Helmholtz problem KW - Condition number KW - Ill-conditioning KW - Indefinite KW - matrices Y1 - 2022 U6 - https://doi.org/10.1016/j.camwa.2022.05.016 SN - 0898-1221 SN - 1873-7668 VL - 118 SP - 171 EP - 182 PB - Elsevier Science CY - Amsterdam ER - TY - JOUR A1 - Mattis, Toni A1 - Beckmann, Tom A1 - Rein, Patrick A1 - Hirschfeld, Robert T1 - First-class concepts BT - Reified architectural knowledge beyond dominant decompositions JF - Journal of object technology : JOT / ETH Zürich, Department of Computer Science N2 - Ideally, programs are partitioned into independently maintainable and understandable modules. As a system grows, its architecture gradually loses the capability to accommodate new concepts in a modular way. While refactoring is expensive and not always possible, and the programming language might lack dedicated primary language constructs to express certain cross-cutting concerns, programmers are still able to explain and delineate convoluted concepts through secondary means: code comments, use of whitespace and arrangement of code, documentation, or communicating tacit knowledge.
Secondary constructs are easy to change and provide high flexibility in communicating cross-cutting concerns and other concepts among programmers. However, such secondary constructs usually have no reified representation that can be explored and manipulated as first-class entities through the programming environment.
In this exploratory work, we discuss novel ways to express a wide range of concepts, including cross-cutting concerns, patterns, and lifecycle artifacts independently of the dominant decomposition imposed by an existing architecture. We propose the representation of concepts as first-class objects inside the programming environment that retain the capability to change as easily as code comments. We explore new tools that allow programmers to view, navigate, and change programs based on conceptual perspectives. In a small case study, we demonstrate how such views can be created and how the programming experience changes from draining programmers' attention by stretching it across multiple modules toward focusing it on cohesively presented concepts. Our designs are geared toward facilitating multiple secondary perspectives on a system to co-exist in symbiosis with the original architecture, hence making it easier to explore, understand, and explain complex contexts and narratives that are hard or impossible to express using primary modularity constructs. KW - software engineering KW - modularity KW - exploratory programming KW - program KW - comprehension KW - remodularization KW - architecture recovery Y1 - 2022 U6 - https://doi.org/10.5381/jot.2022.21.2.a6 SN - 1660-1769 VL - 21 IS - 2 SP - 1 EP - 15 PB - ETH Zürich, Department of Computer Science CY - Zürich ER - TY - JOUR A1 - Koumarelas, Ioannis A1 - Jiang, Lan A1 - Naumann, Felix T1 - Data preparation for duplicate detection JF - Journal of data and information quality : (JDIQ) N2 - Data errors represent a major issue in most application workflows. Before any important task can take place, a certain data quality has to be guaranteed by eliminating a number of different errors that may appear in data. Typically, most of these errors are fixed with data preparation methods, such as whitespace removal. However, the particular error of duplicate records, where multiple records refer to the same entity, is usually eliminated independently with specialized techniques. Our work is the first to bring these two areas together by applying data preparation operations under a systematic approach prior to performing duplicate detection.
Our process workflow can be summarized as follows: It begins with the user providing as input a sample of the gold standard, the actual dataset, and optionally some constraints to domain-specific data preparations, such as address normalization. The preparation selection operates in two consecutive phases. First, to vastly reduce the search space of ineffective data preparations, decisions are made based on the improvement or worsening of pair similarities. Second, using the remaining data preparations an iterative leave-one-out classification process removes preparations one by one and determines the redundant preparations based on the achieved area under the precision-recall curve (AUC-PR). Using this workflow, we manage to improve the results of duplicate detection up to 19% in AUC-PR. KW - data preparation KW - data wrangling KW - record linkage KW - duplicate detection KW - similarity measures Y1 - 2020 U6 - https://doi.org/10.1145/3377878 SN - 1936-1955 SN - 1936-1963 VL - 12 IS - 3 PB - Association for Computing Machinery CY - New York ER - TY - JOUR A1 - Kossmann, Jan A1 - Schlosser, Rainer T1 - Self-driving database systems BT - a conceptual approach JF - Distributed and parallel databases N2 - Challenges for self-driving database systems, which tune their physical design and configuration autonomously, are manifold: Such systems have to anticipate future workloads, find robust configurations efficiently, and incorporate knowledge gained by previous actions into later decisions. We present a component-based framework for self-driving database systems that enables database integration and development of self-managing functionality with low overhead by relying on separation of concerns. By keeping the components of the framework reusable and exchangeable, experiments are simplified, which promotes further research in that area. Moreover, to optimize multiple mutually dependent features, e.g., index selection and compression configurations, we propose a linear programming (LP) based algorithm to derive an efficient tuning order automatically. Afterwards, we demonstrate the applicability and scalability of our approach with reproducible examples. KW - database systems KW - self-driving KW - recursive tuning KW - workload prediction KW - robustness Y1 - 2020 U6 - https://doi.org/10.1007/s10619-020-07288-w SN - 0926-8782 SN - 1573-7578 VL - 38 IS - 4 SP - 795 EP - 817 PB - Springer CY - Dordrecht ER - TY - THES A1 - Hosp, Sven T1 - Modifizierte Cross-Party Codes zur schnellen Mehrbit-Fehlerkorrektur Y1 - 2015 ER - TY - JOUR A1 - Schneider, Johannes A1 - Wenig, Phillip A1 - Papenbrock, Thorsten T1 - Distributed detection of sequential anomalies in univariate time series JF - The VLDB journal : the international journal on very large data bases N2 - The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies possibly fast on potentially very large time series. In other words, the detection needs to be effective, efficient and scalable w.r.t. the input size. Series2Graph is an effective solution based on graph embeddings that are robust against re-occurring anomalies and can discover sequential anomalies of arbitrary length and works without training data. Yet, Series2Graph is no t scalable due to its single-threaded approach; it cannot, in particular, process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, short DADS, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, intermediate state and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than S2G, scales almost linearly with the number of processors in the cluster and can process much larger input sequences due to its scale-out property. KW - Distributed programming KW - Sequential anomaly KW - Actor model KW - Data mining KW - Time series Y1 - 2021 U6 - https://doi.org/10.1007/s00778-021-00657-6 SN - 1066-8888 SN - 0949-877X VL - 30 IS - 4 SP - 579 EP - 602 PB - Springer CY - Berlin ER - TY - JOUR A1 - Kleemann, Steven T1 - Cyber warfare and the "humanization" of international humanitarian law JF - International journal of cyber warfare and terrorism N2 - Cyber warfare is a timely and relevant issue and one of the most controversial in international humanitarian law (IHL). The aim of IHL is to set rules and limits in terms of means and methods of warfare. In this context, a key question arises: Has digital warfare rules or limits, and if so, how are these applicable? Traditional principles, developed over a long period, are facing a new dimension of challenges due to the rise of cyber warfare. This paper argues that to overcome this new issue, it is critical that new humanity-oriented approaches is developed with regard to cyber warfare. The challenge is to establish a legal regime for cyber-attacks, successfully addressing human rights norms and standards. While clarifying this from a legal perspective, the authors can redesign the sensitive equilibrium between humanity and military necessity, weighing the humanitarian aims of IHL and the protection of civilians-in combination with international human rights law and other relevant legal regimes-in a different manner than before. KW - cyber-attack KW - cyberwar KW - IHL KW - IHRL KW - international human rights KW - international humanitarian law KW - law and technology KW - new technologies Y1 - 2021 SN - 978-1-7998-6177-5 U6 - https://doi.org/10.4018/IJCWT.2021040101 SN - 1947-3435 SN - 1947-3443 VL - 11 IS - 2 SP - 1 EP - 11 PB - IGI Global CY - Hershey ER - TY - THES A1 - Köhlmann, Wiebke T1 - Zugänglichkeit virtueller Klassenzimmer für Blinde N2 - E-Learning-Anwendungen bieten Chancen für die gesetzlich vorgeschriebene Inklusion von Lernenden mit Beeinträchtigungen. Die gleichberechtigte Teilhabe von blinden Lernenden an Veranstaltungen in virtuellen Klassenzimmern ist jedoch durch den synchronen, multimedialen Charakter und den hohen Informationsumfang dieser Lösungen kaum möglich. Die vorliegende Arbeit untersucht die Zugänglichkeit virtueller Klassenzimmer für blinde Nutzende, um eine möglichst gleichberechtigte Teilhabe an synchronen, kollaborativen Lernszenarien zu ermöglichen. Im Rahmen einer Produktanalyse werden dazu virtuelle Klassenzimmer auf ihre Zugänglichkeit und bestehende Barrieren untersucht und Richtlinien für die zugängliche Gestaltung von virtuellen Klassenzimmern definiert. Anschließend wird ein alternatives Benutzungskonzept zur Darstellung und Bedienung virtueller Klassenzimmer auf einem zweidimensionalen taktilen Braille-Display entwickelt, um eine möglichst gleichberechtigte Teilhabe blinder Lernender an synchronen Lehrveranstaltungen zu ermöglichen. Nach einer ersten Evaluation mit blinden Probanden erfolgt die prototypische Umsetzung des Benutzungskonzepts für ein Open-Source-Klassenzimmer. Die abschließende Evaluation der prototypischen Umsetzung zeigt die Verbesserung der Zugänglichkeit von virtuellen Klassenzimmern für blinde Lernende unter Verwendung eines taktilen Flächendisplays und bestätigt die Wirksamkeit der im Rahmen dieser Arbeit entwickelten Konzepte. Y1 - 2016 SN - 978-3-8325-4273-3 PB - Logos CY - Berlin ER - TY - JOUR A1 - Neher, Dieter A1 - Kniepert, Juliane A1 - Elimelech, Arik A1 - Koster, L. Jan Anton T1 - A New Figure of Merit for Organic Solar Cells with Transport-limited Photocurrents JF - Scientific reports N2 - Compared to their inorganic counterparts, organic semiconductors suffer from relatively low charge carrier mobilities. Therefore, expressions derived for inorganic solar cells to correlate characteristic performance parameters to material properties are prone to fail when applied to organic devices. This is especially true for the classical Shockley-equation commonly used to describe current-voltage (JV)-curves, as it assumes a high electrical conductivity of the charge transporting material. Here, an analytical expression for the JV-curves of organic solar cells is derived based on a previously published analytical model. This expression, bearing a similar functional dependence as the Shockley-equation, delivers a new figure of merit α to express the balance between free charge recombination and extraction in low mobility photoactive materials. This figure of merit is shown to determine critical device parameters such as the apparent series resistance and the fill factor. KW - Electronic and spintronic devices KW - Semiconductors Y1 - 2016 U6 - https://doi.org/10.1038/srep24861 SN - 2045-2322 VL - 6 PB - Nature Publishing Group CY - London ER - TY - BOOK ED - Lambrecht, Anna-Lena ED - Margaria, Tizian T1 - Process design for natural scientists BT - an agile model-driven approach T3 - Communications in computer and information science ; 500 N2 - This book presents an agile and model-driven approach to manage scientific workflows. The approach is based on the Extreme Model Driven Design (XMDD) paradigm and aims at simplifying and automating the complex data analysis processes carried out by scientists in their day-to-day work. Besides documenting the impact the workflow modeling might have on the work of natural scientists, this book serves three major purposes: 1. It acts as a primer for practitioners who are interested to learn how to think in terms of services and workflows when facing domain-specific scientific processes. 2. It provides interesting material for readers already familiar with this kind of tools, because it introduces systematically both the technologies used in each case study and the basic concepts behind them. 3. As the addressed thematic field becomes increasingly relevant for lectures in both computer science and experimental sciences, it also provides helpful material for teachers that plan similar courses. Y1 - 2014 SN - 978-3-662-45005-5 PB - Springer CY - Wiesbaden ER - TY - THES A1 - Wust, Johannes T1 - Mixed workload managment for in-memory databases BT - executing mixed workloads of enterprise applications with TAMEX Y1 - 2015 ER - TY - JOUR A1 - Teske, Daniel T1 - Geocoder accuracy ranking JF - Process design for natural scientists: an agile model-driven approach N2 - Finding an address on a map is sometimes tricky: the chosen map application may be unfamiliar with the enclosed region. There are several geocoders on the market, they have different databases and algorithms to compute the query. Consequently, the geocoding results differ in their quality. Fortunately the geocoders provide a rich set of metadata. The workflow described in this paper compares this metadata with the aim to find out which geocoder is offering the best-fitting coordinate for a given address. Y1 - 2014 SN - 978-3-662-45005-5 SN - 1865-0929 IS - 500 SP - 161 EP - 174 PB - Springer CY - Berlin ER - TY - JOUR A1 - Sens, Henriette T1 - Web-Based map generalization tools put to the test: a jABC workflow JF - Process Design for Natural Scientists: an agile model-driven approach N2 - Geometric generalization is a fundamental concept in the digital mapping process. An increasing amount of spatial data is provided on the web as well as a range of tools to process it. This jABC workflow is used for the automatic testing of web-based generalization services like mapshaper.org by executing its functionality, overlaying both datasets before and after the transformation and displaying them visually in a .tif file. Mostly Web Services and command line tools are used to build an environment where ESRI shapefiles can be uploaded, processed through a chosen generalization service and finally visualized in Irfanview. Y1 - 2014 SN - 978-3-662-45005-5 SN - 1865-0929 IS - 500 SP - 175 EP - 185 PB - Springer CY - Berlin ER - TY - JOUR A1 - Noack, Franziska T1 - CREADED: Colored-Relief application for digital elevation data JF - Process design for natural scientists: an agile model-driven approach N2 - In the geoinformatics field, remote sensing data is often used for analyzing the characteristics of the current investigation area. This includes DEMs, which are simple raster grids containing grey scales representing the respective elevation values. The project CREADED that is presented in this paper aims at making these monochrome raster images more significant and more intuitively interpretable. For this purpose, an executable interactive model for creating a colored and relief-shaded Digital Elevation Model (DEM) has been designed using the jABC framework. The process is based on standard jABC-SIBs and SIBs that provide specific GIS functions, which are available as Web services, command line tools and scripts. Y1 - 2014 SN - 978-3-662-45005-5 SN - 1865-0929 IS - 500 SP - 186 EP - 199 PB - Springer CY - Berlin ER -