publish.UP Institut für Informatik und Computational Science

Hybrid metabolic network completion (2019)

Frioux, Clémence ; Schaub, Torsten H. ; Schellhorn, Sebastian ; Siegel, Anne ; Wanko, Philipp

Metabolic networks play a crucial role in biology since they capture all chemical reactions in an organism. While there are networks of high quality for many model organisms, networks for less studied organisms are often of poor quality and suffer from incompleteness. To this end, we introduced in previous work an answer set programming (ASP)-based approach to metabolic network completion. Although this qualitative approach allows for restoring moderately degraded networks, it fails to restore highly degraded ones. This is because it ignores quantitative constraints capturing reaction rates. To address this problem, we propose a hybrid approach to metabolic network completion that integrates our qualitative ASP approach with quantitative means for capturing reaction rates. We begin by formally reconciling existing stoichiometric and topological approaches to network completion in a unified formalism. With it, we develop a hybrid ASP encoding and rely upon the theory reasoning capacities of the ASP system dingo for solving the resulting logic program with linear constraints over reals. We empirically evaluate our approach by means of the metabolic network of Escherichia coli. Our analysis shows that our novel approach yields greatly superior results than obtainable from purely qualitative or quantitative approaches.

Remixing entity linking evaluation datasets for focused benchmarking (2019)

Waitelonis, Jörg ; Jürges, Henrik ; Sack, Harald

In recent years, named entity linking (NEL) tools were primarily developed in terms of a general approach, whereas today numerous tools are focusing on specific domains such as e.g. the mapping of persons and organizations only, or the annotation of locations or events in microposts. However, the available benchmark datasets necessary for the evaluation of NEL tools do not reflect this focalizing trend. We have analyzed the evaluation process applied in the NEL benchmarking framework GERBIL [in: Proceedings of the 24th International Conference on World Wide Web (WWW’15), International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 2015, pp. 1133–1143, Semantic Web 9(5) (2018), 605–625] and all its benchmark datasets. Based on these insights we have extended the GERBIL framework to enable a more fine grained evaluation and in depth analysis of the available benchmark datasets with respect to different emphases. This paper presents the implementation of an adaptive filter for arbitrary entities and customized benchmark creation as well as the automated determination of typical NEL benchmark dataset properties, such as the extent of content-related ambiguity and diversity. These properties are integrated on different levels, which also enables to tailor customized new datasets out of the existing ones by remixing documents based on desired emphases. Besides a new system library to enrich provided NIF [in: International Semantic Web Conference (ISWC’13), Lecture Notes in Computer Science, Vol. 8219, Springer, Berlin, Heidelberg, 2013, pp. 98–113] datasets with statistical information, best practices for dataset remixing are presented, and an in depth analysis of the performance of entity linking systems on special focus datasets is presented.

Detect me if you can (2019)

Alhosseini Almodarresi Yasin, Seyed Ali ; Bin Tareaf, Raad ; Najafi, Pejman ; Meinel, Christoph

Spam Bots have become a threat to online social networks with their malicious behavior, posting misinformation messages and influencing online platforms to fulfill their motives. As spam bots have become more advanced over time, creating algorithms to identify bots remains an open challenge. Learning low-dimensional embeddings for nodes in graph structured data has proven to be useful in various domains. In this paper, we propose a model based on graph convolutional neural networks (GCNN) for spam bot detection. Our hypothesis is that to better detect spam bots, in addition to defining a features set, the social graph must also be taken into consideration. GCNNs are able to leverage both the features of a node and aggregate the features of a node’s neighborhood. We compare our approach, with two methods that work solely on a features set and on the structure of the graph. To our knowledge, this work is the first attempt of using graph convolutional neural networks in spam bot detection.

Counting Complexity for Reasoning in Abstract Argumentation (2019)

Fichte, Johannes Klaus ; Hecher, Markus ; Meier, Arne

In this paper, we consider counting and projected model counting of extensions in abstract argumentation for various semantics. When asking for projected counts we are interested in counting the number of extensions of a given argumentation framework while multiple extensions that are identical when restricted to the projected arguments count as only one projected extension. We establish classical complexity results and parameterized complexity results when the problems are parameterized by treewidth of the undirected argumentation graph. To obtain upper bounds for counting projected extensions, we introduce novel algorithms that exploit small treewidth of the undirected argumentation graph of the input instance by dynamic programming (DP). Our algorithms run in time double or triple exponential in the treewidth depending on the considered semantics. Finally, we take the exponential time hypothesis (ETH) into account and establish lower bounds of bounded treewidth algorithms for counting extensions and projected extension.

Lower Bound Founded Logic of Here-and-There (2019)

Cabalar, Pedro ; Fandinno, Jorge ; Schaub, Torsten H. ; Schellhorn, Sebastian

A distinguishing feature of Answer Set Programming is that all atoms belonging to a stable model must be founded. That is, an atom must not only be true but provably true. This can be made precise by means of the constructive logic of Here-and-There, whose equilibrium models correspond to stable models. One way of looking at foundedness is to regard Boolean truth values as ordered by letting true be greater than false. Then, each Boolean variable takes the smallest truth value that can be proven for it. This idea was generalized by Aziz to ordered domains and applied to constraint satisfaction problems. As before, the idea is that a, say integer, variable gets only assigned to the smallest integer that can be justified. In this paper, we present a logical reconstruction of Aziz’ idea in the setting of the logic of Here-and-There. More precisely, we start by defining the logic of Here-and-There with lower bound founded variables along with its equilibrium models and elaborate upon its formal properties. Finally, we compare our approach with related ones and sketch future work.

plasp 3 (2019)

Dimopoulos, Yannis ; Gebser, Martin ; Lühne, Patrick ; Romero Davila, Javier ; Schaub, Torsten H.

We describe the new version of the Planning Domain Definition Language (PDDL)-to-Answer Set Programming (ASP) translator plasp. First, it widens the range of accepted PDDL features. Second, it contains novel planning encodings, some inspired by Satisfiability Testing (SAT) planning and others exploiting ASP features such as well-foundedness. All of them are designed for handling multivalued fluents in order to capture both PDDL as well as SAS planning formats. Third, enabled by multishot ASP solving, it offers advanced planning algorithms also borrowed from SAT planning. As a result, plasp provides us with an ASP-based framework for studying a variety of planning techniques in a uniform setting. Finally, we demonstrate in an empirical analysis that these techniques have a significant impact on the performance of ASP planning.

Reduction of spatially structured errors in Wide-Swath altimetric satellite data using data assimilation (2019)

Metref, Sammy ; Cosme, Emmanuel ; Le Sommer, Julien ; Poel, Nora ; Brankart, Jean-Michel ; Verron, Jacques ; Gomez Navarro, Laura

The Surface Water and Ocean Topography (SWOT) mission is a next generation satellite mission expected to provide a 2 km-resolution observation of the sea surface height (SSH) on a two-dimensional swath. Processing SWOT data will be challenging because of the large amount of data, the mismatch between a high spatial resolution and a low temporal resolution, and the observation errors. The present paper focuses on the reduction of the spatially structured errors of SWOT SSH data. It investigates a new error reduction method and assesses its performance in an observing system simulation experiment. The proposed error-reduction method first projects the SWOT SSH onto a subspace spanned by the SWOT spatially structured errors. This projection is removed from the SWOT SSH to obtain a detrended SSH. The detrended SSH is then processed within an ensemble data assimilation analysis to retrieve a full SSH field. In the latter step, the detrending is applied to both the SWOT data and an ensemble of model-simulated SSH fields. Numerical experiments are performed with synthetic SWOT observations and an ensemble from a North Atlantic, 1/60 degrees simulation of the ocean circulation (NATL60). The data assimilation analysis is carried out with an ensemble Kalman filter. The results are assessed with root mean square errors, power spectrum density, and spatial coherence. They show that a significant part of the large scale SWOT errors is reduced. The filter analysis also reduces the small scale errors and allows for an accurate recovery of the energy of the signal down to 25 km scales. In addition, using the SWOT nadir data to adjust the SSH detrending further reduces the errors.

Interactive objects in physical computing and their role in the learning process (2019)

Przybylla, Mareen

The target article discusses the question of how educational makerspaces can become places supportive of knowledge construction. This question is too often neglected by people who run makerspaces, as they mostly explain how to use different tools and focus on the creation of a product. In makerspaces, often pupils also engage in physical computing activities and thus in the creation of interactive artifacts containing embedded systems, such as smart shoes or wristbands, plant monitoring systems or drink mixing machines. This offers the opportunity to reflect on teaching physical computing in computer science education, where similarly often the creation of the product is so strongly focused upon that the reflection of the learning process is pushed into the background.

Gelfond-Zhang aggregates as propositional formulas (2019)

Cabalar, Pedro ; Fandinno, Jorge ; Schaub, Torsten H. ; Schellhorn, Sebastian

Answer Set Programming (ASP) has become a popular and widespread paradigm for practical Knowledge Representation thanks to its expressiveness and the available enhancements of its input language. One of such enhancements is the use of aggregates, for which different semantic proposals have been made. In this paper, we show that any ASP aggregate interpreted under Gelfond and Zhang's (GZ) semantics can be replaced (under strong equivalence) by a propositional formula. Restricted to the original GZ syntax, the resulting formula is reducible to a disjunction of conjunctions of literals but the formulation is still applicable even when the syntax is extended to allow for arbitrary formulas (including nested aggregates) in the condition. Once GZ-aggregates are represented as formulas, we establish a formal comparison (in terms of the logic of Here-and-There) to Ferraris' (F) aggregates, which are defined by a different formula translation involving nested implications. In particular, we prove that if we replace an F-aggregate by a GZ-aggregate in a rule head, we do not lose answer sets (although more can be gained). This extends the previously known result that the opposite happens in rule bodies, i.e., replacing a GZ-aggregate by an F-aggregate in the body may yield more answer sets. Finally, we characterize a class of aggregates for which GZ- and F-semantics coincide.

Joint detection of malicious domains and infected clients (2019)

Prasse, Paul ; Knaebel, Rene ; Machlica, Lukas ; Pevny, Tomas ; Scheffer, Tobias

Detection of malware-infected computers and detection of malicious web domains based on their encrypted HTTPS traffic are challenging problems, because only addresses, timestamps, and data volumes are observable. The detection problems are coupled, because infected clients tend to interact with malicious domains. Traffic data can be collected at a large scale, and antivirus tools can be used to identify infected clients in retrospect. Domains, by contrast, have to be labeled individually after forensic analysis. We explore transfer learning based on sluice networks; this allows the detection models to bootstrap each other. In a large-scale experimental study, we find that the model outperforms known reference models and detects previously unknown malware, previously unknown malware families, and previously unknown malicious domains.

Institut für Informatik und Computational Science

Refine

Has Fulltext

Author

Year of publication

Document Type

Language

Is part of the Bibliography

Keywords

Institute

15 search hits