publish.UP Search

RHEEMix in the data jungle (2020)

Kruse, Sebastian ; Kaoudi, Zoi ; Contreras-Rojas, Bertty ; Chawla, Sanjay ; Naumann, Felix ; Quiané-Ruiz, Jorge-Arnulfo

Data analytics are moving beyond the limits of a single platform. In this paper, we present the cost-based optimizer of Rheem, an open-source cross-platform system that copes with these new requirements. The optimizer allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to determine efficient data movement plans among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. We show that our optimizer can perform tasks more than one order of magnitude faster when using multiple platforms than when using a single platform.

The influence of reward on facial mimicry (2020)

Trilla, Irene ; Drimalla, Hanna ; Bajbouj, Malek ; Dziobek, Isabel

Recent findings suggest a role of oxytocin in the tendency to spontaneously mimic the emotional facial expressions of others. Oxytocin-related increases of facial mimicry, however, seem to be dependent on contextual factors. Given previous literature showing that people preferentially mimic emotional expressions of individuals associated with high (vs. low) rewards, we examined whether the reward value of the mimicked agent is one factor influencing the oxytocin effects on facial mimicry. To test this hypothesis, 60 male adults received 24 IU of either intranasal oxytocin or placebo in a double-blind, between-subject experiment. Next, the value of male neutral faces was manipulated using an associative learning task with monetary rewards. After the reward associations were learned, participants watched videos of the same faces displaying happy and angry expressions. Facial reactions to the emotional expressions were measured with electromyography. We found that participants judged the face identities associated with high reward values as more pleasant than those associated with low reward values. However, happy expressions by low-rewarding faces were more spontaneously mimicked than those by high-rewarding faces. Contrary to our expectations, we did not find a significant direct effect of intranasal oxytocin on facial mimicry, nor on the reward-driven modulation of mimicry. Our results support the notion that mimicry is a complex process that depends on contextual factors, but failed to provide conclusive evidence of a role of oxytocin in the modulation of facial mimicry.

Action-dependent processing of touch in the human parietal operculum and posterior insula (2020)

Limanowski, Jakub ; Lopes, Pedro ; Keck, Janis ; Baudisch, Patrick ; Friston, Karl ; Blankenburg, Felix

Somatosensory input generated by one's actions (i.e., self-initiated body movements) is generally attenuated. Conversely, externally caused somatosensory input is enhanced, for example, during active touch and the haptic exploration of objects. Here, we used functional magnetic resonance imaging (fMRI) to ask how the brain accomplishes this delicate weighting of self-generated versus externally caused somatosensory components. Finger movements were either self-generated by our participants or induced by functional electrical stimulation (FES) of the same muscles. During half of the trials, electrotactile impulses were administered when the (actively or passively) moving finger reached a predefined flexion threshold. fMRI revealed an interaction effect in the contralateral posterior insular cortex (pIC), which responded more strongly to touch during self-generated than during FES-induced movements. A network analysis via dynamic causal modeling revealed that connectivity from the secondary somatosensory cortex via the pIC to the supplementary motor area was generally attenuated during self-generated relative to FES-induced movements, yet specifically enhanced by touch received during self-generated, but not FES-induced movements. Together, these results suggest a crucial role of the parietal operculum and the posterior insula in differentiating self-generated from externally caused somatosensory information received from one's moving limb.

ganon (2020)

Piro, Vitor C. ; Dadi, Temesgen H. ; Seiler, Enrico ; Reinert, Knut ; Renard, Bernhard Y.

Motivation: The exponential growth of assembled genome sequences greatly benefits metagenomics studies. However, currently available methods struggle to manage the increasing amount of sequences and their frequent updates. Indexing the current RefSeq can take days and hundreds of GB of memory on large servers. Few methods address these issues thus far, and even though many can theoretically handle large amounts of references, time/memory requirements are prohibitive in practice. As a result, many studies that require sequence classification use often outdated and almost never truly up-to-date indices. Results: Motivated by those limitations, we created ganon, a k-mer-based read classification tool that uses Interleaved Bloom Filters in conjunction with a taxonomic clustering and a k-mer counting/filtering scheme. Ganon provides an efficient method for indexing references, keeping them updated. It requires <55 min to index the complete RefSeq of bacteria, archaea, fungi and viruses. The tool can further keep these indices up-to-date in a fraction of the time necessary to create them. Ganon makes it possible to query against very large reference sets and therefore it classifies significantly more reads and identifies more species than similar methods. When classifying a high-complexity CAMI challenge dataset against complete genomes from RefSeq, ganon shows strongly increased precision with equal or better sensitivity compared with state-of-the-art tools. With the same dataset against the complete RefSeq, ganon improved the F1-score by 65% at the genus level. It supports taxonomy- and assembly-level classification, multiple indices and hierarchical classification.

Complexity of independency and cliquy trees (2020)

Casel, Katrin ; Dreier, Jan ; Fernau, Henning ; Gobbert, Moritz ; Kuinke, Philipp ; Villaamil, Fernando Sánchez ; Schmid, Markus L. ; van Leeuwen, Erik Jan

An independency (cliquy) tree of an n-vertex graph G is a spanning tree of G in which the set of leaves induces an independent set (clique). We study the problems of minimizing or maximizing the number of leaves of such trees, and fully characterize their parameterized complexity. We show that all four variants of deciding if an independency/cliquy tree with at least/most l leaves exists parameterized by l are either Para-NP- or W[1]-hard. We prove that minimizing the number of leaves of a cliquy tree parameterized by the number of internal vertices is Para-NP-hard too. However, we show that minimizing the number of leaves of an independency tree parameterized by the number k of internal vertices has an O*(4^k)-time algorithm and a 2k-vertex kernel. Moreover, we prove that maximizing the number of leaves of an independency/cliquy tree parameterized by the number k of internal vertices both have an O*(18^k)-time algorithm and an O(k 2^k)-vertex kernel, but no polynomial kernel unless the polynomial hierarchy collapses to the third level. Finally, we present an O(3^n · f(n))-time algorithm to find a spanning tree where the leaf set has a property that can be decided in f(n) time and has minimum or maximum size.

Partial order resolution of event logs for process conformance checking (2020)

van der Aa, Han ; Leopold, Henrik ; Weidlich, Matthias

While supporting the execution of business processes, information systems record event logs. Conformance checking relies on these logs to analyze whether the recorded behavior of a process conforms to the behavior of a normative specification. A key assumption of existing conformance checking techniques, however, is that all events are associated with timestamps from which a total order of events per process instance can be inferred. Unfortunately, this assumption is often violated in practice. Due to synchronization issues, manual event recordings, or data corruption, events are only partially ordered. In this paper, we put forward the problem of partial order resolution of event logs to close this gap. It refers to the construction of a probability distribution over all possible total orders of events of an instance. To cope with the order uncertainty in real-world data, we present several estimators for this task, incorporating different notions of behavioral abstraction. Moreover, to reduce the runtime of conformance checking based on partial order resolution, we introduce an approximation method that comes with a bounded error in terms of accuracy. Our experiments with real-world and synthetic data reveal that our approach improves accuracy over the state-of-the-art considerably.

Constrained expectation maximisation algorithm for estimating ARMA models in state space representation (2020)

Galka, Andreas ; Moontaha, Sidratul ; Siniatchkin, Michael

This paper discusses the fitting of linear state space models to given multivariate time series in the presence of constraints imposed on the four main parameter matrices of these models. Constraints arise partly from the assumption that the models have a block-diagonal structure, with each block corresponding to an ARMA process, that allows the reconstruction of independent source components from linear mixtures, and partly from the need to keep models identifiable. The first stage of parameter fitting is performed by the expectation maximisation (EM) algorithm. Due to the identifiability constraint, a subset of the diagonal elements of the dynamical noise covariance matrix needs to be constrained to fixed values (usually unity). For this kind of constraint, no closed-form update rules have been available so far. We present new update rules for this situation, both for updating the dynamical noise covariance matrix directly and for updating a matrix square-root of this matrix. The practical applicability of the proposed algorithm is demonstrated by a low-dimensional simulation example. The behaviour of the EM algorithm, as observed in this example, illustrates the well-known fact that in practical applications, the EM algorithm should be combined with a different algorithm for numerical optimisation, such as a quasi-Newton algorithm.

Meta-analysis uncovers genome-wide significant variants for rapid kidney function decline (2020)

Gorski, Mathias ; Jung, Bettina ; Li, Yong ; Matias-Garcia, Pamela R. ; Wuttke, Matthias ; Coassin, Stefan ; Thio, Chris H. L. ; Kleber, Marcus E. ; Winkler, Thomas W. ; Wanner, Veronika ; Chai, Jin-Fang ; Chu, Audrey Y. ; Cocca, Massimiliano ; Feitosa, Mary F. ; Ghasemi, Sahar ; Hoppmann, Anselm ; Horn, Katrin ; Li, Man ; Nutile, Teresa ; Scholz, Markus ; Sieber, Karsten B. ; Teumer, Alexander ; Tin, Adrienne ; Wang, Judy ; Tayo, Bamidele O. ; Ahluwalia, Tarunveer S. ; Almgren, Peter ; Bakker, Stephan J. L. ; Banas, Bernhard ; Bansal, Nisha ; Biggs, Mary L. ; Boerwinkle, Eric ; Böttinger, Erwin ; Brenner, Hermann ; Carroll, Robert J. ; Chalmers, John ; Chee, Miao-Li ; Chee, Miao-Ling ; Cheng, Ching-Yu ; Coresh, Josef ; de Borst, Martin H. ; Degenhardt, Frauke ; Eckardt, Kai-Uwe ; Endlich, Karlhans ; Franke, Andre ; Freitag-Wolf, Sandra ; Gampawar, Piyush ; Gansevoort, Ron T. ; Ghanbari, Mohsen ; Gieger, Christian ; Hamet, Pavel ; Ho, Kevin ; Hofer, Edith ; Holleczek, Bernd ; Foo, Valencia Hui Xian ; Hutri-Kahonen, Nina ; Hwang, Shih-Jen ; Ikram, M. Arfan ; Josyula, Navya Shilpa ; Kahonen, Mika ; Khor, Chiea-Chuen ; Koenig, Wolfgang ; Kramer, Holly ; Kraemer, Bernhard K. ; Kuehnel, Brigitte ; Lange, Leslie A. ; Lehtimaki, Terho ; Lieb, Wolfgang ; Loos, Ruth J. F. ; Lukas, Mary Ann ; Lyytikainen, Leo-Pekka ; Meisinger, Christa ; Meitinger, Thomas ; Melander, Olle ; Milaneschi, Yuri ; Mishra, Pashupati P. ; Mononen, Nina ; Mychaleckyj, Josyf C. ; Nadkarni, Girish N. ; Nauck, Matthias ; Nikus, Kjell ; Ning, Boting ; Nolte, Ilja M. ; O'Donoghue, Michelle L. ; Orho-Melander, Marju ; Pendergrass, Sarah A. ; Penninx, Brenda W. J. H. ; Preuss, Michael H. ; Psaty, Bruce M. ; Raffield, Laura M. ; Raitakari, Olli T. ; Rettig, Rainer ; Rheinberger, Myriam ; Rice, Kenneth M. ; Rosenkranz, Alexander R. ; Rossing, Peter ; Rotter, Jerome ; Sabanayagam, Charumathi ; Schmidt, Helena ; Schmidt, Reinhold ; Schoettker, Ben ; Schulz, Christina-Alexandra ; Sedaghat, Sanaz ; Shaffer, Christian M. ; Strauch, Konstantin ; Szymczak, Silke ; Taylor, Kent D. ; Tremblay, Johanne ; Chaker, Layal ; van der Harst, Pim ; van der Most, Peter J. ; Verweij, Niek ; Voelker, Uwe ; Waldenberger, Melanie ; Wallentin, Lars ; Waterworth, Dawn M. ; White, Harvey D. ; Wilson, James G. ; Wong, Tien-Yin ; Woodward, Mark ; Yang, Qiong ; Yasuda, Masayuki ; Yerges-Armstrong, Laura M. ; Zhang, Yan ; Snieder, Harold ; Wanner, Christoph ; Boger, Carsten A. ; Kottgen, Anna ; Kronenberg, Florian ; Pattaro, Cristian ; Heid, Iris M.

Rapid decline of glomerular filtration rate estimated from creatinine (eGFRcrea) is associated with severe clinical endpoints. In contrast to cross-sectionally assessed eGFRcrea, the genetic basis for rapid eGFRcrea decline is largely unknown. To help define this, we meta-analyzed 42 genome-wide association studies from the Chronic Kidney Diseases Genetics Consortium and United Kingdom Biobank to identify genetic loci for rapid eGFRcrea decline. Two definitions of eGFRcrea decline were used: 3 mL/min/1.73 m²/year or more ("Rapid3"; encompassing 34,874 cases, 107,090 controls) and eGFRcrea decline 25% or more and eGFRcrea under 60 mL/min/1.73 m² at follow-up among those with eGFRcrea 60 mL/min/1.73 m² or more at baseline ("CKDi25"; encompassing 19,901 cases, 175,244 controls). Seven independent variants were identified across six loci for Rapid3 and/or CKDi25, consisting of five variants at four loci with genome-wide significance (near UMOD-PDILT (2), PRKAG2, WDR72, OR2S2) and two variants among 265 known eGFRcrea variants (near GATM, LARP4B). All these loci were novel for Rapid3 and/or CKDi25 and our bioinformatic follow-up prioritized variants and genes underneath these loci. The OR2S2 locus is novel for any eGFRcrea trait including interesting candidates. For the five genome-wide significant lead variants, we found supporting effects for annual change in blood urea nitrogen or cystatin-based eGFR, but not for GATM or LARP4B. Individuals at high compared to those at low genetic risk (8-14 vs. 0-5 adverse alleles) had a 1.20-fold increased risk of acute kidney injury (95% confidence interval 1.08-1.33). Thus, our identified loci for rapid kidney function decline may help prioritize therapeutic targets and identify mechanisms and individuals at risk for sustained deterioration of kidney function.

Multiplicative Up-Drift (2020)

Doerr, Benjamin ; Kötzing, Timo

Drift analysis aims at translating the expected progress of an evolutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analysis of evolutionary algorithms, but only for situations in which the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a (1+delta)-multiplicative growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a simple and insightful proof of the level-based theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the best-possible near-linear dependence on 1/delta (the previous results had an at least near-quadratic dependence), and it only requires a population size near-linear in delta (this was super-quadratic in previous results). These improvements immediately lead to stronger run time guarantees for a number of applications. We also discuss the case of large delta and show stronger results for this setting.
