TY - GEN
A1 - Loster, Michael
A1 - Naumann, Felix
A1 - Ehmueller, Jan
A1 - Feldmann, Benjamin
T1 - CurEx
BT - a system for extracting, curating, and exploring domain-specific knowledge graphs from text
T2 - Proceedings of the 27th ACM International Conference on Information and Knowledge Management
N2 - The integration of diverse structured and unstructured information sources into a unified, domain-specific knowledge base is an important task in many areas. A well-maintained knowledge base enables data analysis in complex scenarios, such as risk analysis in the financial sector or investigating large data leaks, such as the Paradise or Panama papers. Both the creation of such knowledge bases, as well as their continuous maintenance and curation involves many complex tasks and considerable manual effort. With CurEx, we present a modular system that allows structured and unstructured data sources to be integrated into a domain-specific knowledge base. In particular, we (i) enable the incremental improvement of each individual integration component; (ii) enable the selective generation of multiple knowledge graphs from the information contained in the knowledge base; and (iii) provide two distinct user interfaces tailored to the needs of data engineers and end-users respectively. The former has curation capabilities and controls the integration process, whereas the latter focuses on the exploration of the generated knowledge graph.
Y1 - 2018
SN - 978-1-4503-6014-2
U6 - https://doi.org/10.1145/3269206.3269229
SP - 1883
EP - 1886
PB - Association for Computing Machinery
CY - New York
ER -

TY - GEN
A1 - Kruse, Sebastian
A1 - Kaoudi, Zoi
A1 - Quiane-Ruiz, Jorge-Arnulfo
A1 - Chawla, Sanjay
A1 - Naumann, Felix
A1 - Contreras-Rojas, Bertty
T1 - Optimizing Cross-Platform Data Movement
T2 - 2019 IEEE 35th International Conference on Data Engineering (ICDE)
N2 - Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by RHEEM, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows RHEEM to discover multiple hidden opportunities for cross-platform data processing.
Y1 - 2019
SN - 978-1-5386-7474-1
SN - 978-1-5386-7475-8
U6 - https://doi.org/10.1109/ICDE.2019.00162
SN - 1084-4627
SN - 1063-6382
SP - 1642
EP - 1645
PB - IEEE
CY - New York
ER -