
Window aggregation is a core operation in data stream processing. Existing aggregation techniques focus on reducing latency, eliminating redundant computations, or minimizing memory usage. However, each technique operates under different assumptions with respect to workload characteristics, such as properties of aggregation functions (e.g., invertible, associative), window types (e.g., sliding, sessions), windowing measures (e.g., time- or count-based), and stream (dis)order. In this article, we present Scotty, an efficient and general open-source operator for sliding-window aggregation in stream processing systems, such as Apache Flink, Apache Beam, Apache Samza, Apache Kafka, Apache Spark, and Apache Storm. One can easily extend Scotty with user-defined aggregation functions and window types. Scotty implements the concept of general stream slicing and derives workload characteristics from aggregation queries to improve performance without sacrificing its general applicability. We provide an in-depth view of the algorithms of the general stream slicing approach. Our experiments show that Scotty outperforms alternative solutions.

Author details:Jonas Traub, Philipp Marian Grulich, Alejandro Rodriguez Cuellar, Sebastian Bress, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
DOI:https://doi.org/10.1145/3433675
ISSN:0362-5915
ISSN:1557-4644
Title of parent work (English):ACM transactions on database systems : TODS / Association for Computing Machinery
Subtitle (English):General and efficient open-source window aggregation for stream processing systems
Publisher:Association for Computing Machinery
Place of publishing:New York
Publication type:Article
Language:English
Date of first publication:2021/03/27
Publication year:2021
Release date:2024/10/30
Tag:Apache Beam; Apache Flink; Apache Kafka; Apache Samza; Apache Spark; Apache Storm; Scotty; Streams; Window; aggregate sharing; aggregation; open-source; session window; sliding-window; stream processing; tumbling window
Volume:46
Issue:1
Article number:1
Number of pages:46
Funding institution:German Ministry for Education and Research / Federal Ministry of Education & Research (BMBF) [01IS18025A, 01IS18037A]; EU Horizon 2020 Opertus Mundi project [870228]; [SFB 1404 FONDA]
Organizational units:Digital Engineering Fakultät / Hasso-Plattner-Institut für Digital Engineering GmbH
DDC classification:0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics
Peer review:Refereed

