Workload-Driven Horizontal Partitioning and Pruning for Large HTAP Systems

Boissier, Martin; Kurzynski, Daniel

doi:10.1109/ICDEW.2018.00026

Modern server systems with large NUMA architectures necessitate (i) data being distributed over the available computing nodes and (ii) NUMA-aware query processing to enable effective parallel processing in database systems. As these architectures incur significant latency and throughput penalties for accessing non-local data, queries should be executed as close as possible to the data. To further increase both performance and efficiency, data that is not relevant for the query result should be skipped as early as possible. One way to achieve this goal is horizontal partitioning to improve static partition pruning. As part of our ongoing work on workload-driven partitioning, we have implemented a recent approach called aggressive data skipping and extended it to handle both analytical and transactional access patterns. In this paper, we evaluate this approach with the workload and data of a production enterprise system of a Global 2000 company. The results show that over 80% of all tuples can be skipped on average, while the resulting partitioning schemata are surprisingly stable over time.
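
To illustrate the general idea of static partition pruning mentioned in the abstract, the following is a minimal Python sketch. It is not the aggressive data skipping approach evaluated in the paper; it assumes simple range partitioning on a single integer column and per-partition min/max metadata. All names (`Partition`, `range_partition`, `prune`, `skipped_fraction`) and the sample data are hypothetical.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    # Values of a hypothetical integer partitioning column.
    rows: List[int]

    @property
    def min_value(self) -> int:
        return min(self.rows)

    @property
    def max_value(self) -> int:
        return max(self.rows)


def range_partition(values: List[int], boundaries: List[int]) -> List[Partition]:
    """Horizontally partition values using sorted split points.

    A value v goes into bucket i, where i is the number of
    boundaries that are <= v. Empty buckets are dropped.
    """
    buckets: List[List[int]] = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        buckets[bisect.bisect_right(boundaries, v)].append(v)
    return [Partition(b) for b in buckets if b]


def prune(partitions: List[Partition], low: int, high: int) -> List[Partition]:
    """Keep only partitions whose [min, max] range overlaps [low, high]."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


def skipped_fraction(partitions: List[Partition], low: int, high: int) -> float:
    """Fraction of tuples that never need to be scanned for the range query."""
    total = sum(len(p.rows) for p in partitions)
    scanned = sum(len(p.rows) for p in prune(partitions, low, high))
    return 1 - scanned / total


# Assumed toy data: 300 tuples split at 100 and 200.
parts = range_partition(list(range(300)), [100, 200])
remaining = prune(parts, 120, 150)
result = [v for p in remaining for v in p.rows if 120 <= v <= 150]
```

In this toy setup, a query for values between 120 and 150 only touches the middle partition, so the other two are skipped without reading any of their tuples. A workload-driven approach would choose the partitioning from observed query predicates rather than from fixed split points as assumed here.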

Author details:	Martin Boissier, Daniel Kurzynski
DOI:	https://doi.org/10.1109/ICDEW.2018.00026
ISBN:	978-1-5386-6306-6
Title of parent work (English):	2018 IEEE 34th International Conference on Data Engineering Workshops (ICDEW)
Publisher:	IEEE
Place of publishing:	New York
Publication type:	Other
Language:	English
Date of first publication:	2018/07/05
Publication year:	2018
Release date:	2022/03/21
Number of pages:	6
First page:	116
Last page:	121
Organizational units:	Digital Engineering Fakultät / Hasso-Plattner-Institut für Digital Engineering GmbH
DDC classification:	0 Computer science, information and general works / 00 Computer science, knowledge and systems / 000 Computer science, information and general works
Peer review:	Peer-reviewed
