Workload-Driven Horizontal Partitioning and Pruning for Large HTAP Systems

Boissier, Martin; Kurzynski, Daniel

doi:10.1109/ICDEW.2018.00026

Modern server systems with large NUMA architectures necessitate (i) data being distributed over the available computing nodes and (ii) NUMA-aware query processing to enable effective parallel processing in database systems. As these architectures incur significant latency and throughput penalties for accessing non-local data, queries should be executed as close as possible to the data. To further increase both performance and efficiency, data that is not relevant for the query result should be skipped as early as possible. One way to achieve this goal is horizontal partitioning to improve static partition pruning. As part of our ongoing work on workload-driven partitioning, we have implemented a recent approach called aggressive data skipping and extended it to handle both analytical and transactional access patterns. In this paper, we evaluate this approach with the workload and data of a production enterprise system of a Global 2000 company. The results show that over 80% of all tuples can be skipped on average while the resulting partitioning schemata are surprisingly stable over time.

Authors:	Martin Boissier, Daniel Kurzynski
DOI:	https://doi.org/10.1109/ICDEW.2018.00026
ISBN:	978-1-5386-6306-6
Title of parent work (English):	2018 IEEE 34th International Conference on Data Engineering Workshops (ICDEW)
Publisher:	IEEE
Place of publication:	New York
Publication type:	Other
Language:	English
Date of first publication:	5 July 2018
Year of publication:	2018
Release date:	21 March 2022
Number of pages:	6
First page:	116
Last page:	121
Organizational units:	Digital Engineering Fakultät / Hasso-Plattner-Institut für Digital Engineering GmbH
DDC classification:	0 Computer science, information and general works / 00 Computer science, knowledge, systems / 000 Computer science, information science, general works
Peer review:	Refereed
