Robust and budget-constrained encoding configurations for in-memory database systems
Data encoding has been applied to database systems for decades as it mitigates bandwidth bottlenecks and reduces storage requirements. But even in the presence of these advantages, most in-memory database systems use data encoding only conservatively as the negative impact on runtime performance can be severe. Real-world systems with large parts being infrequently accessed and cost efficiency constraints in cloud environments require solutions that automatically and efficiently select encoding techniques, including heavy-weight compression. In this paper, we introduce workload-driven approaches to automatically determine memory budget-constrained encoding configurations using greedy heuristics and linear programming. We show for TPC-H, TPC-DS, and the Join Order Benchmark that optimized encoding configurations can reduce the main memory footprint significantly without a loss in runtime performance over state-of-the-art dictionary encoding. To yield robust selections, we extend the linear programming-based approach to incorporate query runtime constraints and mitigate unexpected performance regressions.…
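To make the idea of a budget-constrained greedy selection concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes that each segment has a small set of candidate encodings with an estimated size and an estimated workload runtime. The function name `greedy_select`, the input layout, and the encoding names are all hypothetical. Starting from the fastest encoding per segment, it repeatedly applies the swap that saves the most bytes per unit of added runtime until the configuration fits the budget.

```python
from typing import Dict, Tuple

# Hypothetical input layout: for every segment, each candidate encoding maps to
# (estimated size in bytes, estimated workload runtime in ms).
Options = Dict[str, Dict[str, Tuple[int, float]]]


def greedy_select(options: Options, budget: int) -> Dict[str, str]:
    """Pick one encoding per segment so that the total size fits the budget."""
    # Start from the fastest encoding for every segment.
    config = {
        seg: min(encs, key=lambda e: encs[e][1]) for seg, encs in options.items()
    }
    total = sum(options[seg][enc][0] for seg, enc in config.items())

    while total > budget:
        best = None  # (bytes saved per ms of slowdown, segment, encoding, bytes saved)
        for seg, encs in options.items():
            cur_size, cur_time = encs[config[seg]]
            for enc, (size, time) in encs.items():
                saved = cur_size - size
                if saved <= 0:
                    continue  # only swaps that shrink the segment are useful
                # Guard against division by zero when a swap costs no runtime.
                slowdown = max(time - cur_time, 1e-9)
                ratio = saved / slowdown
                if best is None or ratio > best[0]:
                    best = (ratio, seg, enc, saved)
        if best is None:
            raise ValueError("budget cannot be met with the given encodings")
        _, seg, enc, saved = best
        config[seg] = enc
        total -= saved
    return config
```

Because a greedy pass only looks at the locally best swap, it can miss the optimal configuration. That limitation is one reason an exact formulation such as linear programming, which the abstract also names, is attractive.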
Author details: | Martin Boissier |
---|---|
DOI: | https://doi.org/10.14778/3503585.3503588 |
ISSN: | 2150-8097 |
Title of parent work (English): | Proceedings of the VLDB Endowment |
Publisher: | Association for Computing Machinery (ACM) |
Place of publishing: | [New York] |
Publication type: | Article |
Language: | English |
Date of first publication: | 2021/12/01 |
Publication year: | 2021 |
Release date: | 2024/01/30 |
Tag: | General Earth and Planetary Sciences; Geography, Planning and Development; Water Science and Technology |
Volume: | 15 |
Issue: | 4 |
Number of pages: | 14 |
First page: | 780 |
Last Page: | 793 |
Organizational units: | An-Institute / Hasso-Plattner-Institut für Digital Engineering gGmbH |
DDC classification: | 0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik |
License (German): | CC-BY-NC-ND - Namensnennung, nicht kommerziell, keine Bearbeitungen 4.0 International |