Refine
Has Fulltext
- no (3)
Year of publication
- 2022 (3) (remove)
Language
- English (3)
Is part of the Bibliography
- yes (3)
When I started my PhD, I wanted to do something related to systems but I wasn't sure exactly what. I didn't consider data management systems initially, because I was unaware of the richness of the systems work that data management systems were build on. I thought the field was mainly about SQL. Luckily, that view changed quickly.
Persistent memory's (PMem) byte-addressability and persistence at DRAM-like speed with SSD-like capacity have the potential to cause a major performance shift in database storage systems. With the availability of Intel Optane DC Persistent Memory, initial benchmarks evaluate the performance of real PMem hardware.
However, these results apply to only a single server and it is not yet clear how workloads compare across different PMem servers. In this paper, we propose PerMA-Bench, a con.gurable benchmark framework that allows users to evaluate the bandwidth, latency, and operations per second for customizable database-related PMem access.
Based on PerMA-Bench, we perform an extensive evaluation of PMem performance across four di.erent server configurations, containing both first- and second-generation Optane, with additional parameters such as DIMM power budget and number of DIMMs per server.
We validate our results with existing systems and show the impact of low-level design choices. We conduct a price-performance comparison that shows while there are large differences across Optane DIMMs, PMem is generally competitive with DRAM. We discuss our findings and identify eight general and implementation-specific aspects that influence PMem performance and should be considered in future work to improve PMem-aware designs.
Modern data analysis tasks often involve control flow statements, such as the iterations in PageRank and K-means. To achieve scalability, developers usually implement these tasks in distributed dataflow systems, such as Spark and Flink. Designers of such systems have to choose between providing imperative or functional control flow constructs to users. Imperative constructs are easier to use, but functional constructs are easier to compile to an efficient dataflow job. We propose Mitos, a system where control flow is both easy to use and efficient. Mitos relies on an intermediate representation based on the static single assignment form. This allows us to abstract away from specific control flow constructs and treat any imperative control flow uniformly both when building the dataflow job and when coordinating the distributed execution.