Implementing record and refinement for debugging timing-dependent communication
Distributed applications are hard to debug because timing-dependent network communication is a source of non-deterministic behavior. Current approaches to debugging non-deterministic failures include post-mortem debugging as well as record and replay. However, the former impairs system performance to gather data, whereas the latter requires developers to understand the timing-dependent communication at a lower level of abstraction than they develop at. Furthermore, both approaches require intrusive core library modifications to gather data from live systems. In this paper, we present the Peek-At-Talk debugger for investigating non-deterministic failures with low overhead in a systematic, top-down method, with a particular focus on tool-building issues in the following areas: First, we show how our debugging framework Path Tools guides developers from failures to their root causes and gathers run-time data with low overhead. Second, we present Peek-At-Talk, an extension to our Path Tools framework to record non-deterministic communication and refine behavioral data that connects source code with network events. Finally, we scope changes to the core library to record network communication without impacting other network applications.
Authors: | Tim Felgentreff, Michael Perscheid, Robert Hirschfeld |
---|---|
DOI: | https://doi.org/10.1016/j.scico.2015.11.006 |
ISSN: | 0167-6423 |
ISSN: | 1872-7964 |
Title of parent work (English): | Science of computer programming |
Publisher: | Elsevier |
Place of publication: | Amsterdam |
Publication type: | Scientific article |
Language: | English |
Date of first publication: | 30.11.2016 |
Year of publication: | 2017 |
Release date: | 04.07.2022 |
Free keyword / tag: | Distributed debugging; Dynamic analysis; Record and refinement; Record and replay |
Volume: | 134 |
Number of pages: | 15 |
First page: | 4 |
Last page: | 18 |
Organizational units: | Affiliated institutes / Hasso-Plattner-Institut für Digital Engineering gGmbH |
DDC classification: | 0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 000 Computer science, information science, general works |
Peer review: | Refereed |