Beacon in the Dark
- The large amount of heterogeneous data in these email corpora makes manual investigation by experts infeasible. Auditors and journalists, for example, who are looking for irregular or inappropriate content or suspicious patterns, are in desperate need of computer-aided exploration tools to support their investigations. We present our Beacon system for exploring such corpora at different levels of detail. A distributed processing pipeline combines text mining methods and social network analysis to augment the already semi-structured nature of emails. The user interface ties into the resulting cleaned and enriched dataset. For the interface design, we identify three objectives of expert users: gaining an initial overview of the data to identify leads to investigate, understanding the context of the information at hand, and having meaningful filters to iteratively focus on a subset of emails. To this end, we make use of interactive visualisations based on rearranged and aggregated extracted information to reveal salient patterns.
Author details: | Tim Repke, Ralf Krestel, Jakob Edding, Moritz Hartmann, Jonas Hering, Dennis Kipping, Hendrik Schmidt, Nico Scordialo, Alexander Zenner |
---|---|
DOI: | https://doi.org/10.1145/3269206.3269231 |
ISBN: | 978-1-4503-6014-2 |
Title of parent work (English): | Proceedings of the 27th ACM International Conference on Information and Knowledge Management |
Subtitle (English): | a system for interactive exploration of large email corpora |
Publisher: | Association for Computing Machinery |
Place of publishing: | New York |
Publication type: | Other |
Language: | English |
Date of first publication: | 2018/10/17 |
Publication year: | 2018 |
Release date: | 2022/03/04 |
Number of pages: | 4 |
First page: | 1871 |
Last Page: | 1874 |
Organizational units: | Digital Engineering Fakultät / Hasso-Plattner-Institut für Digital Engineering GmbH |
DDC classification: | 0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 000 Computer science, information science, general works |
Peer review: | Refereed |