Refine
Document Type
- Article (2)
- Monograph/Edited Volume (1)
- Conference Proceeding (1)
- Doctoral Thesis (1)
Language
- English (5)
Is part of the Bibliography
- yes (5)
Keywords
- Cloud Computing (2)
- Forschungsprojekte (2)
- Future SOC Lab (2)
- In-Memory Technologie (2)
- Multicore Architekturen (2)
- maschinelles Lernen (2)
- Angriffserkennung (1)
- Anomaly detection (1)
- Automated parsing (1)
- Big Data (1)
HPI Future SOC Lab
(2015)
Das Future SOC Lab am HPI ist eine Kooperation des Hasso-Plattner-Instituts mit verschiedenen Industriepartnern. Seine Aufgabe ist die Ermöglichung und Förderung des Austausches zwischen Forschungsgemeinschaft und Industrie.
Am Lab wird interessierten Wissenschaftlern eine Infrastruktur von neuester Hard- und Software kostenfrei für Forschungszwecke zur Verfügung gestellt. Dazu zählen teilweise noch nicht am Markt verfügbare Technologien, die im normalen Hochschulbereich in der Regel nicht zu finanzieren wären, bspw. Server mit bis zu 64 Cores und 2 TB Hauptspeicher. Diese Angebote richten sich insbesondere an Wissenschaftler in den Gebieten Informatik und Wirtschaftsinformatik. Einige der Schwerpunkte sind Cloud Computing, Parallelisierung und In-Memory Technologien.
In diesem Technischen Bericht werden die Ergebnisse der Forschungsprojekte des Jahres 2015 vorgestellt. Ausgewählte Projekte stellten ihre Ergebnisse am 15. April 2015 und 4. November 2015 im Rahmen der Future SOC Lab Tag Veranstaltungen vor.
The last years have shown an increasing sophistication of attacks against enterprises. Traditional security solutions like firewalls, anti-virus systems and generally Intrusion Detection Systems (IDSs) are no longer sufficient to protect an enterprise against these advanced attacks. One popular approach to tackle this issue is to collect and analyze events generated across the IT landscape of an enterprise. This task is achieved by the utilization of Security Information and Event Management (SIEM) systems. However, the majority of the currently existing SIEM solutions is not capable of handling the massive volume of data and the diversity of event representations. Even if these solutions can collect the data at a central place, they are neither able to extract all relevant information from the events nor correlate events across various sources. Hence, only rather simple attacks are detected, whereas complex attacks, consisting of multiple stages, remain undetected. Undoubtedly, security operators of large enterprises are faced with a typical Big Data problem.
In this thesis, we propose and implement a prototypical SIEM system named Real-Time Event Analysis and Monitoring System (REAMS) that addresses the Big Data challenges of event data with common paradigms, such as data normalization, multi-threading, in-memory storage, and distributed processing. In particular, a mostly stream-based event processing workflow is proposed that collects, normalizes, persists and analyzes events in near real-time. In this regard, we have made various contributions in the SIEM context. First, we propose a high-performance normalization algorithm that is highly parallelized across threads and distributed across nodes. Second, we are persisting into an in-memory database for fast querying and correlation in the context of attack detection. Third, we propose various analysis layers, such as anomaly- and signature-based detection, that run on top of the normalized and correlated events. As a result, we demonstrate our capabilities to detect previously known as well as unknown attack patterns. Lastly, we have investigated the integration of cyber threat intelligence (CTI) into the analytical process, for instance, for correlating monitored user accounts with previously collected public identity leaks to identify possible compromised user accounts.
In summary, we show that a SIEM system can indeed monitor a large enterprise environment with a massive load of incoming events. As a result, complex attacks spanning across the whole network can be uncovered and mitigated, which is an advancement in comparison to existing SIEM systems on the market.
The relevance of identity data leaks on the Internet is more present than ever. Almost every week we read about leakage of databases with more than a million users in the news. Smaller but not less dangerous leaks happen even multiple times a day. The public availability of such leaked data is a major threat to the victims, but also creates the opportunity to learn not only about security of service providers but also the behavior of users when choosing passwords. Our goal is to analyze this data and generate knowledge that can be used to increase security awareness and security, respectively. This paper presents a novel approach to the processing and analysis of a vast majority of bigger and smaller leaks. We evolved from a semi-manual to a fully automated process that requires a minimum of human interaction. Our contribution is the concept and a prototype implementation of a leak processing workflow that includes the extraction of digital identities from structured and unstructured leak-files, the identification of hash routines and a quality control to ensure leak authenticity. By making use of parallel and distributed programming, we are able to make leaks almost immediately available for analysis and notification after they have been published. Based on the data collected, this paper reveals how easy it is for criminals to collect lots of passwords, which are plain text or only weakly hashed. We publish those results and hope to increase not only security awareness of Internet users but also security on a technical level on the service provider side.
After almost two decades of development, modern Security Information and Event Management (SIEM) systems still face issues with normalisation of heterogeneous data sources, high number of false positive alerts and long analysis times, especially in large-scale networks with high volumes of security events. In this paper, we present our own prototype of SIEM system, which is capable of dealing with these issues. For efficient data processing, our system employs in-memory data storage (SAP HANA) and our own technologies from the previous work, such as the Object Log Format (OLF) and high-speed event normalisation. We analyse normalised data using a combination of three different approaches for security analysis: misuse detection, query-based analytics, and anomaly detection. Compared to the previous work, we have significantly improved our unsupervised anomaly detection algorithms. Most importantly, we have developed a novel hybrid outlier detection algorithm that returns ranked clusters of anomalies. It lets an operator of a SIEM system to concentrate on the several top-ranked anomalies, instead of digging through an unsorted bundle of suspicious events. We propose to use anomaly detection in a combination with signatures and queries, applied on the same data, rather than as a full replacement for misuse detection. In this case, the majority of attacks will be captured with misuse detection, whereas anomaly detection will highlight previously unknown behaviour or attacks. We also propose that only the most suspicious event clusters need to be checked by an operator, whereas other anomalies, including false positive alerts, do not need to be explicitly checked if they have a lower ranking. We have proved our concepts and algorithms on a dataset of 160 million events from a network segment of a big multinational company and suggest that our approach and methods are highly relevant for modern SIEM systems.
The “HPI Future SOC Lab” is a cooperation of the Hasso Plattner Institute (HPI) and industry partners. Its mission is to enable and promote exchange and interaction between the research community and the industry partners.
The HPI Future SOC Lab provides researchers with free of charge access to a complete infrastructure of state of the art hard and software. This infrastructure includes components, which might be too expensive for an ordinary research environment, such as servers with up to 64 cores and 2 TB main memory. The offerings address researchers particularly from but not limited to the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and In-Memory technologies.
This technical report presents results of research projects executed in 2017. Selected projects have presented their results on April 25th and November 15th 2017 at the Future SOC Lab Day events.