TY - INPR A1 - Prasse, Paul A1 - Gruben, Gerrit A1 - Machlika, Lukas A1 - Pevny, Tomas A1 - Sofka, Michal A1 - Scheffer, Tobias T1 - Malware Detection by HTTPS Traffic Analysis N2 - In order to evade detection by network-traffic analysis, a growing proportion of malware uses the encrypted HTTPS protocol. We explore the problem of detecting malware on client computers based on HTTPS traffic analysis. In this setting, malware has to be detected based on the host IP address, ports, timestamp, and data volume information of TCP/IP packets that are sent and received by all the applications on the client. We develop a scalable protocol that allows us to collect network flows of known malicious and benign applications as training data and derive a malware-detection method based on a neural networks and sequence classification. We study the method's ability to detect known and new, unknown malware in a large-scale empirical study. KW - machine learning KW - computer security Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-100942 ER - TY - JOUR A1 - Prasse, Paul A1 - Knaebel, Rene A1 - Machlica, Lukas A1 - Pevny, Tomas A1 - Scheffer, Tobias T1 - Joint detection of malicious domains and infected clients JF - Machine learning N2 - Detection of malware-infected computers and detection of malicious web domains based on their encrypted HTTPS traffic are challenging problems, because only addresses, timestamps, and data volumes are observable. The detection problems are coupled, because infected clients tend to interact with malicious domains. Traffic data can be collected at a large scale, and antivirus tools can be used to identify infected clients in retrospect. Domains, by contrast, have to be labeled individually after forensic analysis. We explore transfer learning based on sluice networks; this allows the detection models to bootstrap each other. In a large-scale experimental study, we find that the model outperforms known reference models and detects previously unknown malware, previously unknown malware families, and previously unknown malicious domains. KW - Machine learning KW - Neural networks KW - Computer security KW - Traffic data KW - Https traffic Y1 - 2019 U6 - https://doi.org/10.1007/s10994-019-05789-z SN - 0885-6125 SN - 1573-0565 VL - 108 IS - 8-9 SP - 1353 EP - 1368 PB - Springer CY - Dordrecht ER -