Abstract
Recent botnets typically use Domain Generation Algorithms (DGAs) to communicate with their command-and-control servers, which makes them harder to detect than older botnets that rely on static IP addresses. Recent studies have therefore explored a variety of methods for detecting algorithmically generated domains, including Deep Learning. This paper presents a Deep Learning approach based on autoencoders, used as a semi-supervised method that requires only legitimate domains for training. Semi-supervised methods have an advantage over supervised methods in that they require no labelled DGA data. The proposed autoencoder is a Neural Network (NN) that processes the frequency of 2-grams in domain names. The method was compared with supervised machine learning methods and cross-validated on a second, unseen dataset to evaluate how well the results generalize. The results show an F-score of 73% on DGA detection, outperforming an NN based on letter frequencies and a Random Forest approach based on n-grams, which scored 71% and 65% respectively.
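The idea in the abstract can be illustrated with a minimal sketch: turn a domain name into a vector of 2-gram frequencies, pass it through an autoencoder trained only on legitimate domains, and flag the domain when the reconstruction error is high. Everything specific here is an assumption made for illustration and is not taken from the paper: the alphabet, the use of the leftmost label only, the mean squared error as the error measure, the `threshold` parameter, and the `reconstruct` callable that stands in for a trained autoencoder.

```python
from collections import Counter
from itertools import product
import string

# Assumed alphabet (not from the paper): lowercase letters, digits, hyphen.
ALPHABET = string.ascii_lowercase + string.digits + "-"
BIGRAMS = ["".join(p) for p in product(ALPHABET, repeat=2)]
INDEX = {bg: i for i, bg in enumerate(BIGRAMS)}


def bigram_frequencies(domain: str) -> list[float]:
    """Return the relative frequency of each 2-gram in the domain's first label."""
    label = domain.lower().split(".")[0]  # assumption: leftmost label only
    grams = [label[i:i + 2] for i in range(len(label) - 1)]
    counts = Counter(g for g in grams if g in INDEX)
    total = sum(counts.values())
    vec = [0.0] * len(BIGRAMS)
    if total:
        for gram, count in counts.items():
            vec[INDEX[gram]] = count / total
    return vec


def reconstruction_error(x, x_hat) -> float:
    """Mean squared error between the input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def is_dga(x, reconstruct, threshold: float) -> bool:
    """Flag a domain when the (hypothetical) autoencoder reconstructs it poorly."""
    return reconstruction_error(x, reconstruct(x)) > threshold
```

In practice `reconstruct` would be the forward pass of an autoencoder fitted on legitimate domains, and `threshold` would be chosen from the reconstruction errors observed on held-out legitimate data.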
Original language | English |
---|---|
Title of host publication | AI-CyberSec 2021 - Workshop on Artificial Intelligence and Cyber Security |
Publisher | CEUR Workshop Proceedings |
Pages | 1-9 |
Number of pages | 9 |
Publication status | Published online - 14 Dec 2021 |
Event | AI-Cybersec Workshop 2021: Workshop on Artificial Intelligence and Cyber Security (virtual), 14 Dec 2021, https://sites.google.com/view/ai-cybersec-2021/programme?authuser=0 |
Publication series
Name | CEUR Workshop Proceedings |
---|---|
ISSN (Electronic) | 1613-0073 |
Workshop
Workshop | AI-Cybersec Workshop 2021 |
---|---|
Abbreviated title | AI-CyberSec |
Period | 14/12/21 → 14/12/21 |