Technical research of detection algorithmically generated malicious domain names using machine learning methods

CSKH-01.2018 - (Abstract) - In recent years, many malware families have used domain generation algorithms (DGAs) to generate large numbers of domains and so maintain their Command and Control (C&C) network infrastructure. In this paper, we present an approach for detecting malicious domain names using machine learning methods. The approach uses the Viterbi algorithm and a dictionary to construct features of domain names. It is demonstrated on a range of legitimate domains and a number of malicious, algorithmically generated domain names. The numerical results show the effectiveness of this method.

Summary - In recent years, many malware families have used domain generation algorithms to create large numbers of domain names in order to maintain their command and control (C&C) network infrastructure. In this paper, we present an approach for detecting malicious domain names using machine learning. The approach uses the Viterbi algorithm and a dictionary to extract features of domain names. It is demonstrated on a large number of legitimate domain names and a large number of malicious domain names produced by domain generation algorithms. The experimental results show the effectiveness of the method.
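
The abstract does not spell out how the Viterbi algorithm and the dictionary are combined, so the following is only a minimal sketch of one common reading: a Viterbi-style dynamic program splits a domain label into its most probable sequence of dictionary words, and simple features are derived from that split. The toy dictionary `WORD_FREQ`, its frequencies, the unknown-character penalty, and the feature names are all assumptions made for illustration; they are not taken from the paper. The sketch also assumes the input is a registered domain without a subdomain prefix.

```python
import math

# Hypothetical toy dictionary with made-up relative word frequencies.
WORD_FREQ = {
    "face": 0.02, "book": 0.03, "mail": 0.02, "you": 0.05,
    "tube": 0.01, "news": 0.02, "shop": 0.02, "online": 0.02,
}
# Assumed penalty for each character that no dictionary word covers.
UNKNOWN_CHAR_LOGP = math.log(1e-6)
MAX_WORD_LEN = max(len(w) for w in WORD_FREQ)


def viterbi_segment(label):
    """Return (best_log_prob, tokens) for the most probable split of label."""
    n = len(label)
    best = [0.0] + [-math.inf] * n   # best[i]: best log-prob of label[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last token
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            piece = label[start:end]
            if piece in WORD_FREQ:
                score = math.log(WORD_FREQ[piece])
            elif len(piece) == 1:
                score = UNKNOWN_CHAR_LOGP  # fall back to a single character
            else:
                continue
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start
    # Walk the back-pointers to recover the tokens in order.
    tokens = []
    end = n
    while end > 0:
        start = back[end]
        tokens.append(label[start:end])
        end = start
    tokens.reverse()
    return best[n], tokens


def domain_features(domain):
    """Build illustrative features from the first label of a domain name."""
    label = domain.lower().split(".")[0]
    log_prob, tokens = viterbi_segment(label)
    words = [t for t in tokens if t in WORD_FREQ]
    covered = sum(len(w) for w in words)
    return {
        "length": len(label),
        "num_words": len(words),
        "dict_coverage": covered / len(label) if label else 0.0,
        "avg_log_prob": log_prob / len(label) if label else 0.0,
    }


if __name__ == "__main__":
    for name in ("facebook.com", "xkqzjvwp.net"):
        print(name, viterbi_segment(name.split(".")[0])[1], domain_features(name))
```

Under these assumptions, a label made of dictionary words gets high coverage and a relatively high average log-probability, while a random-looking label falls back to single characters and scores low on both. Feature vectors of this kind could then be passed to any standard classifier.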

Citation: Hieu Ho Duc, Dr. Huong Ho Van, "Technical research of detection algorithmically generated malicious domain names using machine learning methods", Nghiên cứu khoa học và công nghệ trong lĩnh vực An toàn thông tin, Tạp chí An toàn thông tin, Vol. 07, No. 01, pp. 37-43, 2018.