Malicious domain name detection method based on associated information extraction

ZHANG Bin; LIAO Renjie

doi:10.11959/j.issn.1000-436x.2021181


Research article | Updated: 2024-06-05

- Malicious domain name detection method based on associated information extraction
- Journal on Communications, 2021, Vol. 42, No. 10, pp. 162-172
- Author affiliations:
  
  1. School of Cryptography Engineering, Information Engineering University, Zhengzhou, Henan 450001
  2. Henan Key Laboratory of Information Security, Zhengzhou, Henan 450001
- About the authors:
  
  ZHANG Bin (1969- ), male, born in Nanyang, Henan; Ph.D., professor and doctoral supervisor at Information Engineering University. His main research interest is information system security.
  LIAO Renjie (1996- ), male, born in Luzhou, Sichuan; master's student at Information Engineering University. His main research interest is machine-learning-based malicious domain name detection.
- Funding:
  
  Open Fund of the Key Laboratory of Information Assurance Technology (KJ-15-109); Emerging Research Direction Cultivation Fund of Information Engineering University (2016604703); Research Fund of Information Engineering University (2019f3303)
- DOI: 10.11959/j.issn.1000-436x.2021181
  CLC number: TP393
- Published online: 2021-10
  
  Print publication date: 2021-10-25

ZHANG Bin, LIAO Renjie. Malicious domain name detection method based on associated information extraction[J]. Journal on Communications, 2021, 42(10): 162-172. DOI: 10.11959/j.issn.1000-436x.2021181.

Abstract

To improve the accuracy of malicious domain name detection based on domain name association information, a detection method that combines domain name resolution information with query time is proposed. First, domain name resolution records are represented as nodes and edges in a heterogeneous information network, so that heterogeneous domain name data are modeled together and a high utilization rate of domain name information is obtained. Second, because extracting associated information by multiplying sparse adjacency matrices has high time complexity, a meta-path-based breadth-first network traversal algorithm is proposed to extract associated resolution information more efficiently. Third, to address weakly connected domain names that go undetected because they lack associated resolution information, query time is introduced to characterize the correlation between domain names, which improves the coverage of detected samples. Finally, a domain name representation learning algorithm with adaptive weights is designed to vectorize the associated resolution information and the associated query time information; the Euclidean distance between domain name feature vectors quantifies the correlation between domain names, and a supervised classifier is built on these vectors to detect malicious domain names. Theoretical analysis and experimental results show that the proposed method extracts domain name association information efficiently, and that its detection coverage rate and F1 score are 97.7% and 0.951, respectively.
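To make the idea of meta-path-based breadth-first traversal concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes resolution records arrive as hypothetical `(domain, relation, value)` triples, that node types are simply `"domain"`, `"ip"` and `"cname"`, and that a meta-path is a tuple of node types such as `("domain", "ip", "domain")`. The function names, the toy records and the reserved example names and addresses are all illustrative assumptions.

```python
from collections import defaultdict


def build_hin(records):
    """Build a typed adjacency map from (domain, relation, value) records.

    Nodes are (node_type, name) tuples. Each edge is stored in both
    directions so that a meta-path can be walked either way.
    """
    adj = defaultdict(set)
    for domain, rel, value in records:
        d = ("domain", domain)
        v = (rel, value)  # assumption: the relation name doubles as the node type
        adj[d].add(v)
        adj[v].add(d)
    return adj


def metapath_neighbors(adj, start, metapath):
    """Return the nodes reachable from `start` by following `metapath`.

    The frontier is expanded one level at a time (breadth-first), and at
    each level only neighbors whose type matches the next step of the
    meta-path are kept.
    """
    if start[0] != metapath[0]:
        return set()
    frontier = {start}
    for node_type in metapath[1:]:
        next_frontier = set()
        for node in frontier:
            for neighbor in adj.get(node, ()):
                if neighbor[0] == node_type:
                    next_frontier.add(neighbor)
        frontier = next_frontier
    frontier.discard(start)  # a domain is not counted as associated with itself
    return frontier


# Toy, made-up resolution records (reserved example names and addresses).
records = [
    ("a.example", "ip", "192.0.2.1"),
    ("b.example", "ip", "192.0.2.1"),
    ("b.example", "cname", "cdn.example"),
    ("c.example", "cname", "cdn.example"),
    ("d.example", "ip", "192.0.2.9"),
]
adj = build_hin(records)

shared_ip = metapath_neighbors(adj, ("domain", "a.example"), ("domain", "ip", "domain"))
shared_cname = metapath_neighbors(adj, ("domain", "b.example"), ("domain", "cname", "domain"))
isolated = metapath_neighbors(adj, ("domain", "d.example"), ("domain", "ip", "domain"))
print(shared_ip, shared_cname, isolated)
```

In this toy network `a.example` is associated with `b.example` through a shared IP address, `b.example` with `c.example` through a shared CNAME, and `d.example` has no meta-path neighbors at all. The last case is the kind of weakly connected domain name for which the abstract introduces query time as a second source of association. Each level of the traversal touches only the edges adjacent to the current frontier, which is the intuition behind avoiding full sparse adjacency matrix products.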


