CPDGA: A Proactive DGA Domain Detection Algorithm Based on Conformal Propagation

LIU Shuangshuang; WANG Zhi; DONG Yimeng; LI Wanpeng

doi:10.11959/j.issn.1000-436x.2025106

Research article | Updated: 2025-07-04

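- CPDGA: foresee future DGA using proactive conformal propagation
- Journal on Communications, 2025, Vol. 46, No. 6, pp. 18-31
- Author affiliations:
  
  1. College of Cryptology and Cyber Science, Nankai University, Tianjin 300350
  2. School of Computer Science, University of Liverpool, Liverpool L69 3BX
- Author biographies:
  
  LIU Shuangshuang (1999- ), female, born in Heze, Shandong. PhD candidate at Nankai University. Research interests: malicious domain names, malicious code detection, backdoor attacks, and cryptography.
  WANG Zhi (1981- ), male, born in Changzhi, Shanxi. PhD, associate professor at Nankai University. Research interests: computer virus analysis and prevention, and reverse analysis of binary code.
  DONG Yimeng (2002- ), female, born in Tianjin. Master's student at Nankai University. Research interests: applications of machine learning and deep learning in network security.
  LI Wanpeng (1988- ), male, born in Dazhou, Sichuan. PhD, assistant professor at the University of Liverpool. Research interests: information security, identity management systems, vulnerability discovery, and malicious code detection.
- Funding:
  
  CCF-NSFOCUS "Kunpeng" Research Fund (CCF-NSFOCUS 2024016)
- CLC number: TP309.5
- Received: 2025-04-12,
  
  Revised: 2025-05-29,
  
  Published in print: 2025-06-25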
LIU Shuangshuang, WANG Zhi, DONG Yimeng, et al. CPDGA: foresee future DGA using proactive conformal propagation[J]. Journal on Communications, 2025, 46(06): 18-31. DOI: 10.11959/j.issn.1000-436x.2025106.

Abstract

Attackers use domain generation algorithms (DGAs) to dynamically register domain names that support malware activity. The continuous evolution of these malicious domains causes concept drift, so existing detection techniques that rely on continual learning models lose effectiveness over time. To address this threat, a proactive DGA domain detection algorithm based on conformal propagation (CPDGA) is proposed, combining conformal prediction with conformal clustering. Experiments on datasets of malicious and benign domain names from 2019 to 2023 show that CPDGA effectively mitigates the impact of concept drift on machine-learning detection models and improves detection accuracy by 20.4%. In addition, CPDGA achieves 96.42% accuracy on domain names generated by 13 recent adversarial models, demonstrating strong robustness and adaptability.
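
The conformal prediction idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' CPDGA implementation: the nonconformity measure (character entropy of a domain label), the calibration labels, and the helper names `char_entropy`, `conformal_p_value`, and `p_benign` are all assumptions chosen only for illustration. The sketch computes a conformal p-value, that is, the fraction of calibration examples that look at least as unusual as the new domain.

```python
import math
from collections import Counter


def char_entropy(label: str) -> float:
    """Shannon entropy (in bits) of the characters in a domain label.

    Used here as a hypothetical nonconformity score: random-looking
    labels tend to score higher than dictionary-like ones.
    """
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conformal_p_value(calibration_scores: list[float], new_score: float) -> float:
    """Share of calibration scores at least as nonconforming as new_score.

    The +1 terms count the new example itself, as in standard
    conformal prediction.
    """
    at_least = sum(1 for s in calibration_scores if s >= new_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


# Hypothetical benign calibration labels (illustrative data, not from the paper).
benign_labels = ["google", "wikipedia", "amazon", "github", "weather", "netflix"]
calibration = [char_entropy(label) for label in benign_labels]


def p_benign(label: str) -> float:
    """Conformal p-value of a label against the benign calibration set."""
    return conformal_p_value(calibration, char_entropy(label))


if __name__ == "__main__":
    for candidate in ["google", "xkq7vz9wplm3"]:
        print(candidate, round(p_benign(candidate), 3))
```

A low p-value means the domain does not conform to the benign calibration set, which a detector could treat as a sign of a DGA domain or of drift. A practical system would presumably use a learned model's output as the nonconformity score and a far larger calibration set.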
