Multitier ensemble classifiers for malicious network traffic detection

Jie WANG; Lili YANG; Min YANG

doi:10.11959/j.issn.1000-436x.2018224

Academic Communication | Updated: 2024-06-05

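- Multitier ensemble classifiers for malicious network traffic detection
- Journal on Communications, 2018, Vol. 39, No. 10, pp. 155-165
- Affiliation:
  
  School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, China
- About the authors:
  
  Jie WANG (1980-), female, born in Taojiang, Hunan; Ph.D., associate professor at Central South University. Her main research interests include network and information security.
  Lili YANG (1992-), female, Bouyei ethnicity, born in Anshun, Guizhou; master's student at Central South University. Her main research interests include network and information security.
  Min YANG (1993-), male, born in Nanchang, Jiangxi; master's student at Central South University. His main research interests include reinforcement learning.
- Funding:
  
  National Natural Science Foundation of China (61202495)
- DOI: 10.11959/j.issn.1000-436x.2018224
  CLC number: TP302
- Published online: 2018-10
  
  Published in print: 2018-10-25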

Jie WANG, Lili YANG, Min YANG. Multitier ensemble classifiers for malicious network traffic detection[J]. Journal on Communications, 2018, 39(10): 155-165. DOI: 10.11959/j.issn.1000-436x.2018224.

Abstract

Attack detection in network big-data environments faces two problems: attack models are trained inaccurately because samples of some attack steps are missing, and existing ensemble classifiers have shortcomings when building multi-level classifiers. To address both, a malicious network traffic detection method based on a multi-level distributed ensemble classifier is proposed. The method first preprocesses the data with an unsupervised learning framework, groups the data into clusters, and applies noise processing to each cluster. It then builds a multi-level distributed ensemble classifier, MLDE, to detect malicious network traffic. The MLDE framework uses base classifiers at the bottom level and different ensemble meta-classifiers at the upper levels. The framework is simple to build, can process large data sets concurrently, and can adjust the size of the ensemble to the size of the data set. Experimental results show that the AUC reaches 0.999 when MLDE uses random forest at the base level, a bagging ensemble at the second level, and an AdaBoost ensemble at the third level.
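
The three-level arrangement named in the abstract (random forest, then bagging, then AdaBoost) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes that each level serves as the base learner of the level above it, replaces full decision trees with one-feature stumps, boosts by weighted resampling, skips the clustering and noise-processing steps, and runs on synthetic two-feature data. All names (`Stump`, `Vote`, `AdaBoost`, `make_flows`, `auc`) are hypothetical.

```python
import math
import random


def bootstrap(X, y, rng, weights=None):
    """Draw len(X) samples with replacement, optionally weighted."""
    idx = rng.choices(range(len(X)), weights=weights, k=len(X))
    return [X[i] for i in idx], [y[i] for i in idx]


class Stump:
    """One-feature threshold rule; a stand-in for a tree of the base forest."""
    def fit(self, X, y, rng):
        self.f = rng.randrange(len(X[0]))  # random feature choice (assumption)
        best = None
        for t in sorted({row[self.f] for row in X}):
            for sign in (1, -1):
                err = sum((sign if row[self.f] > t else -sign) != lab
                          for row, lab in zip(X, y))
                if best is None or err < best[0]:
                    best = (err, t, sign)
        _, self.t, self.sign = best
        return self

    def predict(self, row):
        return self.sign if row[self.f] > self.t else -self.sign


class Vote:
    """Majority vote over n members, each fit on its own bootstrap sample."""
    def __init__(self, make, n):
        self.make, self.n = make, n

    def fit(self, X, y, rng):
        self.members = [self.make().fit(*bootstrap(X, y, rng), rng)
                        for _ in range(self.n)]
        return self

    def predict(self, row):
        return 1 if sum(m.predict(row) for m in self.members) >= 0 else -1


class AdaBoost:
    """AdaBoost by resampling, so members need no sample-weight support."""
    def __init__(self, make, rounds):
        self.make, self.rounds = make, rounds

    def fit(self, X, y, rng):
        w = [1.0 / len(X)] * len(X)
        self.members = []  # (alpha, model) pairs
        for _ in range(self.rounds):
            model = self.make().fit(*bootstrap(X, y, rng, weights=w), rng)
            preds = [model.predict(row) for row in X]
            err = sum(wi for wi, p, lab in zip(w, preds, y) if p != lab)
            err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
            alpha = 0.5 * math.log((1 - err) / err)
            self.members.append((alpha, model))
            w = [wi * math.exp(-alpha * p * lab)
                 for wi, p, lab in zip(w, preds, y)]
            total = sum(w)
            w = [wi / total for wi in w]
        return self

    def score(self, row):
        return sum(a * m.predict(row) for a, m in self.members)

    def predict(self, row):
        return 1 if self.score(row) >= 0 else -1


def auc(scores, labels):
    """Chance that a malicious flow outscores a benign one; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == -1]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_flows(n_per_class, rng):
    """Synthetic flows: benign (-1) near (0, 0), malicious (+1) near (4, 4)."""
    X, y = [], []
    for label, centre in ((-1, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


def forest():   # level 1: small forest of stumps
    return Vote(Stump, 5)


def bagging():  # level 2: bagging over forests
    return Vote(forest, 3)


rng = random.Random(0)
X_train, y_train = make_flows(60, rng)
X_test, y_test = make_flows(60, rng)

mlde = AdaBoost(bagging, 3).fit(X_train, y_train, rng)  # level 3
accuracy = sum(mlde.predict(r) == lab
               for r, lab in zip(X_test, y_test)) / len(y_test)
test_auc = auc([mlde.score(r) for r in X_test], y_test)
print(f"accuracy={accuracy:.3f}  AUC={test_auc:.3f}")
```

The member counts (5, 3, 3) are arbitrary. In this sketch they are the knobs that would grow or shrink the ensemble with the size of the data set, and the members of each `Vote` are independent of one another, so they could be trained concurrently.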
