Method of unknown protocol classification based on autoencoder

GU Chunxiang; WU Weisen; SHI Ya’nan; LI Guangsong

doi:10.11959/j.issn.1000-436x.2020123


Research article | Updated: 2024-06-05

- Method of unknown protocol classification based on autoencoder
- Journal on Communications, 2020, Vol. 41, No. 6, pp. 88-97
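- Author affiliations:
  
  1. School of Cyberspace Security, Information Engineering University, Zhengzhou 450001, Henan, China
  2. Henan Key Laboratory of Network Cryptography Technology, Zhengzhou 450001, Henan, China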
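- Author biographies:
  
  GU Chunxiang (born 1976), male, from Huoshan, Anhui. Ph.D., professor and doctoral supervisor at Information Engineering University, and director of the Henan Key Laboratory of Network Cryptography Technology. Main research interests: cryptography and network security.
  WU Weisen (born 1996), male, from Tiantai, Zhejiang. Master's student at Information Engineering University. Main research interests: network security and machine learning.
  SHI Ya’nan (born 1982), female, from Anyang, Henan. Lecturer at Information Engineering University. Main research interest: security protocol analysis.
  LI Guangsong (born 1977), male, from Dezhou, Shandong. Ph.D., associate professor at Information Engineering University. Main research interests: network protocol analysis, blockchain, and wireless network security.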
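- Funding:
  
  National Natural Science Foundation of China (61772548); Foundation for Innovative Research Groups of the National Natural Science Foundation of China (61521003); Open Fund of the Key Laboratory of Information Assurance Technology (KJ-17-001)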
- DOI: 10.11959/j.issn.1000-436x.2020123
  Chinese Library Classification (CLC) number: TP181
- Published online: 2020-06
  
  Print publication date: 2020-06-25
Chunxiang GU, Weisen WU, Ya’nan SHI, et al. Method of unknown protocol classification based on autoencoder[J]. Journal on Communications, 2020, 41(6): 88-97. DOI: 10.11959/j.issn.1000-436x.2020123.

Abstract

The large number of unknown protocols on the Internet makes network management and the maintenance of network security very difficult. To address this problem, a method for classifying and identifying unknown protocols is proposed. The method combines an autoencoder with an improved K-means clustering algorithm to classify and identify unknown protocols in network traffic. The autoencoder reduces the dimensionality of the network traffic and extracts features, and the clustering algorithm then classifies the reduced data without supervision, which yields unsupervised identification and classification of network traffic. Experimental results show that the proposed method classifies better than the traditional K-means, DBSCAN, and GMM algorithms and is more efficient.
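The two-stage pipeline described in the abstract (autoencoder for dimensionality reduction, then clustering of the reduced data) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the traffic is replaced by synthetic vectors, the autoencoder has a single tanh hidden layer, and the clustering step is plain K-means with a farthest-first initialization chosen only for determinism. The paper's "improved K-means" is not specified on this page and is not reproduced here. All function names and parameters (`make_toy_flows`, `train_autoencoder`, `kmeans`, `cluster_purity`, `code_dim`) are hypothetical.

```python
import numpy as np

def make_toy_flows(n_per_class=60, dim=30, n_classes=3, seed=0):
    """Synthetic stand-in for normalized packet bytes: one block pattern per 'protocol'."""
    rng = np.random.default_rng(seed)
    block = dim // n_classes
    X, y = [], []
    for c in range(n_classes):
        proto = np.zeros(dim)
        proto[c * block:(c + 1) * block] = 1.0
        X.append(proto + 0.05 * rng.standard_normal((n_per_class, dim)))
        y.append(np.full(n_per_class, c))
    return np.vstack(X), np.concatenate(y)

def train_autoencoder(X, code_dim=4, epochs=500, lr=0.2, seed=0):
    """Tanh encoder, linear decoder, full-batch gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1, b1 = 0.1 * rng.standard_normal((d, code_dim)), np.zeros(code_dim)
    W2, b2 = 0.1 * rng.standard_normal((code_dim, d)), np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # low-dimensional code
        err = (H @ W2 + b2) - X             # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size        # gradient of the mean squared error
        gW2, gb2 = H.T @ g_out, g_out.sum(axis=0)
        gZ = (g_out @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
        gW1, gb1 = X.T @ gZ, gZ.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return (lambda A: np.tanh(A @ W1 + b1)), losses

def kmeans(Z, k, iters=50):
    """Plain K-means; farthest-first init is an assumption, not the paper's improvement."""
    centers = [Z[0]]
    for _ in range(1, k):
        d2 = ((Z[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1).min(axis=1)
        centers.append(Z[int(np.argmax(d2))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        new_centers = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def cluster_purity(labels, y):
    """Fraction of samples that belong to the majority true class of their cluster."""
    return sum(np.bincount(y[labels == j]).max() for j in np.unique(labels)) / len(y)

X, y = make_toy_flows()
encode, losses = train_autoencoder(X)
codes = encode(X)                           # reduced representation
labels, _ = kmeans(codes, k=3)              # unsupervised grouping of the codes
print("reconstruction loss:", losses[0], "->", losses[-1])
print("cluster purity on toy data:", cluster_purity(labels, y))
```

The true labels `y` are used only to score the clusters afterwards; neither training nor clustering sees them, which mirrors the unsupervised setting the abstract describes. On real traffic the number of clusters is not known in advance and would have to be chosen by some criterion that this sketch does not cover.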


