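School of Cyber Security and Computer, Hebei University, Baoding 071002, Hebei, China
Xiaohui YANG (1975- ), male, born in Julu, Hebei, Ph.D., is a professor and master's supervisor at Hebei University. His main research interests are distributed computing, information security, and trusted computing.
Shengchang ZHANG (1993- ), male, born in Handan, Hebei, is a master's student at Hebei University. His main research interests are distributed computing and information security.
Online publication date: 2019-08
Print publication date: 2019-08-25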
Xiaohui YANG, Shengchang ZHANG. Anomaly detection model based on multi-grained cascade isolation forest algorithm[J]. Journal on Communications, 2019, 40(8): 133-142. DOI: 10.11959/j.issn.1000-436x.2019132.
Isolation forest is an anomaly detection algorithm based on an isolation mechanism. It has two weaknesses: it cannot detect local anomalies that are masked by axis-parallel clusters, and it lacks sensitivity and stability when detecting anomalies in high-dimensional data. To address these weaknesses, an isolation mechanism based on random hyperplanes and a multi-grained scanning mechanism are proposed. Each random hyperplane is generated from a linear combination of multiple dimensions, which simplifies the isolation boundary of the data model; the boundary acts as a random linear classifier that can detect more complex data patterns, so the isolation mechanism is more consistent with the distribution of the data. The multi-grained scanning mechanism performs dimensional sub-sampling with a sliding window: a forest is trained on each dimension subset, and the forests are combined by ensemble voting to build a hierarchical ensemble anomaly detection model. Experiments show that the improved isolation forest is more robust to complex anomalous data patterns, and that the hierarchical ensemble model improves the accuracy and stability of anomaly detection in high-dimensional data.
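The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `window` and `stride` parameters, the Gaussian coefficients of the hyperplane, and the uniformly drawn offset are all assumptions made for illustration, since the abstract only states that a hyperplane is a linear combination of several dimensions and that dimension subsets come from a sliding window.

```python
import random


def sliding_windows(n_dims, window, stride=1):
    """Return dimension-index subsets obtained by sliding a window over the features.

    Hypothetical helper: each subset would be used to train one forest.
    """
    if not 1 <= window <= n_dims:
        raise ValueError("window must be between 1 and n_dims")
    return [list(range(start, start + window))
            for start in range(0, n_dims - window + 1, stride)]


def random_hyperplane_split(points, dims, rng):
    """Split points with a random hyperplane w . x = b over the chosen dimensions.

    Assumption: w has Gaussian coefficients and b is drawn uniformly from the
    range of the projected values.
    """
    w = [rng.gauss(0.0, 1.0) for _ in dims]  # random linear combination of dims
    proj = [sum(wi * p[d] for wi, d in zip(w, dims)) for p in points]
    b = rng.uniform(min(proj), max(proj))    # random offset inside the data range
    left = [p for p, v in zip(points, proj) if v < b]
    right = [p for p, v in zip(points, proj) if v >= b]
    return left, right


# Usage: three windows of width 3 over 5 dimensions, then one oblique split.
subsets = sliding_windows(5, 3)  # [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
rng = random.Random(0)
data = [[rng.random() for _ in range(5)] for _ in range(20)]
left, right = random_hyperplane_split(data, subsets[0], rng)
```

Applying such a split recursively would yield an isolation tree whose cuts are oblique rather than axis-parallel, and repeating the construction per window gives one forest per dimension subset; how the forests are cascaded and how their votes are combined is specified in the paper, not here.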