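1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei
2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, Hebei
3. The Key Laboratory of Software Engineering of Hebei Province, Qinhuangdao 066004, Hebei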
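Zhongping ZHANG (张忠平, born 1972), male, from Songyuan, Jilin; Ph.D., professor at Yanshan University. Research interests: big data, data mining, and semi-structured data.
Weixiong LIU (刘伟雄, born 1997), male, from Guangzhou, Guangdong; master's student at Yanshan University. Research interest: data mining.
Yuting ZHANG (张玉停, born 1996), male, from Fuyang, Anhui; master's student at Yanshan University. Research interest: data mining.
Yu DENG (邓禹, born 1996), male, from Tangshan, Hebei; master's student at Yanshan University. Research interest: data mining.
Mianxin WEI (魏棉鑫, born 1997), male, from Shantou, Guangdong; master's student at Yanshan University. Research interest: data mining.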
Published online: 2021-09
Published in print: 2021-09-25
Zhongping ZHANG, Weixiong LIU, Yuting ZHANG, et al. ERDOF: outlier detection algorithm based on entropy weight distance and relative density outlier factor[J]. Journal on Communications, 2021, 42(9): 133-143. DOI: 10.11959/j.issn.1000-436x.2021152.
To address the low accuracy of existing outlier detection algorithms on complex data distributions and high-dimensional data sets, an outlier detection algorithm based on the entropy weight distance and relative density outlier factor (ERDOF) is proposed. First, entropy weight distance replaces Euclidean distance to improve detection accuracy. Then, Gaussian kernel density estimation is performed for each data object using the concept of natural neighbors. In addition, a relative distance is introduced to characterize how far a data object deviates from its neighborhood, which improves the algorithm's ability to detect outliers in low-density regions. Finally, the entropy weight distance and relative density outlier factor is proposed to characterize the degree to which a data object is an outlier. Experiments on synthetic and real data sets show that the proposed algorithm adapts effectively to various data distributions and to outlier detection in high-dimensional data.
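The first two steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the weights below follow the common textbook entropy-weight method, the distance is a weighted Euclidean distance, and a fixed `k` nearest neighbors stands in for the natural-neighbor search. The function names, `k`, and `bandwidth` are hypothetical, and the relative distance and the final outlier factor are left out.

```python
import math


def entropy_weights(data):
    """Textbook entropy-weight method (assumed here, not taken from the paper).

    Attributes whose values are spread less uniformly get a larger weight.
    """
    n, d = len(data), len(data[0])
    if n < 2:
        raise ValueError("need at least two data objects")
    diversities = []
    for j in range(d):
        column = [row[j] for row in data]
        lo, hi = min(column), max(column)
        span = hi - lo
        # Min-max scale to [0, 1]; a constant column is treated as uniform.
        scaled = [(v - lo) / span if span > 0 else 1.0 for v in column]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Clamp to guard against tiny negative values from rounding.
        diversities.append(max(0.0, 1.0 - entropy))
    norm = sum(diversities)
    if norm == 0:
        return [1.0 / d] * d  # no attribute is informative: equal weights
    return [g / norm for g in diversities]


def entropy_weighted_distance(x, y, weights):
    """Euclidean distance with each squared difference scaled by its weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def knn_gaussian_density(data, weights, k=3, bandwidth=1.0):
    """Mean unnormalized Gaussian kernel value over the k nearest neighbors.

    Assumption: a fixed k replaces the natural-neighbor search of the paper.
    """
    densities = []
    for i, x in enumerate(data):
        dists = sorted(
            entropy_weighted_distance(x, y, weights)
            for j, y in enumerate(data)
            if j != i
        )
        nearest = dists[:k]
        kernel_sum = sum(math.exp(-(dist ** 2) / (2 * bandwidth ** 2)) for dist in nearest)
        densities.append(kernel_sum / len(nearest))
    return densities


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (5.0, 5.0)]
    w = entropy_weights(points)
    for point, density in zip(points, knn_gaussian_density(points, w)):
        print(point, density)
```

In this toy data set the isolated point `(5.0, 5.0)` receives the lowest density estimate, which is the kind of signal a density-based outlier factor builds on. A real implementation would also need a bandwidth selection rule, which the abstract does not specify.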