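Information Security Research Center, Harbin Engineering University, Harbin 150001, Heilongjiang, China
Miao YU (1987-), male, born in Mudanjiang, Heilongjiang, is a Ph.D. candidate at Harbin Engineering University. His main research interests are data mining and social computing.
Wu YANG (1974-), male, born in Kuandian, Liaoning, holds a Ph.D. and is a professor and doctoral supervisor at Harbin Engineering University. His main research interests are information security, data mining, and Internet security.
Wei WANG (1974-), male, born in Harbin, Heilongjiang, holds a Ph.D. and is an associate professor at Harbin Engineering University. His main research interests are data mining and network security.
Guowei SHEN (1986-), male, born in Shaoyang, Hunan, is a Ph.D. candidate at Harbin Engineering University. His main research interests are data mining and information security.
Published online: 2016-01
Print publication date: 2016-01-25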
Miao YU, Wu YANG, Wei WANG, et al. Co-clustering of multi-entities sparse relational data in microblogging[J]. Journal on Communications, 2016, 37(1): 151-159. DOI: 10.11959/j.issn.1000-436x.2016019.
An efficient co-clustering algorithm is proposed for the sparse relational data that link multiple entity types in large-scale microblogging. To make full use of multi-relational data, a robust constraint-information embedding method is used to construct the relation matrix; this reduces the sparsity of the matrix and further improves the accuracy of the algorithm. Within a sparsity-constrained block coordinate descent framework, the relation matrix is factorized by non-negative matrix tri-factorization, which yields the cluster indicator matrices of the different entity types simultaneously. During the factorization, an efficient projection algorithm provides a fast solution and preserves the sparse structure of the clustering result. Experiments on synthetic and real data sets show that the proposed algorithm outperforms all the baselines on three indicators, and the improvement is most pronounced on extremely sparse data.
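To make the tri-factorization idea concrete, the following is a minimal sketch, not the authors' algorithm. It approximates a non-negative relation matrix R by F S G^T, where F and G act as cluster indicator matrices for the row and column entities. As a deliberate simplification it uses plain multiplicative updates in place of the sparsity-constrained block coordinate descent and projection step described in the abstract. All names (`nmtf`, `labels`, `frob_error`) and the toy user-by-topic matrix are hypothetical and chosen only for illustration.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def scale(x, num, den, eps=1e-12):
    """Element-wise multiplicative update x * num / den (keeps x non-negative)."""
    return [[xv * nv / (dv + eps) for xv, nv, dv in zip(xr, nr, dr)]
            for xr, nr, dr in zip(x, num, den)]


def frob_error(r, f, s, g):
    """Squared Frobenius norm of R - F S G^T."""
    approx = matmul(matmul(f, s), transpose(g))
    return sum((rv - av) ** 2
               for rr, ar in zip(r, approx) for rv, av in zip(rr, ar))


def nmtf(r, k_rows, k_cols, iters=200, seed=0):
    """Approximate R (n x m) by F (n x k_rows), S (k_rows x k_cols), G (m x k_cols)."""
    rng = random.Random(seed)

    def rand(rows, cols):
        # Strictly positive start so multiplicative updates can move every entry.
        return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

    n, m = len(r), len(r[0])
    f, s, g = rand(n, k_rows), rand(k_rows, k_cols), rand(m, k_cols)
    rt = transpose(r)
    for _ in range(iters):
        # Update F with S and G fixed: F *= (R G S^T) / (F S G^T G S^T)
        sgt = matmul(s, transpose(g))
        f = scale(f, matmul(r, transpose(sgt)),
                  matmul(f, matmul(sgt, transpose(sgt))))
        # Update G with F and S fixed: G *= (R^T F S) / (G S^T F^T F S)
        fs = matmul(f, s)
        g = scale(g, matmul(rt, fs), matmul(g, matmul(transpose(fs), fs)))
        # Update S with F and G fixed: S *= (F^T R G) / (F^T F S G^T G)
        ft = transpose(f)
        s = scale(s, matmul(matmul(ft, r), g),
                  matmul(matmul(matmul(ft, f), s), matmul(transpose(g), g)))
    return f, s, g


def labels(indicator):
    """Assign each entity to the cluster with the largest indicator weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in indicator]


if __name__ == "__main__":
    # Hypothetical sparse user-by-topic relation matrix with two planted blocks.
    R = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0],
         [0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 1, 1]]
    F, S, G = nmtf(R, k_rows=2, k_cols=2)
    print("user clusters: ", labels(F))
    print("topic clusters:", labels(G))
    print("reconstruction error:", frob_error(R, F, S, G))
```

Each update holds two factors fixed and rescales the third, so the loop has the same alternating, block-by-block shape as a block coordinate descent scheme. A sparsity-enforcing projection of F and G, as the abstract describes, would be applied after each block update; it is omitted here.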