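1. College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
2. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China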
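Lu DENG (1989-), female, born in Songzi, Hubei, is a Ph.D. candidate at National University of Defense Technology. Her main research interests include social network analysis, data mining, and complex networks.
Yan JIA (1960-), female, born in Chengdu, Sichuan, is a professor and doctoral supervisor at National University of Defense Technology. Her main research interests include social network analysis and information security.
Binxing FANG (1960-), male, born in Wannian, Jiangxi, is an academician of the Chinese Academy of Engineering and a professor and doctoral supervisor at Beijing University of Posts and Telecommunications. His main research interests are computer architecture, computer networks, and information security.
Bin ZHOU (1971-), male, born in Nanchang, Jiangxi, is a professor and doctoral supervisor at National University of Defense Technology. His main research interests include social network analysis and information security.
Tao ZHANG (1993-), female, born in Changde, Hunan, is a master's student at National University of Defense Technology. Her main research interest is social network analysis.
Xin LIU (1993-), female, born in Changsha, Hunan, is a master's student at National University of Defense Technology. Her main research interest is social network analysis.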
Online publication date: 2018-08
Print publication date: 2018-08-25
Lu DENG, Yan JIA, Binxing FANG, et al. Performance analysis of topic detection algorithms in distributed environment[J]. Journal on Communications, 2018, 39(8): 176-184. DOI: 10.11959/j.issn.1000-436x.2018136.
Social networks have become an important part of everyday life, and more and more people use them to express their opinions and feelings. With data at this scale, quickly discovering what people are discussing has attracted growing attention from researchers, and a large number of topic detection algorithms have emerged as a result. For three classic distributed topic detection algorithms, a performance test scheme was proposed based on the characteristics of social network platforms. Following this scheme, the performance of the three algorithms was compared and analyzed on large-scale Sina Weibo data, and the advantages and disadvantages of each algorithm were identified to provide a reference for later applications.