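Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
SHI Lei (1986-), male, born in Tuquan, Inner Mongolia, is a Ph.D. candidate at Beijing University of Posts and Telecommunications. His research interests include artificial intelligence, data mining, and social network search.
DU Junping (1963-), female, born in Beijing, Ph.D., is a professor and doctoral supervisor at Beijing University of Posts and Telecommunications. Her research interests include artificial intelligence and data mining.
LIANG Meiyu (1985-), female, born in Tai'an, Shandong, is an associate professor and master's supervisor at Beijing University of Posts and Telecommunications. Her research interests include information search, data mining, intelligent information processing, and computer vision.
Online publication date: 2018-04
Print publication date: 2018-04-25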
SHI Lei, DU Junping, LIANG Meiyu. Social network bursty topic discovery based on RNN and topic model[J]. Journal on Communications, 2018, 39(4): 189-198. DOI: 10.11959/j.issn.1000-436x.2018056.
Social network data is sparse and noisy and contains a large number of meaningless topics. Traditional bursty topic discovery methods cannot handle the sparsity of short texts in social networks, and they require complicated post-processing. To address these problems, a bursty topic discovery method based on a recurrent neural network (RNN) and a topic model, RTM-SBTD, is proposed. First, a weight prior is constructed from the RNN and the inverse document frequency (IDF) to learn the relationships between words, and word pairs are constructed to alleviate the sparsity of short texts. Second, a "spike and slab" prior is introduced to decouple the sparsity and smoothness of the bursty topic distribution. Finally, the burstiness of words is used to model common topics and bursty topics separately, so that bursty topics are discovered automatically. Both qualitative and quantitative evaluations show that the proposed RTM-SBTD method outperforms several state-of-the-art bursty topic discovery methods on multiple evaluation metrics.
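Two of the ingredients named in the abstract, word pairs and IDF, can be illustrated with a minimal Python sketch. It assumes documents are already tokenized and uses the common definition idf(w) = log(N / df(w)); the function names `build_word_pairs` and `inverse_document_frequency` and the sample documents are hypothetical and do not come from the paper. The sketch does not reproduce RTM-SBTD itself: the RNN-based weighting, the spike-and-slab prior, and the burstiness modeling are omitted.

```python
import math
from collections import Counter
from itertools import combinations


def build_word_pairs(tokens):
    """Return all unordered pairs of distinct words in one short text.

    Simplification (an assumption of this sketch): repeated words are
    collapsed, so each pair appears at most once per text.
    """
    unique = sorted(set(tokens))
    return list(combinations(unique, 2))


def inverse_document_frequency(docs):
    """Compute idf(w) = log(N / df(w)) over a list of tokenized documents."""
    n_docs = len(docs)
    # Count each word once per document to obtain its document frequency.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


# Hypothetical toy corpus of three short texts.
docs = [
    ["earthquake", "hits", "city"],
    ["earthquake", "relief", "city"],
    ["coffee", "morning"],
]
idf = inverse_document_frequency(docs)
pairs = [pair for doc in docs for pair in build_word_pairs(doc)]
```

Pooling word pairs over the whole corpus gives a topic model corpus-level co-occurrence evidence rather than the few words of a single post, which is the usual motivation for pair-based models of short text. Words that occur in fewer documents receive a larger IDF value.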