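1. School of Cryptography Engineering, Information Engineering University, Zhengzhou, Henan 450001, China
2. School of Software, Zhengzhou University, Zhengzhou, Henan 450000, China
3. Institute of Information Technology, Information Engineering University, Zhengzhou, Henan 450001, China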
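Han ZHANG (1985- ), female, born in Xiangcheng, Henan. Ph.D. candidate at Information Engineering University. Research interests: natural language processing and information security.
Yongjin HU (1981- ), male, born in Weifang, Shandong. Lecturer at Information Engineering University. Research interests: active defense and situational awareness.
Yuanbo GUO (1975- ), male, born in Zhouzhi, Shaanxi. Ph.D., professor and doctoral supervisor at Information Engineering University. Research interests: big data security and situational awareness.
Jicheng CHEN (1984- ), male, born in Lianshui, Jiangsu. Ph.D. candidate at Information Engineering University. Research interests: complex networks and information content security.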
Online publication date: 2020-02
Print publication date: 2020-02-25
Han ZHANG, Yongjin HU, Yuanbo GUO, et al. Research on coreference resolution technology of entity in information security[J]. Journal on Communications, 2020, 41(2): 165-175. DOI: 10.11959/j.issn.1000-436x.2020033.
A hybrid method is proposed for coreference resolution in the information security domain. Building on an earlier BiLSTM-attention-CRF model, the method introduces a domain dictionary matching mechanism and combines it with document-level attention to form a new dictionary-based attention mechanism. This addresses the model's weaker recognition of rare entities and long entities when extracting candidate mentions from text. Then, drawing on a summary of the features of domain texts, the extracted candidates are resolved either by rules or by machine learning, depending on their part of speech, to improve accuracy. Experiments on a security-domain data set demonstrate the advantage of the method in two respects: coreference resolution, and the extraction and classification of candidates.
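The abstract does not specify how dictionary matches are turned into features, so the following is only a minimal sketch of one plausible reading: a greedy longest-match lookup that tags each token as the beginning (B), inside (I), or outside (O) of a dictionary entry. The function name `dictionary_match_tags`, the sample dictionary `DOMAIN_DICT`, the B/I/O scheme, and the `max_len` limit are all assumptions made for illustration and are not taken from the paper. In a model of the kind described, such tags might be embedded and fed alongside word features into the attention layer.

```python
from typing import List, Set, Tuple


def dictionary_match_tags(
    tokens: List[str],
    dictionary: Set[Tuple[str, ...]],
    max_len: int = 5,
) -> List[str]:
    """Tag tokens with B/I/O using greedy longest match against a dictionary.

    Hypothetical helper: the paper's actual matching scheme is not given
    in the abstract.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first so long entities win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in dictionary:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


# Illustrative dictionary entries (assumed, not from the paper's data set).
DOMAIN_DICT = {
    ("sql", "injection"),
    ("cross", "site", "scripting"),
    ("trojan",),
}

tokens = "The attacker used SQL injection to plant a trojan".split()
tags = dictionary_match_tags(tokens, DOMAIN_DICT)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Trying the longest span first is a deliberate choice in this sketch: it keeps a multi-word entry such as "cross site scripting" from being split into shorter matches, which is the kind of long-entity case the abstract says the dictionary mechanism is meant to help with.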