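1. School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
2. Hebei Key Laboratory of Computer Virtual Technology and System Integration, Qinhuangdao, Hebei 066004, China
3. Hebei Key Laboratory of Software Engineering, Qinhuangdao, Hebei 066004, China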
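ZHONG Meiyu (1993- ), female, born in Xingtai, Hebei, is a Ph.D. candidate at Yanshan University. Her main research interest is intelligent information processing.
WU Peiliang (1981- ), male, born in Shijiazhuang, Hebei, Ph.D., is a professor and doctoral supervisor at Yanshan University. His main research interests are natural language processing, deep reinforcement learning, and robot manipulation skill learning.
DOU Yan (1968- ), female, born in Xi'an, Shaanxi, Ph.D., is a professor and master's supervisor at Yanshan University. Her main research interests are intelligent information processing, machine vision, and pattern recognition.
LIU Yi (1998- ), male, born in Shijiazhuang, Hebei, is a master's student at Yanshan University. His main research interests are intelligent information processing and machine vision.
KONG Lingfu (1957- ), male, born in Gongzhuling, Jilin, Ph.D., is a professor and doctoral supervisor at Yanshan University. His main research interests are intelligent control, intelligent information processing, and robot vision.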
Online publication date: 2022-11
Print publication date: 2022-11-25
Meiyu ZHONG, Peiliang WU, Yan DOU, et al. Chinese semantic and phonological information-based text proofreading model for speech recognition[J]. Journal on Communications, 2022, 43(11): 65-79. DOI: 10.11959/j.issn.1000-436x.2022222.
To study the influence of Pinyin on detecting and correcting text errors in speech recognition output, a text proofreading model based on Chinese semantic and phonological information was proposed. Five Pinyin coding methods were defined to construct character-phonological embedding vectors, which served as the input of a Seq2Seq model based on gated recurrent units (GRU), and an attention mechanism was applied to extract the semantic and phonological information of sentences in order to correct speech recognition text errors. To address the shortage of labeled corpora, a data augmentation method was proposed that automatically obtains annotated corpora by exchanging the initials or finals of Pinyin syllables. Experimental results on the public AISHELL-3 dataset show that the phonological information carried by Pinyin helps the model detect and correct errors in speech recognition text, and that the proposed data augmentation method improves the model's error detection performance.
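The abstract does not spell out how initials and finals are exchanged, so the following is only a minimal sketch of the general idea, not the authors' procedure. The confusion pairs in `INITIAL_SWAPS` and `FINAL_SWAPS`, the toy `lexicon` mapping Pinyin to characters, and all function names are assumptions made for illustration. A real pipeline would also need tone handling and a filter that rejects syllables that do not exist in Mandarin.

```python
import random

# Pinyin initials, two-letter ones first so "zh" is matched before "z".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# ASSUMPTION: illustrative confusion pairs, not taken from the paper.
INITIAL_SWAPS = {"z": "zh", "zh": "z", "c": "ch", "ch": "c",
                 "s": "sh", "sh": "s", "n": "l", "l": "n"}
FINAL_SWAPS = {"an": "ang", "ang": "an", "en": "eng", "eng": "en",
               "in": "ing", "ing": "in"}


def split_syllable(syllable):
    """Split a toneless Pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "an"


def swap_syllable(syllable):
    """Return variants obtained by exchanging the initial or the final.

    Variants are not checked against a table of valid syllables here.
    """
    initial, final = split_syllable(syllable)
    variants = []
    if initial in INITIAL_SWAPS:
        variants.append(INITIAL_SWAPS[initial] + final)
    if final in FINAL_SWAPS:
        variants.append(initial + FINAL_SWAPS[final])
    return variants


def augment(chars, pinyins, lexicon, rng):
    """Build one (noisy, clean) sentence pair by corrupting one character.

    `lexicon` is a hypothetical map from a Pinyin syllable to a string of
    characters pronounced that way. Returns None if nothing can be swapped.
    """
    candidates = []
    for i, syllable in enumerate(pinyins):
        for variant in swap_syllable(syllable):
            for ch in lexicon.get(variant, ""):
                candidates.append((i, ch))
    if not candidates:
        return None
    i, ch = rng.choice(candidates)
    return chars[:i] + ch + chars[i + 1:], chars


if __name__ == "__main__":
    toy_lexicon = {"san": "三"}  # toy data for the demo only
    print(augment("山上", ["shan", "shang"], toy_lexicon, random.Random(0)))
```

Each generated pair treats the corrupted sentence as the noisy input and the original sentence as the reference, which is the shape of training data a Seq2Seq proofreading model would consume.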