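School of Information System Engineering, Information Engineering University, Zhengzhou 450000, Henan, China
QU Dan (1974-), female, born in Changchun, Jilin; Ph.D.; associate professor and master's supervisor at Information Engineering University. Her main research interests are speech processing and recognition, machine learning, and natural language processing.
ZHANG Wen-lin (1982-), male, born in Qichun, Hubei; Ph.D.; lecturer at Information Engineering University. His main research interests are speech processing and recognition, machine learning, and natural language processing.
Online publication date: 2015-09
Print publication date: 2015-09-25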
Dan QU, Wen-lin ZHANG. Sparse group LASSO constraint eigenphone speaker adaptation method for speech recognition[J]. Journal on Communications, 2015, 36(9): 47-54. DOI: 10.11959/j.issn.1000-436x.2015241.
The eigenphone speaker adaptation method performs well when sufficient adaptation data is available, but it suffers from severe overfitting when the adaptation data is insufficient. To address this, an eigenphone speaker adaptation algorithm based on a sparse group LASSO (SGL) constraint is proposed. First, the principle of eigenphone speaker adaptation is presented for speech recognition systems based on the hidden Markov model-Gaussian mixture model (HMM-GMM). Then, sparse group LASSO regularization is introduced into the estimation of the eigenphone matrix; the weighting factors of the SGL norm are adjusted to control the complexity of the adaptation model, and the resulting optimization problem is solved with an accelerated proximal gradient method. Finally, the SGL-constrained algorithm is compared with several current regularization-based adaptation methods. Speaker adaptation experiments on a Mandarin Chinese continuous speech recognition task show that the SGL constraint markedly improves the performance of the eigenphone speaker adaptation method, and that it outperforms the $l_1$-norm, $l_2$-norm and elastic net regularization methods.
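The abstract names two building blocks: a sparse group LASSO penalty and an accelerated proximal gradient solver. The sketch below illustrates how they fit together on a toy problem. It is not the authors' implementation: it assumes a plain least-squares loss in place of the likelihood-based objective used for the eigenphone matrix, and the names `prox_sgl` and `fista_sgl`, the grouping, and the weights `lam1` and `lam2` are all hypothetical choices made for illustration.

```python
import math

def prox_sgl(v, groups, t, lam1, lam2):
    """Proximal operator of t * (lam1 * ||x||_1 + lam2 * sum_g ||x_g||_2)."""
    out = [0.0] * len(v)
    for idx in groups:
        # Step 1: elementwise soft-thresholding (the l1 part of the penalty).
        u = [math.copysign(max(abs(v[i]) - t * lam1, 0.0), v[i]) for i in idx]
        # Step 2: shrink the whole group towards zero (the group part).
        norm = math.sqrt(sum(x * x for x in u))
        scale = max(0.0, 1.0 - t * lam2 / norm) if norm > 0.0 else 0.0
        for i, x in zip(idx, u):
            out[i] = scale * x
    return out

def fista_sgl(A, b, groups, lam1, lam2, n_iter=2000):
    """Accelerated proximal gradient (FISTA-style) for
    0.5 * ||A x - b||^2 + lam1 * ||x||_1 + lam2 * sum_g ||x_g||_2.
    The least-squares loss is an assumption made for this sketch."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm of A upper-bounds the Lipschitz constant
    # of the gradient, so 1 / L is a safe (if conservative) step size.
    L = sum(a * a for row in A for a in row)
    step = 1.0 / L
    x = [0.0] * n
    y = list(x)
    t = 1.0
    for _ in range(n_iter):
        # Gradient of the smooth part at the extrapolated point y.
        r = [sum(a * yj for a, yj in zip(row, y)) - bi for row, bi in zip(A, b)]
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the sparse group LASSO proximal step.
        x_new = prox_sgl([yj - step * g for yj, g in zip(y, grad)],
                         groups, step, lam1, lam2)
        # Momentum update that gives the method its acceleration.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [xn + (t - 1.0) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x

if __name__ == "__main__":
    A = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    b = [3.0, 0.5, 0.1, -0.1]
    groups = [[0, 1], [2, 3]]
    print(fista_sgl(A, b, groups, lam1=0.2, lam2=0.5))
```

In this toy run the second group has only small entries, so the penalty sets it to zero as a whole, while the first group is kept but shrunk. Raising `lam1` favours sparsity within groups, and raising `lam2` favours dropping entire groups, which is the sense in which the weighting factors control model complexity.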