ZHU Hong, JIN Xiaofeng*. Research of improved cross-lingual speaker verification method[J]. Journal of Yanbian University, 2017, 43(02): 184-188.
Research of improved cross-lingual speaker verification method (改进的跨语种说话人确认方法的研究)
- Keywords:
- speaker verification; cross-lingual; voiced extraction; fusion feature
- CLC number:
- TP391.41
- Document code:
- A
- Abstract:
- This paper presents a cross-lingual speaker verification method that combines an improved speech fusion feature with a Gaussian mixture model (GMM). First, the Teager energy operator is used to extract the voiced segments of speech, discarding the silent and unvoiced segments, which carry no information about the speaker's vocal tract characteristics. Second, pitch period parameters are extracted and fused with 16-dimensional MFCC parameters to form the speech fusion feature. Finally, the proposed method is compared with the method of reference [9] in both monolingual and cross-lingual speaker verification experiments. The results show that the proposed method outperforms that of reference [9] in both recognition accuracy and average decision time, indicating that it is effective and applicable to cross-lingual speaker verification.
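The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give their framing or decision rule, so the function names, `frame_len`, `hop`, and `rel_threshold` below are assumptions chosen for illustration. The sketch uses the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], marks frames with high mean Teager energy as candidate voiced frames, and fuses features by appending one pitch-period value per frame to a 16-dimensional MFCC vector.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two boundary samples have no neighbour on one side and are set to 0.
    """
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi


def voiced_frame_mask(x, frame_len=256, hop=128, rel_threshold=0.1):
    """Flag frames whose mean Teager energy exceeds a fraction of the maximum.

    frame_len, hop and rel_threshold are illustrative assumptions,
    not values taken from the paper.
    """
    psi = teager_energy(x)
    if len(psi) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(psi) - frame_len) // hop
    energy = np.array(
        [psi[i * hop : i * hop + frame_len].mean() for i in range(n_frames)]
    )
    return energy > rel_threshold * energy.max()


def fuse_features(mfcc, pitch_period):
    """Append one pitch-period value per frame to the MFCC matrix.

    mfcc has shape (n_frames, 16); the result has shape (n_frames, 17).
    Plain concatenation is an assumption about how the fusion is done.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    pitch = np.asarray(pitch_period, dtype=float).reshape(-1, 1)
    if mfcc.shape[0] != pitch.shape[0]:
        raise ValueError("mfcc and pitch_period must have the same frame count")
    return np.hstack([mfcc, pitch])
```

For a pure tone `cos(w * n)` the operator returns the constant `sin(w) ** 2`, which is a convenient sanity check. In a full system the fused vectors from the frames kept by the mask would then be used to train and score the speaker GMMs.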
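References:
[1] Zheng Fang. Voiceprint recognition technology and its application status[J]. Journal of Information Security Research, 2016, 2(1):44-57. (in Chinese)
[2] Chen Qiang. Research and implementation of a GMM-based speaker recognition system[D]. Wuhan: Wuhan University of Technology, 2010:6-7. (in Chinese)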
[3] Sarkar S, Rao K S, Nandi D. Multilingual speaker recognition on Indian languages[C]//2013 Annual IEEE India Conference(INDICON). Mumbai, India, 2013:1-5.
[4] Bhattacharjee U, Sarmah K. A multilingual speech database for speaker recognition[C]//IEEE International Conference on Signal Processing, Computing and Control. Hong Kong, China, 2012:1-5.
[5] Ma B, Meng H. English-Chinese bilingual text-independent speaker verification[C]//IEEE International Conference on Acoustics, Speech, and Signal Processing. Montreal, Canada, 2004:V-293-6.
[6] Lu L, Dong Y, Zhao X. The effect of language factors for robust speaker recognition[C]//IEEE International Conference on Acoustics, Speech, and Signal Processing. Taipei, Taiwan, 2009:4217-4220.
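[7] Luo Qifan. Research on speaker verification methods based on glottal information[D]. Hangzhou: Hangzhou Dianzi University, 2014:16-17. (in Chinese)
[8] Fang Andong. Voiceprint feature extraction and recognition in complex backgrounds[D]. Changsha: Central South University of Forestry and Technology, 2014:35-45. (in Chinese)
[9] Zhang Xuefeng, Dong Yuan. Insight into the role of pitch information in text-independent speaker recognition[C]//Proceedings of the 8th National Conference on Man-Machine Speech Communication. Beijing, 2005:214-217.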
[10] Kim S, Eriksson T, Kang H G. A pitch synchronous feature extraction method for speaker recognition[C]// IEEE International Conference on Acoustics, Speech, and Signal Processing. Montreal, Canada, 2004:I-405-8.
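[11] Wang Yiyuan, Zhao Liming. Voiced segment extraction based on wavelet transform and the Teager energy operator[J]. Control Engineering of China, 2004, 11(S2):99-101. (in Chinese)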
[12] Bahoura M, Rouat J. Wavelet speech enhancement based on the Teager energy operator[J]. IEEE Signal Processing Letters, 2001, 8(1):10-12.
[13] Teager H. Some observations on oral air flow during phonation[J]. IEEE Transactions on Acoustics Speech & Signal Processing, 1980,28(5):599-601.
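[14] Liu Shi. Research on GMM-based voiceprint recognition technology[D]. Chengdu: University of Electronic Science and Technology of China, 2012:34-35. (in Chinese)
[15] Cao Rensong. Negative transfer of Chinese tone characteristics to the learning of English intonation[J]. Journal of Dalian Maritime University (Social Sciences Edition), 2008, 7(3):189-191. (in Chinese)
[16] Xu Shirong. Focusing on tone teaching: overcoming the difficulties of ethnic Korean learners of Chinese[J]. Chinese Language Learning, 1980(5):1-5. (in Chinese)
[17] Zhang Lili. Error analysis of third-tone acquisition by Japanese beginners and strategies for resolving it[D]. Dalian: Liaoning Normal University, 2014:1-4. (in Chinese)
[18] Hou Hongxia. Error analysis and coping strategies for the acquisition of Chinese initials, finals, and tones by elementary-level students at UB Shengri (升日) Middle School in Mongolia[D]. Xi'an: Northwest University, 2014:34-37. (in Chinese)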
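Memo
Received: 2017-04-07. Funding: Supported by the Natural Science Foundation of the Jilin Provincial Department of Science and Technology (20140101225JC).
*Corresponding author: JIN Xiaofeng (b. 1970), male, professor; research interest: intelligent information processing.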