LI Mingyu, JIN Xiaofeng*. Research on automatic segmentation algorithm of Korean speech syllables[J]. Journal of Yanbian University, 2019, 45(02): 128-135.
Research on an automatic segmentation algorithm for Korean speech syllables
- Article ID:
- 1004-4353(2019)02-0128-08
- Keywords:
- Korean speech corpus; automatic corpus annotation; Seneff auditory model; syllable segmentation
- Classification number:
- TP391.4
- Document code:
- A
- Abstract:
- To address the low efficiency of manual annotation of speech corpora, this paper proposes an automatic syllable segmentation method for continuous Korean speech corpora. First, the Seneff auditory model is used to extract feature parameters from the audio, such as the envelope detection response and the generalized synchrony detection response. Second, candidate syllable boundary positions are determined according to the pronunciation characteristics of Korean. Finally, false boundaries are eliminated by silent-segment and fricative detection to improve boundary detection accuracy. Experimental results show that the accuracy of the proposed method (93.56%) is 14.59% higher than that of the traditional segmentation algorithm based on the Seneff auditory model, while its recall (86.43%) is 1.69% lower. Overall, the proposed algorithm outperforms the traditional segmentation algorithm based on the Seneff auditory model.
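The last step of the abstract (removing false boundaries through silence and fricative detection) and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: per-frame energy and zero-crossing rate stand in for the Seneff auditory model responses, "accuracy" is read here as precision, and the function names, thresholds, and frame tolerance are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def prune_false_boundaries(
    candidates: Sequence[int],
    energy: Sequence[float],
    zcr: Sequence[float],
    silence_thresh: float = 0.01,        # assumed value, not from the paper
    fricative_zcr_thresh: float = 0.35,  # assumed value, not from the paper
) -> List[int]:
    """Keep only candidate frames that are neither silent nor fricative-like."""
    kept = []
    for frame in candidates:
        is_silent = energy[frame] < silence_thresh
        # Assumption: a non-silent frame with a high zero-crossing rate is fricative-like.
        is_fricative_like = (not is_silent) and zcr[frame] > fricative_zcr_thresh
        if not (is_silent or is_fricative_like):
            kept.append(frame)
    return kept


def precision_recall(
    detected: Sequence[int], reference: Sequence[int], tolerance: int = 2
) -> Tuple[float, float]:
    """Match each reference boundary to at most one detected boundary within `tolerance` frames."""
    unused = list(detected)
    hits = 0
    for ref in reference:
        match = next((d for d in unused if abs(d - ref) <= tolerance), None)
        if match is not None:
            unused.remove(match)
            hits += 1
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Toy per-frame features (made-up numbers, purely illustrative).
energy = [0.001, 0.002, 0.30, 0.40, 0.35, 0.05, 0.30, 0.45, 0.40, 0.002]
zcr = [0.10, 0.10, 0.10, 0.10, 0.10, 0.60, 0.10, 0.10, 0.10, 0.10]

kept = prune_false_boundaries([1, 4, 5, 9], energy, zcr)
scores = precision_recall(kept, reference=[4, 7])
```

In this toy input, pruning discards the candidates in silent frames (1 and 9) and in the high zero-crossing frame (5). Every surviving boundary is then correct, but one reference boundary is missed, which is the same kind of trade-off between accuracy and recall that the abstract reports.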
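Memo
Received: 2019-02-17. *Corresponding author: JIN Xiaofeng (born 1970), male, professor; research interest: intelligent information processing.
*Funding: "13th Five-Year Plan" Science and Technology Project of the Education Department of Jilin Province (JJKH20191126KJ); Yanbian University World-Class Discipline Construction Cultivation Project (18YLPY14)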