HUANG Zhenghao, CUI Rongyi*. Design of translation assistant system based on automatic extraction of terms[J]. Journal of Yanbian University, 2017, 43(03): 259-263.
Design of a translation assistant system for scientific and technical literature based on automatic term extraction
- Title:
- Design of translation assistant system based on automatic extraction of terms
- CLC number:
- TP391.41
- Document code:
- A
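- Abstract (translated from the Chinese):
- A translation assistant system is designed for a Chinese-Korean integrated science and technology information platform. First, candidate terms are obtained according to word-formation characteristics determined from keywords, and mutual information is used to evaluate the candidates, which achieves automatic term extraction. Second, existing terms, newly extracted terms, term translations, and historical translation records are stored in the system database to build a terminology database. Finally, a user interface is designed for translators, through which they can obtain translations of existing terms, similar translations for new terms, and historical translation data drawn from the translation memory. Test results show that the automatic term extraction and auxiliary translation generation functions meet the intended design goals: the recall of the term extraction algorithm reaches 61.8%, and 66.9% after the optimization method is applied.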
- Abstract:
- This paper describes the design of a computer-aided translation system for a Chinese-Korean science and technology information platform. First, candidate terms are extracted based on the word-formation characteristics of keywords, and mutual information is used to evaluate the candidates for automatic term extraction. Second, existing terms, newly extracted terms, term translations, and historical translation records are stored in the system database to build a terminology database. Finally, a user interface is designed for translators, through which they can obtain translations of existing terms, similar translations for new terms, and historical translation data based on the translation memory. System tests show that the automatic term extraction and auxiliary translation functions reach the desired goals. The recall of the term extraction algorithm is 61.8%, rising to 66.9% after the optimization method is applied. Auxiliary translations are generated with an average delay of 0.031 seconds and an MRR of 0.951, so the test results meet users' needs.
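
The scoring and evaluation ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says "mutual information", so pointwise mutual information (PMI) over adjacent word pairs is an assumption here, and the function names `pmi_scores`, `recall`, and `mean_reciprocal_rank` are hypothetical. Recall and MRR follow their standard definitions.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Score each adjacent word pair by pointwise mutual information.

    PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), with probabilities
    estimated from raw unigram and bigram counts in `tokens`.
    Assumption: candidates are adjacent word pairs; the paper's actual
    candidate generation uses keyword word-formation features.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


def recall(extracted, gold):
    """Fraction of gold-standard terms that the extractor found."""
    gold = set(gold)
    return len(gold & set(extracted)) / len(gold) if gold else 0.0


def mean_reciprocal_rank(ranked_lists, correct):
    """Mean over queries of 1/rank of the correct item (0 if absent)."""
    total = 0.0
    for candidates, answer in zip(ranked_lists, correct):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == answer:
                total += 1.0 / rank
                break
    return total / len(ranked_lists) if ranked_lists else 0.0


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    tokens = ["machine", "translation", "uses",
              "machine", "translation", "memory"]
    for pair, score in sorted(pmi_scores(tokens).items()):
        print(pair, round(score, 3))
    print(recall(["a", "b"], ["a", "c"]))
    print(mean_reciprocal_rank([["x", "y"], ["y", "x"]], ["x", "x"]))
```

In a pipeline like the one described, candidates whose score exceeds some threshold would be kept as terms. The threshold and the tokenization are left open here because the abstract does not specify them.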
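Memo
Received: 2017-04-19. *Corresponding author: CUI Rongyi (born 1962), male, Ph.D., professor; research interests: pattern recognition and intelligent computing.
Funding: Natural Science Foundation of Jilin Province (20140101186JC); Yanbian University and Yanbian Prefecture Science and Technology Information Service Center cooperation project (延大科合字[2016]1号)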