GUO Hongzhuang, JIN Xiaofeng. Korean ancient books character detection method based on improved HRCenterNet model[J]. Journal of Yanbian University, 2022, (03): 235-241.
Korean Ancient Book Character Detection Method Based on an Improved HRCenterNet Model
- Title:
- Korean ancient books character detection method based on improved HRCenterNet model
- Article ID:
- 1004-4353(2022)03-0235-07
- Keywords:
- ancient book character detection; Korean ancient books; Involution operator; ECA attention mechanism
- CLC number:
- TP391.1
- Document code:
- A
- Abstract:
- To reduce false and missed detections of small characters in Korean ancient books, this paper proposes a character detection method based on an improved HRCenterNet model. First, the 3×3 convolution in the Bottleneck module of HRCenterNet is replaced with the Involution operator, turning the Bottleneck module into an Involution-Bottleneck module. Second, the Involution-Bottleneck module is extended with the efficient channel attention (ECA) mechanism to form the IENeck module, on which the improved HRCenterNet model is built. Finally, the improved model and the original model are each trained on a Korean ancient book dataset, and their precision, recall, and F1 score are measured at different IoU thresholds. The experimental results show that when IoU ≥ 0.6, the improved HRCenterNet model outperforms the original model in precision, recall, and F1 on the Korean ancient book dataset, and that the higher the IoU value, the better the model's detection performance. This indicates that the improved HRCenterNet model is significantly better than the original model and can be applied to character detection in Korean ancient books.
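The evaluation described in the abstract (precision, recall, and F1 at different IoU thresholds) can be illustrated with a minimal sketch. The abstract does not state how predicted boxes are matched to ground-truth boxes, so the greedy one-to-one matching below is an assumption, and the names `iou`, `precision_recall_f1`, and the toy boxes are hypothetical.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(
    preds: List[Box], truths: List[Box], threshold: float
) -> Tuple[float, float, float]:
    """Count a prediction as a true positive when its best still-unmatched
    ground-truth box has IoU >= threshold (greedy matching, an assumption)."""
    matched = set()
    true_positives = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue  # each ground-truth box may be claimed only once
            value = iou(p, t)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= threshold:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy boxes for illustration only; they are not data from the paper.
    truths = [(0, 0, 10, 10), (20, 0, 30, 10)]
    preds = [(0, 0, 10, 10), (21, 0, 31, 10), (50, 50, 60, 60)]
    for thr in (0.6, 0.7, 0.8, 0.9):
        p, r, f = precision_recall_f1(preds, truths, thr)
        print(f"IoU >= {thr}: precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Raising the threshold makes the criterion stricter: a slightly shifted box that counts as correct at IoU ≥ 0.6 may be rejected at IoU ≥ 0.9, which is why results are reported per threshold.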
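Memo
Received: 2022-04-12
Funding: Yanbian University World-Class Discipline Construction Project in Foreign Languages and Literature (18YLPY14); Major Program of the National Social Science Fund of China (18ZDA306)
First author: GUO Hongzhuang (born 1998), male, master's student; research interest: computer vision.
Corresponding author: JIN Xiaofeng (born 1970), male, master's degree, professor; research interests: speech information processing, computer vision, and robotics.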