JIN Cheng, CUI Rongyi, ZHAO Yahui*. Research on correlation analysis between college entrance examination information and college program design course scores based on machine learning[J]. Journal of Yanbian University, 2020, 46(04): 366-370.
Machine-learning-based correlation analysis of college entrance examination information and university programming course grades
- Title:
- Research on correlation analysis between college entrance examination information and college program design course scores based on machine learning
- Article ID:
- 1004-4353(2020)04-0366-05
- Keywords:
- college grades; influential factor; decision tree; random forest
- CLC number:
- TP399
- Document code:
- A
- Abstract:
- To study the correlation between students' college entrance examination information and their scores in a computer programming course (C language), a correlation prediction and analysis model based on the random forest algorithm is proposed. First, data on students majoring in computer science and technology at Yanbian University from 2014 to 2016 were cleaned and screened, and the C language examination scores were divided into five categories. Second, the students' college entrance examination information was used as features to train a random forest classification model. Finally, the LIME interpretability model was used to analyze the correlation of the features that most influence the random forest. The experimental results show that the main features affecting C language scores are student origin, total score, ethnicity, mathematics, and Chinese. These results can effectively identify the main factors related to different students' academic performance and provide a reference for teachers designing suitable teaching modes for different groups of students.
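The first two steps described in the abstract (binning scores into five classes, then training a random forest on entrance examination features) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the cut points in `CUTS`, the feature layout `[total, math, chinese]`, the synthetic data, and all function names are hypothetical. The LIME step is left out because it depends on an external package.

```python
import random
from bisect import bisect_right
from collections import Counter

# ASSUMPTION: these cut points are hypothetical; the source only says "5 categories".
CUTS = [60, 70, 80, 90]

def grade_class(score):
    """Map a 0-100 score to a class index from 0 (lowest) to 4 (highest)."""
    return bisect_right(CUTS, score)

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent class label."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, rng, depth=0, max_depth=4):
    """Grow one tree; each split looks at a random subset of the features."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)  # leaf node: a class index
    n_feat = len(rows[0])
    feats = rng.sample(range(n_feat), max(1, int(n_feat ** 0.5)))
    best = None
    for f in feats:
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            # Size-weighted impurity of the two children (lower is better).
            cost = (len(left) * gini([labels[i] for i in left])
                    + len(right) * gini([labels[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:
        return majority(labels)  # no usable split on the sampled features
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are (feature, threshold, lo, hi)."""
    while isinstance(node, tuple):
        f, t, lo, hi = node
        node = lo if row[f] <= t else hi
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])

# Synthetic stand-in data (NOT the study's data): [total, math, chinese].
demo_rng = random.Random(1)
X, y = [], []
for _ in range(60):
    math_s, chinese_s = demo_rng.randint(60, 150), demo_rng.randint(60, 150)
    total = math_s + chinese_s + demo_rng.randint(150, 400)
    c_score = min(100, max(0, 0.5 * math_s + demo_rng.gauss(0, 8)))
    X.append([total, math_s, chinese_s])
    y.append(grade_class(c_score))

forest = fit_forest(X, y)
print(predict_forest(forest, [520, 140, 110]))  # predicted class index, 0..4
```

In practice a library implementation would normally replace the hand-written forest. Categorical features such as student origin and ethnicity would also need to be encoded as numbers before training.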
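Memo
Received: 2020-07-30. Funding: Higher Education Research Project of the Jilin Provincial Higher Education Society (JGJX2018D347)
*Corresponding author: ZHAO Yahui (b. 1974), female, professor; research interests: natural language text processing and educational information processing.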