SHEN Lin. Random ensemble attribute reduction of neighborhood rough sets[J]. Journal of Yanbian University, 2019, 45(01): 49-54.
Random ensemble attribute reduction of neighborhood rough sets
- Title:
- Random ensemble attribute reduction of neighborhood rough sets
- Article ID:
- 1004-4353(2019)01-0049-06
- CLC number:
- TP181
- Document code:
- A
- Abstract:
- The traditional discernibility matrix has high space complexity, which makes it difficult to apply to large-scale data. To address this, an attribute reduction algorithm based on random sampling is proposed. First, several small sample subsets are drawn at random, which lowers the space complexity of the discernibility matrix. Second, attribute reduction is performed on each sample subset, and a weight is computed for each resulting attribute subset. Finally, the few attribute subsets with the highest weights are tested to find the one with the highest accuracy. Experimental results show that the proposed method occupies 2 to 3 orders of magnitude less space than the traditional discernibility matrix while achieving roughly the same accuracy.
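The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract does not give the neighborhood radius, the per-subset reduction heuristic, or the weight definition, so `delta`, the greedy hitting-set reduction, and the frequency-based weight are all hypothetical choices, as are all function and parameter names. The space saving comes from the fact that a discernibility matrix over n samples compares n(n-1)/2 sample pairs, whereas each random subset of m samples only needs m(m-1)/2.

```python
import random
from collections import Counter
from itertools import combinations


def discernibility_entries(X, y, delta):
    """For each pair of samples with different labels, collect the attributes
    whose values differ by more than delta (assumed neighborhood radius)."""
    entries = []
    for i, j in combinations(range(len(X)), 2):
        if y[i] != y[j]:
            attrs = frozenset(a for a in range(len(X[i]))
                              if abs(X[i][a] - X[j][a]) > delta)
            if attrs:
                entries.append(attrs)
    return entries


def greedy_reduct(entries):
    """Assumed heuristic: greedy hitting set. Repeatedly pick the attribute
    that occurs in the most entries not yet covered."""
    reduct = set()
    remaining = list(entries)
    while remaining:
        counts = Counter(a for e in remaining for a in e)
        # sorted() makes ties resolve to the smallest attribute index
        best = max(sorted(counts), key=lambda a: counts[a])
        reduct.add(best)
        remaining = [e for e in remaining if best not in e]
    return frozenset(reduct)


def random_ensemble_reducts(X, y, n_subsets=20, subset_size=30,
                            delta=0.1, top_k=3, seed=0):
    """Draw random sample subsets, reduce each one, and weight every
    attribute subset by how often it was produced (assumed weighting)."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_subsets):
        idx = rng.sample(range(len(X)), min(subset_size, len(X)))
        entries = discernibility_entries([X[i] for i in idx],
                                         [y[i] for i in idx], delta)
        reduct = greedy_reduct(entries)
        if reduct:  # skip subsets that contained no discernible pair
            votes[reduct] += 1
    return [(sorted(r), c / n_subsets) for r, c in votes.most_common(top_k)]
```

The returned candidates would then each be evaluated with a classifier of the reader's choice, keeping the attribute subset with the highest accuracy; that final testing step is left out of the sketch.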
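Memo
Received: 2019-01-21
Funding: Project of the Education Department of Fujian Province (JA15458)
About the author: SHEN Lin (born 1983), male, lecturer; research interests include artificial intelligence, machine learning, and rough set theory.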