WANG Bo, JIA Yan, YANG Shu-qiang, et al. Feature selection algorithm for uncertain text classification[J]. 2009, 30(8): 32-38.
Feature selection algorithm for uncertain text classification
Abstract
A novel feature selection algorithm, FSUNT, is proposed based on the Hilbert-Schmidt independence criterion (HSIC), with a focus on the vagueness and uncertainty that may arise during feature selection. For text data in which the class labels are uncertain while all other feature values are fixed, FSUNT evaluates the Hilbert-Schmidt dependence between each feature and the uncertain class labels, ranks the features accordingly, and selects the final feature subset. Extensive experiments on real and simulated datasets show that the algorithm achieves good classification performance and stability.
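The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the FSUNT algorithm itself, whose details the abstract does not give. It assumes the standard biased empirical HSIC estimate, trace(KHLH)/(n-1)^2, linear kernels, and uncertain labels encoded as one class-probability vector per document. The function names `linear_kernel`, `center`, `hsic` and `rank_features`, and the toy data, are hypothetical.

```python
def linear_kernel(rows):
    """Gram matrix of inner products between the given row vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def center(K):
    """Double-center a Gram matrix: return H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(K, L):
    """Biased empirical HSIC estimate: trace(K H L H) / (n - 1)^2."""
    n = len(K)
    Kc, Lc = center(K), center(L)
    # trace(Kc Lc) equals trace(K H L H) because H is idempotent.
    trace = sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


def rank_features(X, P, k):
    """Return the indices of the k features with the highest HSIC score.

    X: list of documents, each a list of numeric feature values (assumed).
    P: list of per-document class-probability vectors (assumed encoding
       of the uncertain labels; not taken from the paper).
    """
    L = linear_kernel(P)
    scores = []
    for j in range(len(X[0])):
        column = [[doc[j]] for doc in X]  # one feature as n one-dim vectors
        scores.append(hsic(linear_kernel(column), L))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data: feature 0 follows the label trend, feature 1 does not,
# feature 2 is constant.
X = [[1, 0, 5], [1, 1, 5], [0, 0, 5], [0, 1, 5]]
P = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
top = rank_features(X, P, 1)
```

In this toy setting a constant feature has a zero centered kernel and therefore an HSIC score of zero, so it can never outrank a feature that varies with the label probabilities. A nonlinear kernel such as a Gaussian kernel could be substituted for `linear_kernel` without changing the rest of the sketch.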