Feature Selection Based on MIC (Maximal Information Coefficient)

2021-05-21 16:39:10

The maximal information coefficient (MIC) is sometimes also called the maximal mutual information coefficient.

I previously wrote an introduction to MIC that covers how it works: https://cloud.tencent.com/developer/article/1827564
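
For quick reference, the definition usually given in the MIC literature (Reshef et al., 2011) for a sample $D$ of $n$ points is sketched below. The symbols $I^{*}$ and $B(n)$ and the default exponent $0.6$ follow that paper's conventions; they are not taken from this post.

$$
\mathrm{MIC}(D) = \max_{x y < B(n)} \frac{I^{*}(D, x, y)}{\log \min(x, y)}, \qquad B(n) = n^{0.6} \text{ by default}
$$

Here $I^{*}(D, x, y)$ is the largest mutual information achievable by any grid with $x$ columns and $y$ rows laid over the data. The normalization keeps MIC in $[0, 1]$.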

For installing the MATLAB package used here (minepy), see: https://cloud.tencent.com/developer/article/1827541

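Feature selection steps

① Compute the MIC between every pair of dimensions (features). The larger the MIC, the more strongly the two dimensions depend on each other, i.e. the more redundant they are.

② Find the dimensions whose MIC with the other dimensions is low, and select these features using a threshold.

③ Train an SVM on the selected features.

④ Evaluate the trained model's error rate on the test set.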
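
Step ② can be illustrated on a small, made-up MIC matrix. This is only a toy sketch: the 4x4 matrix `M`, the variable names, and the values are hypothetical, and the cutoff `0.4` is borrowed from the code in this post rather than derived from anything.

```matlab
% Hypothetical symmetric MIC matrix for 4 features (illustrative values only)
M = [1.0 0.9 0.2 0.3;
     0.9 1.0 0.1 0.2;
     0.2 0.1 1.0 0.3;
     0.3 0.2 0.3 1.0];
cutoff = 0.4;                     % pairs with MIC <= cutoff count as weakly related

weak = (M > 0) & (M <= cutoff);   % logical matrix marking weakly related pairs
counts = sum(weak);               % per feature: how many features it is weakly related to
threshold = mean(counts);         % the average count serves as the selection threshold
selected = counts > threshold;    % logical mask, usable directly as a column index

disp(counts);                     % 2 2 3 3
disp(find(selected));             % 3 4
```

In this toy case features 1 and 2 are strongly related to each other (MIC 0.9), so each has fewer weak relationships than average and both are dropped, while features 3 and 4 are kept.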


MATLAB code:

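clc
% Load training/test features (samples x Dim) and labels
load train_F.mat;
load train_L.mat;
load test_F.mat;
load test_L.mat;
Dim = 22;  % number of features (columns of train_F)
MIC_matrix = zeros(Dim, Dim);
% Step 1: MIC between every pair of features
for i = 1:Dim
    for j = 1:Dim
        X_v = train_F(:,i).';  % pass each feature column to mine as a row vector
        Y_v = train_F(:,j).';
        A = mine(X_v, Y_v);
        MIC_matrix(i, j) = A.mic;
    end
end
% Step 2: mark weakly related pairs (0 < MIC <= 0.4) with 1, all others with 0
MIC_matrix(MIC_matrix>0.4) = 0;
MIC_matrix(MIC_matrix~=0) = 1;
% For each feature, count the features it is only weakly related to
inmodel = sum(MIC_matrix);
% Keep the features whose count exceeds the average count
threshold = sum(inmodel)/Dim;
inmodel = inmodel > threshold;  % logical mask; a 0/1 double vector is not a valid index

% Step 3: train an SVM on the selected features
model = libsvmtrain(train_L,train_F(:,inmodel));
% Step 4: predict on the test set and compute the error rate
[predict_label, ~, ~] = libsvmpredict(test_L,test_F(:,inmodel),model);
error=0;
for j=1:length(test_L)
    if(predict_label(j,1) ~= test_L(j,1))
        error = error + 1;
    end
end
error = error/length(test_L);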
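
The counting loop at the end of the listing can also be written as one vectorized expression. A minimal sketch, using made-up label vectors in place of `predict_label` and `test_L` (both assumed to be column vectors of equal length); the name `error_rate` is introduced here only for illustration.

```matlab
% Hypothetical labels standing in for the libsvmpredict output and test_L
predict_label = [1; -1; 1;  1; -1];
test_L        = [1;  1; 1; -1; -1];

% Fraction of mismatched predictions = misclassification rate
error_rate = mean(predict_label ~= test_L);
fprintf('error rate = %.2f\n', error_rate);   % prints: error rate = 0.40
```

Using a separate name such as `error_rate` also avoids shadowing MATLAB's built-in `error` function.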

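The code is licensed under Apache 2.0.

The article is licensed under the Creative Commons BY-NC-SA 4.0 license.

OmegaXYZ - All rights reserved. Please credit the source when reposting.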
