NSGA2算法特征选择MATLAB实现(多目标)

2021-05-21 16:44:31 浏览数 (1)

利用nsga2进行进行特征选择其主要思想是:将子集的选择看作是一个搜索寻优问题(wrapper方法),生成不同的组合,对组合进行评价,再与其他的组合进行比较。这样就将子集的选择看作是一个是一个优化问题。

需要优化的两个目标为特征数和精度。

nsga2是一个多目标优化算法。

具体的nsga2通用算法请看:https://www.omegaxyz.com/2018/01/22/new_nsga2/

具体的特征选择代码在上述代码的基础上改了两个①主函数②评价函数,增加了一个数据分成训练集和测试集的函数:

MATLAB

代码语言:txt复制
function divide_datasets()
load Parkinson.mat;
dataMat=Parkinson_f;
len=size(dataMat,1);
%归一化
 
maxV = max(dataMat);
minV = min(dataMat);
range = maxV-minV;
newdataMat = (dataMat-repmat(minV,[len,1]))./(repmat(range,[len,1]));
 
Indices   =  crossvalind('Kfold', length(Parkinson_label), 10);
site = find(Indices==1|Indices==2|Indices==3);
train_F = newdataMat(site,:);
train_L = Parkinson_label(site);
site2 = find(Indices~=1&Indices~=2&Indices~=3);
test_F = newdataMat(site2,:);
test_L =Parkinson_label(site2);
save train_F train_F;
save train_L train_L;
save test_F test_F;
save test_L test_L;
end
%what doesn't kill you makes you stronger, stand a little taller,doesn't mean i'm over cause you're gonw.

MATLAB代码主函数:

MATLAB

代码语言:txt复制
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
%此处可以更改
 
%更多机器学习内容请访问omegaxyz.com
 
clc;
clear;
pop = 500; %种群数量
gen = 100; %迭代次数
M = 2; %目标数量
V = 22; %维度
min_range = zeros(1, V); %下界
max_range = ones(1,V); %上界
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
%特征选择
 
divide_datasets();
global answer
answer=cell(M,3);
global choice     %选出的特征个数
choice=0.8;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
chromosome = initialize_variables(pop, M, V, min_range, max_range);
chromosome = non_domination_sort_mod(chromosome, M, V);
 
for i = 1 : gen
    pool = round(pop/2);
    tour = 2;
    parent_chromosome = tournament_selection(chromosome, pool, tour);
    mu = 20;
    mum = 20;
    offspring_chromosome = genetic_operator(parent_chromosome,M, V, mu, mum, min_range, max_range);
    [main_pop,~] = size(chromosome);
    [offspring_pop,~] = size(offspring_chromosome);
    clear temp
    intermediate_chromosome(1:main_pop,:) = chromosome;
    intermediate_chromosome(main_pop   1 : main_pop   offspring_pop,1 : M V) = offspring_chromosome;
    intermediate_chromosome = non_domination_sort_mod(intermediate_chromosome, M, V);
    chromosome = replace_chromosome(intermediate_chromosome, M, V, pop);
    if ~mod(i,100)
        clc;
        fprintf('%d generations completedn',i);
    end
end
 
if M == 2
    plot(chromosome(:,V   1),chromosome(:,V   2),'*');
    xlabel('f_1'); ylabel('f_2');
    title('Pareto Optimal Front');
elseif M == 3
    plot3(chromosome(:,V   1),chromosome(:,V   2),chromosome(:,V   3),'*');
    xlabel('f_1'); ylabel('f_2'); zlabel('f_3');
    title('Pareto Optimal Surface');
end

评价函数(利用林志仁SVM进行训练):

MATLAB

代码语言:txt复制
function f = evaluate_objective(x, M, V, i)
f = [];
global answer
global choice
load train_F.mat;
load train_L.mat;
load test_F.mat;
load test_L.mat;
temp_x = x(1:V);
inmodel = temp_x>choice;%%%%%设定恰当的阈值选择特征
f(1) = sum(inmodel(1,:));
answer(i,1)={f(1)};
model = libsvmtrain(train_L,train_F(:,inmodel), '-s 0 -t 2 -c 1.2 -g 2.8');
[predict_label, ~, ~] = libsvmpredict(test_L,test_F(:,inmodel),model,'-q'); 
error=0;
for j=1:length(test_L)
    if(predict_label(j,1) ~= test_L(j,1))
        error = error 1;
    end
end
error = error/length(test_L);
f(2) = error;
answer(i,2)={error};
answer(i,3)={inmodel};
end

选的的数据集请从UCI上下载。

结果:

①pareto面

最后粒子的数据(选出的特征数和精确度)

PSO单目标特征选择请见:https://cloud.tencent.com/developer/article/1827562

0 人点赞