Univariate feature selection

2022-05-29 10:27:32

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, svm
from sklearn.feature_selection import SelectPercentile, f_classif
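# Load the iris dataset (150 samples, 4 informative features)
iris = datasets.load_iris()
# Generate 20 uninformative noise features
E = np.random.uniform(0, 0.1, size=(len(iris.data), 20))
# Append the noise features to the original feature matrix
X = np.hstack((iris.data, E))
y = iris.target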
plt.figure(1)
plt.clf()
X_indices = np.arange(X.shape[-1])
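# Univariate feature selection based on the F-test (keep the top 10% of features)
selector = SelectPercentile(f_classif, percentile=10)
selector.fit(X, y)
# Turn the p-values into scores and normalize them to [0, 1]
scores = -np.log10(selector.pvalues_)
scores /= scores.max()
plt.bar(X_indices - .45, scores, width=.2,
        label=r'Univariate score ($-Log(p_{value})$)',
        color='darkorange', edgecolor='black')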
# Compare with the weights of a linear SVM trained on all features
clf = svm.SVC(kernel='linear')
clf.fit(X, y)
# Sum the squared coefficients over the class pairs, then normalize
svm_weights = (clf.coef_ ** 2).sum(axis=0)
svm_weights /= svm_weights.max()
plt.bar(X_indices - .25, svm_weights, width=.2,
        label='SVM weight', color='navy', edgecolor='black')
# Retrain the linear SVM on the selected features only
clf_selected = svm.SVC(kernel='linear')
clf_selected.fit(selector.transform(X), y)
svm_weights_selected = (clf_selected.coef_ ** 2).sum(axis=0)
svm_weights_selected /= svm_weights_selected.max()
plt.bar(X_indices[selector.get_support()] - .05, svm_weights_selected, width=.2,
        label='SVM weights after selection', color='c', edgecolor='black')
plt.title("Comparing feature selection")
plt.xlabel('Feature number')
plt.yticks(())
plt.axis('tight')
plt.legend(loc='upper right')
plt.show()

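Algorithm: univariate feature selection scores each feature independently with a statistical test (here the F-test, f_classif) and keeps only the highest-scoring features (here the top 10%). Once the uninformative noise features are dropped, the linear SVM puts larger weights on the significant features, which can improve classification.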
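The selection step on its own can be sketched as follows. This is a minimal illustration rather than part of the original script: the fixed seed and the names rng, noise, selected and X_reduced are assumptions added here for reproducibility and readability. It lists which feature indices survive the selection and the shape of the reduced matrix.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectPercentile, f_classif

# Fixed seed (an assumption, not in the original script) so the noise is reproducible
rng = np.random.RandomState(0)

iris = load_iris()
# 20 uninformative noise features appended after the 4 iris features
noise = rng.uniform(0, 0.1, size=(iris.data.shape[0], 20))
X = np.hstack((iris.data, noise))
y = iris.target

# Score every feature independently with the F-test and keep the top 10%
selector = SelectPercentile(f_classif, percentile=10).fit(X, y)

# Indices of the features that were kept (0-3 are the original iris features)
selected = np.flatnonzero(selector.get_support())
# Feature matrix restricted to the kept columns
X_reduced = selector.transform(X)

print("selected feature indices:", selected)
print("reduced shape:", X_reduced.shape)
```

Indices 0 to 3 correspond to the original iris features, so the printed indices should fall in that range if the noise columns were filtered out as intended.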

Links: https://scikit-learn.org/stable/modules/feature_selection.html#feature-selection

https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectKBest.html#sklearn.feature_selection.SelectKBest

https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectPercentile.html#sklearn.feature_selection.SelectPercentile

https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectFdr.html#sklearn.feature_selection.SelectFdr

https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.GenericUnivariateSelect.html#sklearn.feature_selection.GenericUnivariateSelect

https://scikit-learn.org/stable/auto_examples/feature_selection/plot_f_test_vs_mi.html#sphx-glr-auto-examples-feature-selection-plot-f-test-vs-mi-py
