F-test and Mutual Information

2022-05-29 10:28:55

import numpy as np
import matplotlib.pyplot as plt
from sklearn.feature_selection import f_regression, mutual_info_regression

np.random.seed(0)
X = np.random.rand(100, 3)
# y depends linearly on x_1, nonlinearly on x_2, and not at all on x_3
y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * np.random.randn(100)

# Univariate F-test scores, normalized by the maximum
f_test, _ = f_regression(X, y)
f_test /= np.max(f_test)

# Mutual information estimates, normalized the same way
mi = mutual_info_regression(X, y)
mi /= np.max(mi)

# One scatter plot per feature, titled with both scores
plt.figure(figsize=(15, 5))
for i in range(3):
    plt.subplot(1, 3, i + 1)
    plt.scatter(X[:, i], y, edgecolor='black', s=20)
    plt.xlabel("$x_{}$".format(i + 1), fontsize=14)
    if i == 0:
        plt.ylabel("$y$", fontsize=14)
    plt.title("F-test={:.2f}, MI={:.2f}".format(f_test[i], mi[i]), fontsize=16)
plt.show()
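
The F-test score returned by f_regression is a function of the Pearson correlation between each feature and y, which is why it only responds to linear dependence. The sketch below recomputes the statistic by hand on the same synthetic data and compares it with scikit-learn's result. It assumes the default setting center=True, so the degrees of freedom are n - 2; the names r and f_manual are introduced here for illustration only.

```python
import numpy as np
from sklearn.feature_selection import f_regression

rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * rng.randn(100)

n = X.shape[0]

# Pearson correlation between each feature and the target
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

# F statistic of a univariate linear regression with an intercept
# (assumes f_regression's default center=True, hence n - 2)
f_manual = r ** 2 / (1 - r ** 2) * (n - 2)

f_sklearn, _ = f_regression(X, y)

# Compare the hand-computed statistic with scikit-learn's
print(np.allclose(f_manual, f_sklearn))
```

Because the sine term in x_2 has almost no linear correlation with y, its F score stays small even though y clearly depends on it.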

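Algorithm: The F-test and mutual information both score the dependence between a feature and the target. The F-test captures only linear dependence, whereas mutual information captures any kind of dependence between variables, linear or nonlinear. Like the F-test, mutual information can be used for both regression and classification; scikit-learn provides two functions for it, feature_selection.mutual_info_classif (mutual information for classification) and feature_selection.mutual_info_regression (mutual information for regression).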
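
For the classification case, a minimal sketch along the same lines compares f_classif with mutual_info_classif. The data here is hypothetical and not from the original post: the label depends on the first feature monotonically, on the second feature only through its absolute value (a nonlinear, symmetric effect), and not at all on the third.

```python
import numpy as np
from sklearn.feature_selection import f_classif, mutual_info_classif

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(1000, 3))

# Hypothetical labels: monotone in x_1, symmetric (nonlinear) in x_2,
# independent of x_3
y = (X[:, 0] + 2 * np.abs(X[:, 1]) > 1).astype(int)

# ANOVA F-test: compares the class means of each feature
f_scores, _ = f_classif(X, y)

# Mutual information: nearest-neighbor based estimate, seeded for repeatability
mi_scores = mutual_info_classif(X, y, random_state=0)

for i in range(X.shape[1]):
    print("x_{}: F={:.2f}, MI={:.3f}".format(i + 1, f_scores[i], mi_scores[i]))
```

Under these assumptions, the class means of x_2 are nearly identical, so the F-test should score it close to the noise feature, while the mutual information estimate should still pick up the dependence.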

References: Design and Analysis of Experiments

A review of feature selection techniques in bioinformatics

Links: http://appliedpredictivemodeling.com/

https://github.com/scikit-learn/scikit-learn

https://scikit-learn.org/stable/auto_examples/feature_selection/plot_f_test_vs_mi.html#sphx-glr-auto-examples-feature-selection-plot-f-test-vs-mi-py

https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectKBest.html#sklearn.feature_selection.SelectKBest

http://lijiancheng0614.github.io/scikit-learn/modules/generated/sklearn.feature_selection.SelectFpr.html
