Kernel PCA for nonlinear dimensionality reduction

2020-04-20 10:15:00

Most of the techniques in statistics are linear by nature, so in order to capture nonlinearity, we might need to apply some transformation. PCA is, of course, a linear transformation. In this recipe, we'll look at applying nonlinear transformations, and then apply PCA for dimensionality reduction.

Getting ready

Life would be so easy if data was always linearly separable, but unfortunately it's not. Kernel PCA can help to circumvent this issue. Data is first run through the kernel function that projects it onto a different space; then PCA is performed.

To familiarize yourself with the kernel functions, it is a good exercise to think about how to generate data that is separable by the kernel functions available in kernel PCA. Here, we'll do that with the cosine kernel. This recipe will have a bit more theory than the previous recipes.

How to do it...

The cosine kernel works by comparing the angle between two samples represented in the feature space. It is useful when the magnitude of the vectors perturbs the typical distance measure used to compare samples.

As a reminder, the cosine between two vectors is given by the following:

cos(θ) = (A · B) / (‖A‖ ‖B‖)

This means that the cosine between A and B is the dot product of the two vectors, normalized by the product of their individual norms. The magnitudes of vectors A and B have no influence on this calculation.
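As a quick illustration (not part of the original recipe; the vectors a and b below are arbitrary), the same quantity can be computed directly with NumPy:

import numpy as np

a = np.array([1.0, 0.0])
b = np.array([3.0, 3.0])

# dot product normalized by the product of the individual norms
cosine = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cosine)  # ~0.707; rescaling a or b does not change this value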

So, let's generate some data and see how useful it is. First, we'll imagine there are two different underlying processes; we'll call them A and B:

import numpy as np

# Process A is a mixture of two Gaussian blobs centered at (1, 1) and (5, 5).
# The covariance matrices below are taken verbatim from the recipe; they are
# not symmetric, so NumPy may warn, but it will still draw samples.
A1_mean = [1, 1]
A1_cov = [[2, .99], [1, 1]]
A1 = np.random.multivariate_normal(A1_mean, A1_cov, 50)
A2_mean = [5, 5]
A2_cov = [[2, .99], [1, 1]]
A2 = np.random.multivariate_normal(A2_mean, A2_cov, 50)
A = np.vstack((A1, A2))

# Process B is a single blob centered at (5, 0)
B_mean = [5, 0]
B_cov = [[.5, -1], [-.9, .5]]
B = np.random.multivariate_normal(B_mean, B_cov, 100)

Once plotted, the data will look like the following:
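The original post shows the plot only as an image; a minimal matplotlib sketch that reproduces it (assuming the A and B arrays generated above) would be:

import matplotlib.pyplot as plt

# scatter the two processes in the original two-dimensional feature space
plt.scatter(A[:, 0], A[:, 1], c='r', marker='x', label='A')
plt.scatter(B[:, 0], B[:, 1], c='b', marker='o', label='B')
plt.legend()
plt.show()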

By visual inspection, it seems that the two classes come from different processes, but separating them with a single linear cut might be difficult. So, we'll use the kernel PCA with the cosine kernel discussed earlier:

from sklearn import decomposition

# project the stacked dataset onto a single component using the cosine kernel
kpca = decomposition.KernelPCA(kernel='cosine', n_components=1)
AB = np.vstack((A, B))
AB_transformed = kpca.fit_transform(AB)

Visualized in one dimension after the kernel PCA, the dataset looks like the following:
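The figure is again an image in the original post; one way to draw it (a sketch, assuming AB_transformed from the previous snippet, where the first 100 rows of AB came from A and the rest from B) is:

import matplotlib.pyplot as plt

# plot the single kernel PCA component on the x-axis, with a dummy y of zeros
plt.scatter(AB_transformed[:100, 0], np.zeros(100), c='r', marker='x', label='A')
plt.scatter(AB_transformed[100:, 0], np.zeros(100), c='b', marker='o', label='B')
plt.legend()
plt.show()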

Contrast this with PCA without a kernel:
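For reference, the kernel-free projection can be produced the same way (a sketch, reusing AB from above) and then plotted as before:

from sklearn import decomposition

# ordinary linear PCA down to one component for comparison
pca = decomposition.PCA(n_components=1)
AB_transformed_pca = pca.fit_transform(AB)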

Clearly, the kernel PCA does a much better job.

How it works...

There are several different kernels available in addition to the cosine kernel. You can even write your own kernel function. The available kernels are (a sketch using the precomputed option follows the list):

1. poly (polynomial)
2. rbf (radial basis function)
3. sigmoid
4. cosine
5. precomputed
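As an illustration of the precomputed option (a sketch, not from the original recipe, reusing AB from above), you can compute the kernel matrix yourself and hand it to KernelPCA; with the cosine kernel matrix this should reproduce the earlier result (up to sign):

from sklearn import decomposition
from sklearn.metrics.pairwise import cosine_similarity

# with kernel='precomputed', fit_transform expects the kernel (Gram) matrix
# of the samples rather than the raw data
K = cosine_similarity(AB)
kpca_precomputed = decomposition.KernelPCA(kernel='precomputed', n_components=1)
AB_precomputed = kpca_precomputed.fit_transform(K)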

There are also options contingent on the kernel choice. For example, the degree argument specifies the degree of the poly kernel, while gamma affects the rbf, poly, and sigmoid kernels.
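For instance (a sketch; the parameter values here are arbitrary), these options are passed straight to the KernelPCA constructor:

from sklearn import decomposition

# degree is only used by the poly kernel; gamma is used by rbf, poly, and sigmoid
kpca_poly = decomposition.KernelPCA(kernel='poly', degree=2, gamma=1, n_components=1)
kpca_rbf = decomposition.KernelPCA(kernel='rbf', gamma=10, n_components=1)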

The recipe on SVM will cover the rbf kernel function in more detail.

A word of caution: kernel methods are great for creating separability, but they can also cause overfitting if used without care.
