
Reducing dimensionality with PCA

Now it's time to take the math up a level! Principal component analysis (PCA) is the first somewhat advanced technique discussed in this book. While everything ...

2020-04-20

Using Pipelines to combine multiple data preprocessing steps

Pipelines are (at least to me) something I don't think about using often, but are useful. They can be used to tie together many steps into one object. This allow...

2020-04-20

Working with categorical variables

Categorical variables are a problem. On one hand they provide valuable information; on the other hand, it's probably text—either the actual text or integers cor...

2020-04-20

Creating binary features through thresholding

In the last recipe, we looked at transforming our data into the standard normal distribution. Now, we'll talk about another transformation, one that is quite dif...

2020-04-20

Scaling data to the standard normal

A preprocessing step that is almost always recommended is to scale columns to the standard normal. The standard normal is probably the most important distribution of a...

2020-04-20

scikit-learn Cookbook 01

I will again implore you to use some of your own data for this book, but in the event you cannot, we'll learn how we can use scikit-learn to create toy data.

2020-04-20

scikit-learn Cookbook 00

This chapter discusses setting data, preparing data, and premodel dimensionality reduction. These are not the

2020-04-20

I've fallen for these little-known but practical Python libraries!

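Python is a great language. It is one of the fastest-growing programming languages in the world. Time and again it has proven its usefulness in developer roles and in data science roles across industries. The whole ecosystem of Python and its libraries makes it a suitable choice for users all over the world (beginners and advanced users alike)...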

2020-04-02

A hands-on start to machine learning: house price prediction (Part 1)

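In this article we use the California housing prices dataset to build a model from scratch, step by step, that predicts the median house price of each district. The goal is to implement a complete machine learning workflow.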

2020-04-01

Exclusive | The holy trinity of topological machine learning: Gudhi, Scikit-Learn, and Tensorflow (with links and code)

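This article briefly introduces the power of topological data analysis in machine learning and shows how to put it into practice with three Python libraries: Gudhi, Scikit-Learn, and Tensorflow.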

2020-03-26