Stock Data from Yahoo
Data Fetching
Setting Up the Environment
https://github.com/pydata/pandas-datareader

Install with pip: `pip install pandas-datareader`

Alternatively, install with conda instead of pip: `bin/conda install pandas-datareader`
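
The package is installed under the name `pandas-datareader` (hyphen) but imported as `pandas_datareader` (underscore), which is easy to mix up. A minimal sketch for checking that the install worked, using only the standard library; the helper name `is_installed` is hypothetical and not part of any library:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# The import name uses an underscore, unlike the pip package name.
datareader_available = is_installed("pandas_datareader")
print("pandas_datareader installed:", datareader_available)
```

If this prints `False`, the install most likely went into a different Python environment than the one running the script.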
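Fetching Yahoo data with datareader
```python
import pandas_datareader as pdr

# Quick usage check (this call reads the GS10 series from FRED, not Yahoo)
pdr.get_data_fred('GS10')
```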
## Data preview
```python
import pandas_datareader as pdr

# Fetch Alibaba (ticker BABA) daily stock data from Yahoo
alibaba = data = pdr.get_data_yahoo('BABA')

# Show the first five rows
data.head()
```

Date | High | Low | Open | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
2015-05-11 | 87.669998 | 86.059998 | 86.699997 | 86.720001 | 19776900 | 86.720001 |
2015-05-12 | 87.500000 | 86.139999 | 87.050003 | 86.769997 | 16077800 | 86.769997 |
2015-05-13 | 88.470001 | 87.000000 | 87.080002 | 87.529999 | 19015900 | 87.529999 |
2015-05-14 | 88.480003 | 87.529999 | 87.739998 | 88.400002 | 12087900 | 88.400002 |
2015-05-15 | 88.959999 | 88.050003 | 88.510002 | 88.459999 | 13424600 | 88.459999 |
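
The `get_data_yahoo('BABA')` call passes no dates, so the window is whatever the library defaults to at run time; the preview here happens to start on 2015-05-11. The following is a hedged sketch of pinning the range explicitly so the result is reproducible. `fetch_daily` and its `reader` parameter are hypothetical helpers introduced for illustration. `start` and `end` are keyword arguments that `get_data_yahoo` accepts, though whether the Yahoo endpoint still responds depends on the installed version.

```python
import datetime as dt


def fetch_daily(symbol, start, end, reader=None):
    """Fetch daily prices for `symbol` between `start` and `end`.

    `reader` is any callable with the shape reader(symbol, start=..., end=...).
    When omitted, pandas_datareader's Yahoo reader is used (needs network access).
    """
    if start > end:
        raise ValueError("start must not be after end")
    if reader is None:
        # Imported lazily so the helper can be exercised without the package.
        import pandas_datareader as pdr
        reader = pdr.get_data_yahoo
    return reader(symbol, start=start, end=end)


# Date range matching the preview shown in this article.
START = dt.date(2015, 5, 11)
END = dt.date(2020, 5, 8)
```

With the package installed and network access, `alibaba = fetch_daily('BABA', START, END)` would request the same window regardless of the day it is run.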

`alibaba.shape` returns `(1259, 6)`: 1259 rows and 6 columns.

`alibaba.tail()` shows the last five rows:

Date | High | Low | Open | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
2020-05-04 | 195.000000 | 189.529999 | 194.759995 | 191.149994 | 25709400 | 191.149994 |
2020-05-05 | 198.270004 | 194.199997 | 196.380005 | 195.020004 | 22957200 | 195.020004 |
2020-05-06 | 198.910004 | 194.929993 | 197.669998 | 195.169998 | 18598900 | 195.169998 |
2020-05-07 | 198.089996 | 194.779999 | 198.000000 | 196.490005 | 16164600 | 196.490005 |
2020-05-08 | 203.020004 | 198.679993 | 199.800003 | 201.190002 | 23819700 | 201.190002 |
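
As a sanity check on the row count, the 1259 rows can be compared with the number of weekdays between the first and last dates in the preview. This sketch uses only the standard library; `count_weekdays` is a hypothetical helper, and the explanation for the gap is an assumption rather than something checked against an exchange calendar.

```python
import datetime as dt


def count_weekdays(first: dt.date, last: dt.date) -> int:
    """Count Monday-to-Friday dates from `first` to `last`, both inclusive."""
    total_days = (last - first).days + 1
    return sum(
        1
        for offset in range(total_days)
        if (first + dt.timedelta(days=offset)).weekday() < 5  # 0-4 are Mon-Fri
    )


first_row = dt.date(2015, 5, 11)  # first date in head()
last_row = dt.date(2020, 5, 8)    # last date in tail()
rows_reported = 1259              # from alibaba.shape

weekdays = count_weekdays(first_row, last_row)
non_trading_weekdays = weekdays - rows_reported
print(weekdays, non_trading_weekdays)
```

The range contains 1305 weekdays, 46 more than the 1259 rows. The difference is presumably weekdays on which the exchange was closed, about nine per year.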

`alibaba.describe()` summarizes each column:

Statistic | High | Low | Open | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
count | 1259.000000 | 1259.000000 | 1259.000000 | 1259.000000 | 1.259000e+03 | 1259.000000 |
mean | 141.759526 | 138.353728 | 140.171177 | 140.087427 | 1.671934e+07 | 140.087427 |
std | 48.532706 | 47.441216 | 48.067582 | 47.999745 | 9.014614e+06 | 47.999745 |
min | 58.650002 | 57.200001 | 57.299999 | 57.389999 | 3.775300e+06 | 57.389999 |
25% | 90.514999 | 88.485001 | 89.110001 | 89.090000 | 1.098520e+07 | 89.090000 |
50% | 156.000000 | 151.600006 | 154.320007 | 154.100006 | 1.476290e+07 | 154.100006 |
75% | 182.567505 | 177.989998 | 180.660004 | 180.495003 | 1.996240e+07 | 180.495003 |
max | 231.139999 | 227.039993 | 230.050003 | 230.479996 | 9.791410e+07 | 230.479996 |
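
The Volume column is printed in scientific notation, where for example `1.671934e+07` means 1.671934 × 10^7. A small sketch, using only the standard library, that converts the Volume statistics from the `describe()` output into plain share counts; the dictionary is simply a transcription of that column.

```python
# Volume column of alibaba.describe(), transcribed as strings.
volume_stats = {
    "count": "1.259000e+03",
    "mean": "1.671934e+07",
    "std": "9.014614e+06",
    "min": "3.775300e+06",
    "25%": "1.098520e+07",
    "50%": "1.476290e+07",
    "75%": "1.996240e+07",
    "max": "9.791410e+07",
}

# float() parses scientific notation; round() turns the result into an int.
volume_as_int = {name: round(float(text)) for name, text in volume_stats.items()}

for name, value in volume_as_int.items():
    print(f"{name:>5}: {value:,}")
```

Read this way, the mean daily volume is about 16.7 million shares, and the busiest day traded about 97.9 million.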

`alibaba.info()` reports the index range, column dtypes, and memory usage:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1259 entries, 2015-05-11 to 2020-05-08
Data columns (total 6 columns):
High 1259 non-null float64
Low 1259 non-null float64
Open 1259 non-null float64
Close 1259 non-null float64
Volume 1259 non-null int64
Adj Close 1259 non-null float64
dtypes: float64(5), int64(1)
memory usage: 68.9 KB
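
The reported memory usage can be reproduced with simple arithmetic. This sketch assumes every value (float64 or int64) and every index entry (datetime64) occupies 8 bytes, and that the figure counts the index along with the six columns.

```python
rows = 1259          # entries in the DatetimeIndex
columns = 6          # five float64 columns and one int64 column
bytes_per_value = 8  # assumed width of float64, int64 and datetime64 values

data_bytes = rows * columns * bytes_per_value  # the six data columns
index_bytes = rows * bytes_per_value           # the date index
total_bytes = data_bytes + index_bytes
total_kb = total_bytes / 1024

print(f"{total_bytes} bytes = {total_kb:.1f} KB")
```

Under these assumptions the total is 70504 bytes, which rounds to the 68.9 KB shown by `info()`.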