Pandas-15. Window Functions

2019-05-29 17:17:20

The following code builds the test data used in every example below:

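import numpy as np
import pandas as pd

# The values are random, so each run differs; the output shown is from one sample run
df = pd.DataFrame(np.random.randn(10, 4), index=pd.date_range('1/1/2020', periods=10), columns=["A", "B", "C", "D"])
print(df)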
'''
                   A         B         C         D
2020-01-01  1.423760 -0.901543  0.302208 -0.066452
2020-01-02  1.358759 -0.286062 -0.667683  0.957295
2020-01-03  1.680685 -1.200288 -1.027512  1.107693
2020-01-04  0.530800  0.744215 -0.371192 -0.424567
2020-01-05  0.774829  1.247949 -1.720958  1.374363
2020-01-06 -1.205282  0.170897  1.127205 -1.709388
2020-01-07 -0.418933 -2.237936 -0.102924 -0.251697
2020-01-08 -1.228783  1.438976  0.797958  2.991456
2020-01-09  0.894173 -2.297820 -0.808664  0.789931
2020-01-10 -0.077157 -0.905713  0.064675  0.782972
'''

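.rolling() function

  • Rolling statistics over a window of fixed length
  • The window=n parameter is required
  • Chain a statistical function after it, such as mean()

Rolling mean over a window of 5 rows: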

print(df.rolling(window=5).mean())
'''
                   A         B         C         D
2020-01-01       NaN       NaN       NaN       NaN
2020-01-02       NaN       NaN       NaN       NaN
2020-01-03       NaN       NaN       NaN       NaN
2020-01-04       NaN       NaN       NaN       NaN
2020-01-05  1.153767 -0.079146 -0.697027  0.589666
2020-01-06  0.627958  0.135342 -0.532028  0.261079
2020-01-07  0.272420 -0.255033 -0.419076  0.019281
2020-01-08 -0.309474  0.272820 -0.053982  0.396033
2020-01-09 -0.236799 -0.335587 -0.141477  0.638933
2020-01-10 -0.407197 -0.766319  0.215650  0.520655
'''
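
The first four rows are NaN because, for an integer window, min_periods defaults to the window size, so no result is produced until five observations are available. A minimal sketch of that behaviour, using a small deterministic Series (the name `s` and its values are illustrative and not part of the original example):

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the article
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Default: min_periods equals the window size, so the first two results are NaN
strict = s.rolling(window=3).mean()

# min_periods=1: emit a result as soon as one observation is in the window
relaxed = s.rolling(window=3, min_periods=1).mean()

print(strict.tolist())   # [nan, nan, 2.0, 3.0, 4.0]
print(relaxed.tolist())  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

Lowering min_periods trades the NaN rows for early values that are averaged over fewer than `window` observations.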

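.expanding() function

  • Expanding statistics: each result is computed over all rows from the first up to the current one
  • The min_periods=n parameter is optional

Cumulative sum over all rows from the start, with results shown only from the fifth row onward (earlier rows are NaN):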

print(df.expanding(min_periods=5).sum())
'''
                   A         B         C         D
2020-01-01       NaN       NaN       NaN       NaN
2020-01-02       NaN       NaN       NaN       NaN
2020-01-03       NaN       NaN       NaN       NaN
2020-01-04       NaN       NaN       NaN       NaN
2020-01-05  5.768834 -0.395729 -3.485137  2.948331
2020-01-06  4.563552 -0.224832 -2.357932  1.238943
2020-01-07  4.144619 -2.462768 -2.460856  0.987247
2020-01-08  2.915836 -1.023792 -1.662898  3.978702
2020-01-09  3.810008 -3.321612 -2.471562  4.768633
2020-01-10  3.732851 -4.227325 -2.406888  5.551605
'''
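
Note that min_periods only masks the early rows; the accumulation still starts at the first row. For sum(), the unmasked rows therefore match what cumsum() returns. A small sketch under that reading, again with an illustrative Series `s` that is not part of the original:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the article
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Expanding sum, hidden until at least 3 observations are available
masked = s.expanding(min_periods=3).sum()

# Plain running total for comparison
running = s.cumsum()

print(masked.tolist())   # [nan, nan, 6.0, 10.0, 15.0]
print(running.tolist())  # [1.0, 3.0, 6.0, 10.0, 15.0]
```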

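.ewm() function

  • Exponentially weighted windows
  • Typically used for moving averages
  • Specify exactly one of the following parameters:
    • com: center of mass; related to span s by c = (s - 1) / 2, with alpha = 1 / (1 + com)
    • span: the so-called "N-day moving average"; alpha = 2 / (span + 1)
    • halflife: the period over which the exponential weight decays to half
    • alpha: the smoothing factor given directly, with 0 < alpha <= 1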
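
The four parameters are different ways of expressing the same smoothing factor. The sketch below converts com=0.5 into the equivalent span, halflife, and alpha and checks that all four give the same result. The Series `s` is illustrative, and the halflife conversion assumes the relation alpha = 1 - exp(-ln(2) / halflife) given in the pandas documentation:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the article
s = pd.Series([1.0, 2.0, 4.0, 8.0])

com = 0.5
alpha = 1.0 / (1.0 + com)                      # 2/3
span = 2.0 * com + 1.0                         # from c = (s - 1) / 2
halflife = np.log(0.5) / np.log(1.0 - alpha)   # weight halves after this many steps

by_com = s.ewm(com=com).mean()
by_alpha = s.ewm(alpha=alpha).mean()
by_span = s.ewm(span=span).mean()
by_halflife = s.ewm(halflife=halflife).mean()

print(np.allclose(by_com, by_alpha))     # True
print(np.allclose(by_com, by_span))      # True
print(np.allclose(by_com, by_halflife))  # True
```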

The following code computes an exponentially weighted moving average:

print(df.ewm(com=0.5).mean())
'''
                   A         B         C         D
2020-01-01  1.423760 -0.901543  0.302208 -0.066452
2020-01-02  1.375009 -0.439932 -0.425210  0.701358
2020-01-03  1.586631 -0.966333 -0.842188  0.982667
2020-01-04  0.873945  0.188287 -0.524266  0.032784
2020-01-05  0.807595  0.897647 -1.325358  0.930866
2020-01-06 -0.536166  0.412481  0.311930 -0.831721
2020-01-07 -0.457975 -1.355272  0.035234 -0.444861
2020-01-08 -0.971926  0.507844  0.543794  1.846366
2020-01-09  0.272203 -1.362694 -0.357890  1.142040
2020-01-10  0.039292 -1.058035 -0.076176  0.902657
'''
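
With the default adjust=True, each result is a weighted average of all values so far, where the value i steps back gets weight (1 - alpha)^i. With com=0.5, alpha = 2/3, so for column A on 2020-01-02 this gives (1.358759 + 1.423760 / 3) / (1 + 1/3) ≈ 1.375009, matching the output above. The sketch below reproduces the calculation by hand on an illustrative array `x` (not from the original) and compares it with pandas:

```python
import numpy as np
import pandas as pd

# Illustrative data, not from the article
x = np.array([1.0, 2.0, 4.0, 8.0])
alpha = 1.0 / (1.0 + 0.5)  # com=0.5

manual = []
for t in range(len(x)):
    # Weights for x[t], x[t-1], ..., x[0]: 1, (1-alpha), (1-alpha)^2, ...
    weights = (1.0 - alpha) ** np.arange(t + 1)
    newest_first = x[t::-1]
    manual.append(np.dot(weights, newest_first) / weights.sum())

pandas_result = pd.Series(x).ewm(com=0.5).mean().to_numpy()

print(np.allclose(manual, pandas_result))  # True
```

This also explains why the first row of the output equals the raw data: with a single observation, the only weight is 1.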
