Pandas-16. Aggregation

2019-05-29 17:17:32

The following code is used as the running example:

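import numpy as np
import pandas as pd

# 5 rows x 4 columns of random integers in [-10, 10); the values differ on every run
df = pd.DataFrame(np.random.randint(-10,10, (5,4)),
      index = pd.date_range('1/1/2020', periods=5),
      columns = ['A', 'B', 'C', 'D'])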

Applying an aggregation to the whole DataFrame

print(df)
# Rolling sum over every column; min_periods=1 lets the first rows use shorter windows
print(df.rolling(window=3,min_periods=1).aggregate(np.sum))
'''
            A  B  C  D
2020-01-01 -6  3 -5  3
2020-01-02 -2  8 -2 -6
2020-01-03 -4 -6  3  2
2020-01-04 -4 -5  1  0
2020-01-05 -3 -9  2 -6
               A     B    C    D
2020-01-01  -6.0   3.0 -5.0  3.0
2020-01-02  -8.0  11.0 -7.0 -3.0
2020-01-03 -12.0   5.0 -4.0 -1.0
2020-01-04 -10.0  -3.0  2.0 -4.0
2020-01-05 -11.0 -20.0  6.0 -4.0
'''
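
To make the numbers above easy to verify by hand, here is a minimal sketch that rebuilds the printed frame with fixed values instead of random ones. The name `df_fixed` is introduced only for this illustration and does not appear in the original code. It also shows what happens when `min_periods` is left at its default, which for an integer window equals the window size.

```python
import numpy as np
import pandas as pd

# Fixed copy of the frame printed above (hypothetical name: df_fixed)
df_fixed = pd.DataFrame(
    [[-6, 3, -5, 3],
     [-2, 8, -2, -6],
     [-4, -6, 3, 2],
     [-4, -5, 1, 0],
     [-3, -9, 2, -6]],
    index=pd.date_range('1/1/2020', periods=5),
    columns=['A', 'B', 'C', 'D'])

rolled = df_fixed.rolling(window=3, min_periods=1).sum()

# Manual check for column A: each window holds at most the 3 most recent rows
a = df_fixed['A'].tolist()
manual_a = [float(sum(a[max(0, i - 2): i + 1])) for i in range(len(a))]
print(manual_a)                                         # [-6.0, -8.0, -12.0, -10.0, -11.0]
print(np.allclose(rolled['A'].to_numpy(), manual_a))    # True

# Without min_periods=1, rows that lack a full 3-row window become NaN
strict = df_fixed.rolling(window=3).sum()
print(strict['A'].isna().tolist())                      # [True, True, False, False, False]
```

The first two rows differ between the two variants because their windows contain only one and two observations respectively.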

Aggregating a single column of the DataFrame

print(df)
print("----------")
print(df.rolling(window=3,min_periods=1).A.aggregate(np.sum))
'''
            A  B  C  D
2020-01-01 -6  3 -5  3
2020-01-02 -2  8 -2 -6
2020-01-03 -4 -6  3  2
2020-01-04 -4 -5  1  0
2020-01-05 -3 -9  2 -6
----------
2020-01-01    -6.0
2020-01-02    -8.0
2020-01-03   -12.0
2020-01-04   -10.0
2020-01-05   -11.0
Freq: D, Name: A, dtype: float64
'''

Aggregating multiple columns

print(df)
print("----------")
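# Select several columns with a list; indexing with a bare "A","C" tuple is not reliable across pandas versions
print(df.rolling(window=3,min_periods=1)[["A","C"]].aggregate(np.sum))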
'''
            A  B  C  D
2020-01-01 -6  3 -5  3
2020-01-02 -2  8 -2 -6
2020-01-03 -4 -6  3  2
2020-01-04 -4 -5  1  0
2020-01-05 -3 -9  2 -6
----------
               A    C
2020-01-01  -6.0 -5.0
2020-01-02  -8.0 -7.0
2020-01-03 -12.0 -4.0
2020-01-04 -10.0  2.0
2020-01-05 -11.0  6.0
'''

Applying multiple functions

print(df)
print("----------")
# A list of functions produces one sub-column per function under each selected column
print(df.rolling(window=3,min_periods=1)[["A","C"]].aggregate([np.sum,np.mean]))
'''
            A  B  C  D
2020-01-01 -6  3 -5  3
2020-01-02 -2  8 -2 -6
2020-01-03 -4 -6  3  2
2020-01-04 -4 -5  1  0
2020-01-05 -3 -9  2 -6
----------
               A              C          
             sum      mean  sum      mean
2020-01-01  -6.0 -6.000000 -5.0 -5.000000
2020-01-02  -8.0 -4.000000 -7.0 -3.500000
2020-01-03 -12.0 -4.000000 -4.0 -1.333333
2020-01-04 -10.0 -3.333333  2.0  0.666667
2020-01-05 -11.0 -3.666667  6.0  2.000000
'''
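
The result above has two-level column labels (column name, then function name). As a rough sketch of how such a result can be addressed, the snippet below rebuilds the printed frame with fixed values (the names `df_fixed`, `res` and `flat` are illustrative, not from the original) and passes the function names as strings, which pandas accepts in place of `np.sum` and `np.mean`.

```python
import numpy as np
import pandas as pd

# Fixed copy of the frame printed above (hypothetical name: df_fixed)
df_fixed = pd.DataFrame(
    [[-6, 3, -5, 3],
     [-2, 8, -2, -6],
     [-4, -6, 3, 2],
     [-4, -5, 1, 0],
     [-3, -9, 2, -6]],
    index=pd.date_range('1/1/2020', periods=5),
    columns=['A', 'B', 'C', 'D'])

res = df_fixed[["A", "C"]].rolling(window=3, min_periods=1).agg(["sum", "mean"])

# Each column label is a (column, function) tuple
print(res.columns.tolist())

# Pick one aggregated series with a tuple key
a_mean = res[("A", "mean")]
print(a_mean)

# Optionally flatten the two levels into single labels such as "A_sum"
flat = res.copy()
flat.columns = ["_".join(label) for label in flat.columns]
print(flat.columns.tolist())
```

Flattening is a matter of taste; it is mainly convenient when the result is exported to a format without hierarchical headers.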

Applying different functions to different columns

print(df)
print("----------")
print(df.rolling(window=3,min_periods=1).aggregate({"A": np.sum, "C":np.mean}))
'''
            A  B  C  D
2020-01-01 -6  3 -5  3
2020-01-02 -2  8 -2 -6
2020-01-03 -4 -6  3  2
2020-01-04 -4 -5  1  0
2020-01-05 -3 -9  2 -6
----------
               A         C
2020-01-01  -6.0 -5.000000
2020-01-02  -8.0 -3.500000
2020-01-03 -12.0 -1.333333
2020-01-04 -10.0  0.666667
2020-01-05 -11.0  2.000000
'''
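
As a cross-check of the dictionary form, the sketch below (again using a fixed copy of the printed frame under the illustrative name `df_fixed`) compares it with calling `sum()` and `mean()` directly on each selected column. The function names are given as strings here, which is an alternative to passing NumPy functions.

```python
import numpy as np
import pandas as pd

# Fixed copy of the frame printed above (hypothetical name: df_fixed)
df_fixed = pd.DataFrame(
    [[-6, 3, -5, 3],
     [-2, 8, -2, -6],
     [-4, -6, 3, 2],
     [-4, -5, 1, 0],
     [-3, -9, 2, -6]],
    index=pd.date_range('1/1/2020', periods=5),
    columns=['A', 'B', 'C', 'D'])

r = df_fixed.rolling(window=3, min_periods=1)

# Dictionary form: rolling sum for A, rolling mean for C
by_name = r.agg({"A": "sum", "C": "mean"})

# The same two series computed one column at a time
direct = pd.DataFrame({"A": r["A"].sum(), "C": r["C"].mean()})

print(by_name)
print(np.allclose(by_name.to_numpy(), direct.to_numpy()))
```

Only the columns named in the dictionary appear in the result, so B and D are dropped.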
