Test environment
CPU: Intel® Core™ i7-10700F; Disk: ST1000DM010-2EP102; OS: Windows 10
Test code

    import os
    import pickle
    import time

    import numpy as np
    import pandas as pd


    def read_pkl(path):
        # Load a single pickled object from disk.
        with open(path, 'rb') as f:
            return pickle.load(f)


    def read_csv(path):
        return pd.read_csv(path)


    def read_npy(path):
        return np.load(path)


    if __name__ == '__main__':
        root_path = "ceshi"
        os.makedirs(root_path, exist_ok=True)  # np.save fails if the directory is missing
        s_t = time.time()
        # Write
        for i in range(200):
            a = np.ones([1000, 1000])
            # Write npy
            np.save('ceshi/%s.npy' % i, a)
            # Write pkl
            # with open('ceshi/%s.pkl' % i, 'wb') as f:
            #     pickle.dump(a, f)
            # Write csv
            # pd.DataFrame(a).to_csv('ceshi/%s.csv' % i)
        # Read: uncomment to time reading (use the reader that matches the files in root_path)
        # for i in os.listdir(root_path):
        #     test = read_pkl(os.path.join(root_path, i))
        e_t = time.time()
        print(e_t - s_t)

Test results
File type | Write time (s) | Read time (s) |
---|---|---|
npy | 8.997 | 0.650 |
pkl | 10.918 | 1.325 |
csv | 36.795 | 10.261 |
Note: each value is the average over multiple runs. One run covers 200 files, each holding a 1000 × 1000 array.
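
The averaging over several runs can be automated rather than done by commenting lines in and out. The following is only a sketch under a few assumptions, not the script that produced the table: the `benchmark` function, the `FORMATS` table, the writer helpers, and the use of a temporary directory are all introduced here for illustration, and `time.perf_counter` replaces `time.time`. Unlike the test code, the array is created once outside the timed region.

```python
import os
import pickle
import tempfile
import time

import numpy as np
import pandas as pd


def write_npy(path, arr):
    np.save(path, arr)


def write_pkl(path, arr):
    with open(path, "wb") as f:
        pickle.dump(arr, f)


def write_csv(path, arr):
    pd.DataFrame(arr).to_csv(path)


def read_npy(path):
    return np.load(path)


def read_pkl(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def read_csv(path):
    return pd.read_csv(path)


# Hypothetical lookup table: extension -> (writer, reader).
FORMATS = {
    "npy": (write_npy, read_npy),
    "pkl": (write_pkl, read_pkl),
    "csv": (write_csv, read_csv),
}


def benchmark(ext, n_files=200, shape=(1000, 1000), repeats=3):
    """Return (mean write seconds, mean read seconds) over `repeats` runs."""
    writer, reader = FORMATS[ext]
    arr = np.ones(shape)
    write_times, read_times = [], []
    for _ in range(repeats):
        # A fresh temporary directory per run, removed automatically afterwards.
        with tempfile.TemporaryDirectory() as root:
            paths = [os.path.join(root, f"{i}.{ext}") for i in range(n_files)]

            start = time.perf_counter()
            for p in paths:
                writer(p, arr)
            write_times.append(time.perf_counter() - start)

            # Files were just written, so reads may be served from the OS cache.
            start = time.perf_counter()
            for p in paths:
                reader(p)
            read_times.append(time.perf_counter() - start)
    return sum(write_times) / repeats, sum(read_times) / repeats


if __name__ == "__main__":
    # Small sizes for a quick check; the defaults match the 200 x (1000, 1000) setup.
    for ext in FORMATS:
        w, r = benchmark(ext, n_files=5, shape=(100, 100), repeats=2)
        print(f"{ext}: write {w:.4f} s, read {r:.4f} s")
```

Calling `benchmark("npy")` with the defaults runs the same workload as the test code three times. Absolute numbers will differ from the table depending on disk, cache state, and library versions.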
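Conclusion
npy files are faster than pkl files for both writing and reading (roughly 1.2× for writes and 2× for reads in this test), but the absolute gap between the two is small. An npy file holds a single array/matrix, whereas pkl supports incremental writes and data of different lengths. csv is by far the slowest, but the files it produced were about half the size of the other formats, which can help when disk space is tight. Note that csv output is plain text, not compressed: the saving here comes from the test data, because every value is 1.0 and that takes fewer bytes as text than as an 8-byte float64. Arrays of arbitrary floats will usually produce larger csv files.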
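
Two of these points can be checked directly. The sketch below is an illustration, not code from the test: it appends arrays of different lengths to one pkl file with repeated `pickle.dump` calls and reads them back until `EOFError`, then compares npy and csv file sizes for an all-ones array and a random one. The helper names `append_pickles`, `load_all_pickles`, and `file_sizes`, and the file names, are hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd


def append_pickles(path, arrays):
    """Append each array to the same file with its own pickle.dump call."""
    for arr in arrays:
        with open(path, "ab") as f:
            pickle.dump(arr, f)


def load_all_pickles(path):
    """Read back every object stored in the file, in write order."""
    items = []
    with open(path, "rb") as f:
        while True:
            try:
                items.append(pickle.load(f))
            except EOFError:  # raised once no pickled object is left
                break
    return items


def file_sizes(arr, root):
    """Return {format: size in bytes} for one array saved as npy and as csv."""
    npy_path = os.path.join(root, "a.npy")
    csv_path = os.path.join(root, "a.csv")
    np.save(npy_path, arr)
    pd.DataFrame(arr).to_csv(csv_path)
    return {"npy": os.path.getsize(npy_path), "csv": os.path.getsize(csv_path)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        pkl_path = os.path.join(root, "stream.pkl")
        append_pickles(pkl_path, [np.ones(3), np.ones(5), np.ones((2, 4))])
        print([a.shape for a in load_all_pickles(pkl_path)])

        rng = np.random.default_rng(0)
        print("ones:  ", file_sizes(np.ones((1000, 1000)), root))
        print("random:", file_sizes(rng.random((1000, 1000)), root))
```

With all-ones data the csv file should come out smaller than the npy file, while with random floats it should come out larger, since each value then needs many more characters as text.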