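This post uses Python to fetch the data and pyecharts to visualize it, drawing maps of the daily increase in cases within China and worldwide, and uses matplotlib to draw a 方寸图 (the chart name used in the original). All of the code was written in a notebook.
These are informal notes on what I have learned. I use this blog to keep a record of my articles and publish them here for readers' reference only. Author: 北山啦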
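Preface
This is no longer a fresh topic, so please go easy on me. CSDN has worn me out: the source page changed, the scraping code started raising errors, and after I fixed it the post was taken down. To get it published, I have had to change some keywords and split it into two parts.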
Import the required modules
import time
import json
import requests
from datetime import datetime
import pandas as pd
import numpy as np
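1. Obtaining the epidemic data
The data is taken from a page published by Tencent News.
For a static page, passing the URL from the address bar to a GET request is enough to retrieve the page's data. For a dynamic page, the key is to first work out how the page loads its data and navigates, and only then write the code.
Right-click the page and choose Inspect, open the Network tab, and press Ctrl+R to reload so that the page's requests are recorded.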
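Once the request behind the page has been found in the Network panel, fetching it comes down to a single GET call. The sketch below is a minimal, hedged version of that step. `fetch_json` and its `get` parameter are hypothetical names introduced here, not part of the original code: `get` defaults to `requests.get` and exists only so the function can be exercised offline with a stub. The URL and payload in the demo are made up.

```python
import json


def fetch_json(url, get=None, timeout=10):
    """Fetch `url` and decode the response body as JSON.

    `get` is any callable shaped like requests.get; it is a hypothetical
    hook added for this sketch so that a stub can be swapped in.
    """
    if get is None:
        import requests  # imported lazily; only needed for real network calls
        get = requests.get
    response = get(url, timeout=timeout)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return json.loads(response.text)


class _StubResponse:
    """Stand-in for a requests.Response, used only in the offline demo."""

    def __init__(self, text):
        self.text = text

    def raise_for_status(self):
        pass  # the stub always pretends the request succeeded


def _stub_get(url, timeout=None):
    # Made-up payload; a real endpoint returns far more fields.
    return _StubResponse('{"ret": 0, "data": {"lastUpdateTime": "placeholder"}}')


demo = fetch_json("https://example.invalid/api", get=_stub_get)
print(list(demo["data"].keys()))
```

Setting a timeout and calling `raise_for_status()` is a design choice of this sketch: when an endpoint changes or disappears, the failure then shows up as an HTTP error rather than as a confusing JSON or `KeyError` later on.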
Remember to install the third-party library; using a mirror makes the download fast.
pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple pyecharts
# Define the data-fetching functions
def Domestic():
    url = 'https://api.inews.qq.com/newsqa/v1/query/inner/publish/modules/list?modules=statisGradeCityDetail,diseaseh5Shelf'
    response = requests.get(url=url).text
    data = json.loads(response)['data']['diseaseh5Shelf']
    return data


def Oversea():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_foreign'
    response = requests.get(url=url).json()
    # The 'data' field is itself a JSON string, so it is decoded a second time
    data = json.loads(response['data'])
    return data


domestic = Domestic()
oversea = Oversea()
print(domestic.keys())
print(oversea.keys())
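The two functions decode their responses differently: `Domestic` parses the body once, while `Oversea` parses it and then calls `json.loads` again on the `data` field, which implies that field arrives as a JSON string embedded inside the outer JSON. The sketch below illustrates that double decoding on a made-up payload. `decode_nested` is a hypothetical helper, and only the nesting is modeled on the code above, not the real field contents.

```python
import json


def decode_nested(body, field="data"):
    """Decode a JSON body whose `field` value may itself be a JSON string."""
    outer = json.loads(body)        # first pass: the outer envelope
    inner = outer[field]
    if isinstance(inner, str):      # still text, so decode it once more
        inner = json.loads(inner)
    return inner


# Made-up payload: "data" holds JSON that has been serialized to a string.
payload = json.dumps(
    {"ret": 0, "data": json.dumps({"foreignList": [{"name": "Country A"}]})}
)
result = decode_nested(payload)
print(list(result.keys()))
```

The `isinstance` check lets the same helper handle both shapes, so it also works when `data` is already a nested object.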
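2. Preliminary analysis
Extract the per-region details
# Extract the per-region details
areaTree = domestic['areaTree']
# Inspect the data to analyze its structure
areaTree
Extract the details for regions outside China
# Extract the details for regions outside China
foreignList = oversea['foreignList']
# Inspect the data to analyze its structure
foreignList
This shows how the data is organized inside the JSON.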
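Printing a large nested structure in a notebook can be hard to read, so a compact outline of keys and value types may help. The sketch below is one way to produce such an outline. `describe` is a hypothetical helper, and the sample uses only the field names that the later processing code reads (`name`, `total`, `children`, `confirm`, `heal`, `dead`) with made-up values.

```python
def describe(obj, depth=2, indent=0):
    """Return lines outlining the keys and value types of nested JSON data.

    `depth` limits how many dict levels are expanded; for a list, only the
    first element is described, assuming the elements share one shape.
    """
    pad = "  " * indent
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            if depth > 1:
                lines.extend(describe(value, depth - 1, indent + 1))
    elif isinstance(obj, list) and obj:
        lines.append(f"{pad}list[{len(obj)}], first element:")
        lines.extend(describe(obj[0], depth, indent + 1))
    return lines


# Made-up sample shaped like the fields read from areaTree later on.
sample_tree = [
    {
        "name": "Country",
        "total": {"confirm": 1, "heal": 0, "dead": 0},
        "children": [
            {"name": "Province A", "total": {"confirm": 1, "heal": 0, "dead": 0}}
        ],
    }
]
outline = describe(sample_tree)
print("\n".join(outline))
```

In the notebook, calling `describe(areaTree)` or `describe(foreignList)` would give the same kind of outline for the real data.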
3. Data processing
3.1 Extracting the data for each province in China
# Address: https://beishan.blog.csdn.net/
china_data = areaTree[0]['children']
china_list = []
for a in range(len(china_data)):
    province = china_data[a]['name']
    confirm = china_data[a]['total']['confirm']
    heal = china_data[a]['total']['heal']
    dead = china_data[a]['total']['dead']
    # Currently confirmed = cumulative confirmed - recovered - deaths
    nowConfirm = confirm - heal - dead
    china_dict = {}
    china_dict['province'] = province
    china_dict['nowConfirm'] = nowConfirm
    china_list.append(china_dict)

china_data = pd.DataFrame(china_list)
china_data.to_excel("国内疫情.xlsx", index=False)  # save as an Excel file
china_data.head()
| | province | nowConfirm |
|---|---|---|
| 0 | 香港 | 323 |
| 1 | 上海 | 40 |
| 2 | 四川 | 34 |
| 3 | 台湾 | 30 |
| 4 | 广东 | 29 |
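The per-province step boils down to one formula: currently confirmed equals cumulative confirmed minus recovered minus deaths. The sketch below restates that step as a small function that iterates over the items directly rather than by index. `now_confirm_rows` is a hypothetical name, and the sample values are made up; only the field names come from the code above.

```python
def now_confirm_rows(provinces):
    """Build one {'province', 'nowConfirm'} dict per province entry."""
    rows = []
    for item in provinces:  # iterate over the items instead of range(len(...))
        total = item["total"]
        rows.append({
            "province": item["name"],
            # currently confirmed = cumulative confirmed - recovered - deaths
            "nowConfirm": total["confirm"] - total["heal"] - total["dead"],
        })
    return rows


# Made-up sample using the same field names the loop reads.
sample = [
    {"name": "Province A", "total": {"confirm": 100, "heal": 60, "dead": 5}},
    {"name": "Province B", "total": {"confirm": 50, "heal": 50, "dead": 0}},
]
rows = now_confirm_rows(sample)
print(rows)
```

Passing `rows` to `pd.DataFrame` gives a two-column frame with the same `province` and `nowConfirm` columns.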
3.2 Extracting the international data
world_data = foreignList
world_list = []
for a in range(len(world_data)):
    # Extract the fields
    country = world_data[a]['name']
    nowConfirm = world_data[a]['nowConfirm']
    confirm = world_data[a]['confirm']
    dead = world_data[a]['dead']
    heal = world_data[a]['heal']
    # Store the fields
    world_dict = {}
    world_dict['country'] = country
    world_dict['nowConfirm'] = nowConfirm
    world_dict['confirm'] = confirm
    world_dict['dead'] = dead
    world_dict['heal'] = heal
    world_list.append(world_dict)

world_data = pd.DataFrame(world_list)
world_data.to_excel("国外疫情.xlsx", index=False)  # save as an Excel file
world_data.head()
| | country | nowConfirm | confirm | dead | heal |
|---|---|---|---|---|---|
| 0 | 美国 | 7282611 | 30358880 | 552470 | 22523799 |
| 1 | 西班牙 | 193976 | 3212332 | 72910 | 2945446 |
| 2 | 法国 | 2166003 | 2405255 | 57671 | 181581 |
| 3 | 秘鲁 | 111940 | 422183 | 19408 | 290835 |
| 4 | 英国 | 90011 | 104145 | 13759 | 375 |
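Unlike the domestic data, the international records already carry a `nowConfirm` field, so it can be checked against the same formula used in 3.1. The sketch below runs that check on the five rows shown in the table; the numbers are copied from it, while `check_rows` is a hypothetical helper name.

```python
# (country, nowConfirm, confirm, dead, heal), copied from the table above
table_rows = [
    ("美国", 7282611, 30358880, 552470, 22523799),
    ("西班牙", 193976, 3212332, 72910, 2945446),
    ("法国", 2166003, 2405255, 57671, 181581),
    ("秘鲁", 111940, 422183, 19408, 290835),
    ("英国", 90011, 104145, 13759, 375),
]


def check_rows(rows):
    """Return the countries whose nowConfirm differs from confirm - dead - heal."""
    mismatches = []
    for country, now_confirm, confirm, dead, heal in rows:
        if confirm - dead - heal != now_confirm:
            mismatches.append(country)
    return mismatches


mismatches = check_rows(table_rows)
print(mismatches)  # → []
```

For these five rows the reported `nowConfirm` matches the derived value exactly; running the same check over the full `world_data` frame would show whether that holds for every country.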