Python: Calling the Requests Library and the Parameters That Control Access

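Background: Web crawlers have become the main way to collect data from the internet automatically. Requests is a third-party Python module that covers everyday HTTP requests and is simple to use. This article therefore introduces how to use the Requests library.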

Requests provides the seven main functions listed below. For web crawling, the two used most often are get() and head().

requests.request(method, url, **kwargs)

requests.get(url, params=None, **kwargs)

requests.head(url, **kwargs)

requests.post(url, data=None, json=None, **kwargs)

requests.put(url, data=None, **kwargs)

requests.patch(url, data=None, **kwargs)

requests.delete(url, **kwargs)
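
The convenience functions above all forward to requests.request() with the matching HTTP method. As a minimal sketch of how the params and headers arguments shape the outgoing request, the block below builds a request and prepares it without sending it, so it runs offline. The URL, query values, and User-Agent string are placeholders chosen for illustration.

```python
import requests

# Hypothetical URL and query values, used only for illustration.
url = "https://example.com/search"
query = {"q": "python", "page": 2}

# Build the request and prepare it without sending it (no network access).
req = requests.Request(
    "GET", url, params=query, headers={"User-Agent": "demo-crawler/0.1"}
)
prepared = req.prepare()

print(prepared.method)                 # the HTTP method
print(prepared.url)                    # the URL with params encoded into the query string
print(prepared.headers["User-Agent"])  # the header that would be sent
```

Calling requests.get(url, params=query, headers=...) would prepare a request in much the same way and then send it. Other keyword arguments such as timeout apply when the request is actually sent.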

Whenever a call is made to requests.get() and friends, you are doing two major things.

First, you are constructing a Request object which will be sent off to a server to request or query some resource.

Second, a Response object is generated once Requests gets a response back from the server.

The Response object contains all of the information returned by the server and also contains the Request object you created originally.

r = requests.get(url)

This returns a Response object containing the resource returned by the server.
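
In a crawler, a bare requests.get(url) is usually wrapped with a timeout and a status check. The following is one possible sketch, not a prescribed pattern: the helper names fetch_text and fetch_headers and the 10-second default timeout are assumptions made for this example.

```python
import requests


def fetch_text(url, timeout=10):
    """Fetch a page with GET and return its text, or None if the request fails."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()              # raise HTTPError for 4xx/5xx status codes
        r.encoding = r.apparent_encoding  # use the encoding guessed from the body
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, invalid URLs, and HTTP errors.
        return None


def fetch_headers(url, timeout=10):
    """Fetch only the response headers with HEAD, or None if the request fails."""
    try:
        r = requests.head(url, timeout=timeout)
        r.raise_for_status()
        return dict(r.headers)
    except requests.RequestException:
        return None
```

A call such as fetch_text("https://example.com") returns the page text on success. head() is useful when only the headers are needed, because the server does not send the response body.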

If we want to access the headers the server sent back to us, we do this:


r.headers

However, if we want to get the headers we sent the server, we simply access the request, and then the request’s headers:


r.request.headers
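
To make the difference between the two header sets concrete, here is a small sketch that reads both from one Response object. The response is built by hand so that the example runs without network access. The helper name header_summary, the URL, and the header values are assumptions for illustration only.

```python
import requests


def header_summary(r):
    """Return the headers we sent and the headers the server sent back."""
    return {
        "sent": dict(r.request.headers),  # headers on the outgoing request
        "received": dict(r.headers),      # headers on the server's response
    }


# Stand-in for a real response, built by hand so the sketch runs offline.
# In a crawler, r would come from requests.get(url) instead.
r = requests.Response()
r.status_code = 200
r.headers["Content-Type"] = "text/html; charset=utf-8"
r.request = requests.Request(
    "GET", "https://example.com/", headers={"User-Agent": "demo-crawler/0.1"}
).prepare()

summary = header_summary(r)
print(summary["sent"]["User-Agent"])
print(summary["received"]["Content-Type"])
```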

References:

[1] China University MOOC: Python Web Crawling and Information Extraction (https://www.icourse163.org/course/BIT-1001870001)

[2] Requests: HTTP for Humans (https://requests.readthedocs.io/en/master/)

[3] Python crawler basics: using the requests library and a detailed look at its parameters (https://blog.csdn.net/weixin_45887687/article/details/106162634)
