Python: Calling the Requests Library and Its Request-Control Parameters

2022-09-20 13:57:25

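Background: Web crawlers have become a major way to collect data from the internet automatically. Requests is a third-party Python module that covers everyday HTTP requests and is simple to use. This article introduces how to use the Requests library.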

1 The 7 main methods of the Requests library

For web crawlers, the two methods used most often are get() and head().

2 HTTP operations on resources
3 The 7 methods of the Requests library in detail
3.1 requests.request()

requests.request(method, url, **kwargs)

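  • method: the request method, one of the 7 HTTP methods GET/HEAD/POST/PUT/PATCH/DELETE/OPTIONS;
  • url: the URL of the page to fetch;
  • **kwargs: parameters that control the request, 13 in total.
    • params: a dict or byte sequence, appended to the URL as query parameters;
    • data: a dict, byte sequence, or file object, sent as the body of the request;
    • json: data in JSON format, sent as the body of the request;
    • headers: a dict of custom HTTP headers;
    • cookies: a dict or CookieJar, the cookies sent with the request;
    • auth: a tuple, enables HTTP authentication;
    • files: a dict, used to upload files;
    • timeout: the timeout, in seconds;
    • proxies: a dict that sets the proxy servers to use; login credentials can be included;
    • allow_redirects: True/False, default True; switches redirect following on or off;
    • stream: True/False, default False; when False the response body is downloaded immediately, when True the download is deferred until the content is accessed;
    • verify: True/False, default True; switches SSL certificate verification on or off;
    • cert: path to a local SSL certificate.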
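To see how a few of these arguments shape a request, the sketch below prepares a request without sending it, so the resulting URL and headers can be inspected offline. The endpoint `https://example.com/search`, the `demo-crawler/0.1` User-Agent, and the helper names `build_request` and `send_with_controls` are assumptions made for illustration; they do not come from the original article.

```python
import requests

# Hypothetical endpoint, used only for illustration.
URL = "https://example.com/search"


def build_request(method, url, **kwargs):
    """Prepare (but do not send) a request.

    Accepts the arguments that describe the request itself, such as
    params, data, json, headers, cookies, auth and files.
    """
    return requests.Request(method, url, **kwargs).prepare()


def send_with_controls(url):
    """Send a GET request with transport-level controls set explicitly."""
    return requests.request(
        "GET",
        url,
        timeout=(3.05, 10),    # (connect timeout, read timeout) in seconds
        allow_redirects=True,  # follow 3xx redirects
        stream=False,          # download the body immediately
        verify=True,           # verify the server's TLS certificate
    )


prepared = build_request(
    "GET",
    URL,
    params={"q": "python", "page": 2},
    headers={"User-Agent": "demo-crawler/0.1"},
)
print(prepared.method)                 # GET
print(prepared.url)                    # https://example.com/search?q=python&page=2
print(prepared.headers["User-Agent"])  # demo-crawler/0.1
```

`send_with_controls` is defined but not called here, because calling it performs a real network request.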
3.2 requests.get()

requests.get(url, params=None, **kwargs)
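A common way to wrap `requests.get()` in a crawler is sketched below. The helper name `fetch_text` and the 10-second default timeout are assumptions chosen for this example; the pattern itself (set a timeout, call `raise_for_status()`, catch `requests.RequestException`) uses only calls that Requests provides.

```python
import requests


def fetch_text(url, params=None, timeout=10):
    """Return the page body as text, or None if the request fails."""
    try:
        r = requests.get(url, params=params, timeout=timeout)
        r.raise_for_status()               # raise HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding   # use the encoding guessed from the body
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, invalid URLs and HTTP errors.
        return None


# A URL without a scheme is rejected before any network access,
# so this call returns None.
print(fetch_text("not-a-url"))
```

With network access, a call such as `fetch_text("https://example.com", params={"q": "python"})` would return the page text on success.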

3.3 requests.head()

requests.head(url, **kwargs)
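`head()` is useful when only the response headers are needed, since the server sends no body. The following is a minimal sketch; the helper name `fetch_headers` and the default timeout are assumptions for illustration.

```python
import requests


def fetch_headers(url, timeout=10):
    """Return the response headers for url as a plain dict, or None on failure."""
    try:
        # allow_redirects=True is passed explicitly because requests.head()
        # does not follow redirects by default.
        r = requests.head(url, timeout=timeout, allow_redirects=True)
        r.raise_for_status()
        return dict(r.headers)
    except requests.RequestException:
        return None


# An invalid URL fails before any network access, so this prints None.
print(fetch_headers("not-a-url"))
```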

3.4 requests.post()

requests.post(url, data=None, json=None, **kwargs)
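The difference between `data` and `json` can be seen by preparing two POST requests and inspecting them without sending anything. The URL and payload below are hypothetical values picked for this sketch.

```python
import json

import requests

URL = "https://example.com/api/items"  # hypothetical endpoint
payload = {"name": "alice", "lang": "python"}

# data=dict is form-encoded; json=dict is serialized as a JSON document.
form_req = requests.Request("POST", URL, data=payload).prepare()
json_req = requests.Request("POST", URL, json=payload).prepare()

print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # name=alice&lang=python
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))         # {'name': 'alice', 'lang': 'python'}
```

To actually send either request, `requests.post(URL, data=payload)` or `requests.post(URL, json=payload)` builds the same body and transmits it.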

3.5 requests.put()

requests.put(url, data=None, **kwargs)

3.6 requests.patch()

requests.patch(url, data=None, **kwargs)
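In HTTP, PUT conventionally replaces the whole resource while PATCH sends only the fields that change. The sketch below prepares one request of each kind without sending it; the URL and the record fields are hypothetical.

```python
import json

import requests

URL = "https://example.com/api/users/42"  # hypothetical endpoint

full_record = {"name": "alice", "email": "alice@example.com", "lang": "python"}
partial_update = {"email": "alice@new.example.com"}

# PUT carries the complete record; PATCH carries only the changed field.
put_req = requests.Request("PUT", URL, json=full_record).prepare()
patch_req = requests.Request("PATCH", URL, json=partial_update).prepare()

for req in (put_req, patch_req):
    # Print the method and the field names present in the request body.
    print(req.method, sorted(json.loads(req.body)))
```

The sending equivalents would be `requests.put(URL, json=full_record)` and `requests.patch(URL, json=partial_update)`, where `json` is passed through `**kwargs`.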

3.7 requests.delete()

requests.delete(url, **kwargs)

4 Request and Response Objects

Whenever a call is made to requests.get() and friends, you are doing two major things.

First, you are constructing a Request object which will be sent off to a server to request or query some resource.

Second, a Response object is generated once Requests gets a response back from the server.

The Response object contains all of the information returned by the server and also contains the Request object you created originally.

r = requests.get(url)

This returns a Response object containing the server's resource.

If we want to access the headers the server sent back to us, we do this:

r.headers

However, if we want to get the headers we sent the server, we simply access the request, and then the request’s headers:

r.request.headers
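The two header collections can be put side by side as in the sketch below. To keep the example runnable offline, the Response is assembled by hand as a stand-in; in normal use `r` would come from `requests.get(url)`. The helper name `header_summary` and the header values are assumptions for illustration.

```python
import requests


def header_summary(r):
    """Return the headers sent and the headers received for a Response."""
    return {
        "sent": dict(r.request.headers),
        "received": dict(r.headers),
    }


# Offline stand-in for a real exchange: a Response assembled by hand.
r = requests.Response()
r.status_code = 200
r.headers["Content-Type"] = "text/html"
r.request = requests.Request(
    "GET", "https://example.com", headers={"User-Agent": "demo-crawler/0.1"}
).prepare()

summary = header_summary(r)
print(summary["sent"])      # headers attached to the outgoing request
print(summary["received"])  # headers returned by the server
```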
5 Attributes of the Response object
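A few Response attributes that crawlers commonly read are `status_code`, `ok`, `encoding`, `content` (the body as bytes) and `text` (the body decoded to a string). The sketch below collects them into a dict. The helper name `summarize` is hypothetical, and the Response is built by hand from an in-memory byte stream so the example runs without network access; a real Response would come from `requests.get(url)`.

```python
import io

import requests


def summarize(r):
    """Collect a few commonly used Response attributes into a dict."""
    return {
        "status_code": r.status_code,    # HTTP status, e.g. 200 or 404
        "ok": r.ok,                      # True when the status is below 400
        "encoding": r.encoding,          # encoding used to decode r.text
        "size_bytes": len(r.content),    # raw body as bytes
        "preview": r.text[:20],          # decoded body, first 20 characters
    }


# Offline stand-in for a server reply.
r = requests.Response()
r.status_code = 200
r.encoding = "utf-8"
r.raw = io.BytesIO("<html>héllo</html>".encode("utf-8"))

info = summarize(r)
print(info)
```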

References:

[1] China University MOOC: Python Web Crawling and Information Extraction (https://www.icourse163.org/course/BIT-1001870001)

[2] Requests: HTTP for Humans (https://requests.readthedocs.io/en/master/)

[3] Python crawler basics: using the requests library, with its parameters explained (https://blog.csdn.net/weixin_45887687/article/details/106162634)
