python BeautifulSoup

2020-01-09 14:44:55

Use the get_text method of the BeautifulSoup library to extract the text content of a web page:


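#!/usr/bin/env python3

# Extract the text content from an HTML page

import requests
from bs4 import BeautifulSoup

url = 'http://www.baidu.com'
response = requests.get(url)

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(response.text, 'html.parser')
print(soup.get_text())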
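
The get_text call above returns every text node joined together, which on a real page tends to include the title, navigation labels, and a lot of blank space. As a rough sketch of how the output can be tidied, the example below uses the `separator` and `strip` arguments of get_text and removes `script` and `style` elements first. The `SAMPLE_HTML` string and the `extract_text` helper are hypothetical names introduced only for illustration, and an inline string replaces the live request so the result is reproducible.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page (an assumption for this sketch, not from the original post)
SAMPLE_HTML = """
<html>
  <head><title>Demo</title><style>p { color: red; }</style></head>
  <body>
    <h1>Heading</h1>
    <p>First paragraph.</p>
    <script>console.log("not body text");</script>
    <p>Second paragraph.</p>
  </body>
</html>
"""


def extract_text(html):
    """Return the readable text of an HTML string, one fragment per line."""
    soup = BeautifulSoup(html, "html.parser")
    # Remove elements whose contents are code or styling rather than readable text
    for tag in soup(["script", "style"]):
        tag.decompose()
    # separator joins the text fragments with newlines; strip=True trims each
    # fragment and skips those that are empty after trimming
    return soup.get_text(separator="\n", strip=True)


text = extract_text(SAMPLE_HTML)
print(text)
```

Removing `script` and `style` explicitly keeps the behaviour the same regardless of how a given bs4 release treats those elements in get_text. This is still a plain text dump rather than true main-content detection; picking out only the article body would need additional heuristics that the original snippet does not cover.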

