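Repost and reproduction: machine learning, using a support vector machine (SVM) to tell whether a web page is a list page or a detail page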

2022-08-11 18:03:49

https://mp.weixin.qq.com/s/rAwr0_jWMXagHOvhzrE9DA

https://baijiahao.baidu.com/s?id=1639719949469452687&wfr=spider&for=pc

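The goal is to have the computer do a binary classification: each page is either a list page or a detail page.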
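To make the idea of an SVM binary classifier concrete, here is a minimal sketch of a linear SVM trained by subgradient descent on the hinge loss, in plain Python. It is only an illustration and not how gerapy_auto_extractor is implemented: the two features, the toy data, and the function names `train_linear_svm` and `predict` are all hypothetical.

```python
# Illustration only: a tiny linear SVM trained with subgradient descent on the
# hinge loss. Features and data below are made up for the example.

def train_linear_svm(samples, labels, lr=0.01, reg=0.01, epochs=200):
    """Minimize reg/2 * ||w||^2 + hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Misclassified or inside the margin: hinge subgradient applies
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Safely classified: only the regularizer shrinks w
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 (detail page) or -1 (list page) from the sign of w.x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical features per page:
# (share of text inside links, share of text in the longest text block)
X = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.1, 0.9), (0.2, 0.8), (0.15, 0.7)]
y = [-1, -1, -1, 1, 1, 1]  # -1 = list page, +1 = detail page

w, b = train_linear_svm(X, y)
print(predict(w, b, (0.95, 0.05)), predict(w, b, (0.05, 0.95)))
```

A link-heavy page should land on the list side and a text-heavy page on the detail side. A real classifier would need many labeled pages and richer features than these two.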

from gerapy_auto_extractor import is_detail, is_list, probability_of_detail, probability_of_list
from gerapy_auto_extractor.helpers import content

# Load the saved detail page, then print the two probabilities and the two boolean verdicts
html = content('detail.html')
print(probability_of_detail(html), probability_of_list(html))
print(is_detail(html), is_list(html))

# Same checks for the saved list page
html = content('list.html')
print(probability_of_detail(html), probability_of_list(html))
print(is_detail(html), is_list(html))

numpy needs to be uninstalled and reinstalled.

First uninstall numpy: pip uninstall numpy

Then install it again: pip install numpy
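After reinstalling, it can help to confirm that the import actually works before running the classifier again. A minimal sketch of such a check follows; the helper name `check_import` is my own and not part of any library.

```python
import importlib


def check_import(module_name):
    """Return (True, version or None) if the module imports, else (False, error text)."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # a broken install can raise more than ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", None)


ok, info = check_import("numpy")
print("numpy import ok:" if ok else "numpy import failed:", info)
```

If the check reports a failure, the error text usually hints at whether the package is missing or the install is damaged.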

Save the web pages to be tested as detail.html and list.html, so that the code above can load them.
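Saving from the browser is enough, but the pages can also be fetched with a script. Here is a minimal sketch using only the standard library. The function name `save_page`, the User-Agent value, and the URL in the comment are assumptions for illustration. Pages rendered by JavaScript will not be captured this way.

```python
from pathlib import Path
from urllib.request import Request, urlopen


def save_page(url, filename, timeout=10):
    """Download url and write the raw bytes to filename; return the byte count."""
    # Some sites reject requests without a browser-like User-Agent (assumed value)
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        data = response.read()
    Path(filename).write_bytes(data)
    return len(data)


# Example call (hypothetical URL, not run here):
# save_page("https://example.com/some-article", "detail.html")
```

The bytes are written unchanged, so a page that is not UTF-8 encoded may need converting before it is loaded.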
