Scraping Autohome (汽車之家)

Scraping Autohome news

"""
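### Scrape Autohome news
import requests
from bs4 import BeautifulSoup

# Send a GET request to Autohome to fetch the page
ret = requests.get('')  # fill in the news-list page URL
# print(ret.text)

# Parse with bs4: instantiate BeautifulSoup with the text to parse and a parser name.
# html.parser: built in, slightly slower, but needs no third-party module
# lxml: somewhat faster, but must be installed first (pip install lxml)
soup = BeautifulSoup(ret.text, 'html.parser')  # pass in a string
# soup = BeautifulSoup(open('a.html', 'r'), 'html.parser')  # a file object also works

# find() returns the first match; find_all() returns every match
# Find all <li> tags on the page
li_list = soup.find_all(name='li')
for li in li_list:
    # li is a Tag object
    # print(type(li))
    h3 = li.find(name='h3')
    if not h3:
        continue
    title = h3.text  # news title
    desc = li.find(name='p').text  # news summary
    # Tag objects support subscripting because Tag overrides the __getitem__ magic method
    url = li.find(name='a')['href']  # news link
    img = li.find(name='img')['src']  # news image
    print('''
News title: %s
News summary: %s
News link: %s
News image: %s
''' % (title, desc, url, img))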
"""
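
The same find_all / find pattern can be tried offline before pointing it at the live site. The following is only a sketch: the sample markup, the example.com addresses, and the helper name parse_news_items are all made up for illustration and do not reflect Autohome's real page structure. It also guards against entries that lack a summary, link, or image, which the loop above would trip over.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<ul>
  <li>
    <a href="//example.com/news/1.html">
      <img src="//example.com/img/1.jpg">
      <h3>First headline</h3>
      <p>First summary</p>
    </a>
  </li>
  <li><span>An entry without a headline</span></li>
  <li>
    <a href="//example.com/news/2.html">
      <h3>Second headline</h3>
    </a>
  </li>
</ul>
"""


def parse_news_items(html):
    """Return one dict per <li> that contains an <h3> headline (hypothetical helper)."""
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for li in soup.find_all(name='li'):
        h3 = li.find(name='h3')
        if h3 is None:
            continue  # not a news entry, skip it
        p = li.find(name='p')
        a = li.find(name='a')
        img = li.find(name='img')
        items.append({
            'title': h3.text.strip(),
            'desc': p.text.strip() if p is not None else '',
            # Tag.get() returns None for a missing attribute instead of raising KeyError
            'url': a.get('href') if a is not None else None,
            'img': img.get('src') if img is not None else None,
        })
    return items


news = parse_news_items(SAMPLE_HTML)
for item in news:
    print(item)
```

Returning plain dicts keeps the parsing step separate from printing, so the same function could feed a CSV writer or a database insert later.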

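The comment about __getitem__ explains why a tag can be indexed like a dict. A toy class makes the mechanism visible. MiniTag below is a hypothetical stand-in written for this note, not the bs4 implementation; it only shows that defining __getitem__ is what makes obj[key] work.

```python
class MiniTag:
    """Toy stand-in for a parsed tag (hypothetical, not bs4's Tag)."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs)

    def __getitem__(self, key):
        # Python calls this for obj[key]; delegate to the attribute dict.
        return self.attrs[key]

    def get(self, key, default=None):
        # Safe lookup that does not raise when the attribute is missing.
        return self.attrs.get(key, default)


link = MiniTag('a', {'href': '/news/1.html'})
print(link['href'])       # subscript access goes through __getitem__
print(link.get('title'))  # missing attribute, so the default is returned
```

Subscript access raises KeyError for a missing attribute, so get() is the safer choice when an attribute may be absent.
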
