Scraping the ZOL Sony camera ranking

A fun personal blog, come take a look if you don't believe it: fangzengye.com

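import requests
import re
import json
from bs4 import BeautifulSoup

def get_one_page(url):
    # The User-Agent string is missing from the source; supply a browser User-Agent here.
    user_agent = ''
    # Assumed shape: the original headers value is missing as well.
    headers = {'User-Agent': user_agent}
    # Pass headers by keyword; the second positional argument of requests.get is params.
    response = requests.get(url, headers=headers)
    return response.text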

Fetch the page content.

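def get_information(html_text):
    # The HTML tags that belonged between these pattern pieces were lost from the
    # source, so the pattern is incomplete as shown and will not match the real page.
    pattern = re.compile('shtml">(.)'
                         '.*?"rank__price">(.)'
                         '.*?(.*?)', re.S)
    items = re.findall(pattern, html_text)
    for item in items:
        # The structure the original yielded is missing; yield the raw tuple instead.
        yield item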

Regex matching.

yield assembles each match into a data structure.

findall returns the list of matches; each element is a tuple.
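As a rough sketch of the findall-plus-yield idea noted above, the snippet below runs a three-group pattern over a small made-up HTML fragment and yields one dict per match. The sample markup, the pattern, the helper name parse_items, and the field names (name, price, score) are all assumptions for illustration; they are not the real structure of the ZOL page.

```python
import re

# Hypothetical HTML fragment; the real ZOL markup is not shown in this post.
SAMPLE_HTML = '''
<li><a href="/a1.shtml">Camera A</a>
<span class="rank__price">1000</span><span class="score">9.1</span></li>
<li><a href="/b2.shtml">Camera B</a>
<span class="rank__price">2000</span><span class="score">8.7</span></li>
'''

# Three capture groups: name, price, score. re.S lets '.' match newlines.
PATTERN = re.compile(
    r'shtml">(.*?)</a>.*?"rank__price">(.*?)</span>.*?"score">(.*?)</span>',
    re.S,
)

def parse_items(html_text):
    # findall returns a list of tuples, one tuple per match
    for name, price, score in re.findall(PATTERN, html_text):
        # yield turns each tuple into a labelled record
        yield {'name': name, 'price': price, 'score': score}

items = list(parse_items(SAMPLE_HTML))
print(items)
```

Yielding dicts instead of raw tuples keeps the field names attached to the values, which makes the later JSON output easier to read.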

def recording(information):
    # Append one JSON object per line; ensure_ascii=False keeps Chinese text readable.
    with open('豆瓣top250.txt', 'a', encoding='utf-8') as f:
        f.write(json.dumps(information, ensure_ascii=False) + '\n')

Write the scraped information to a file.
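To illustrate the file format this produces, here is a minimal sketch that appends records as one JSON object per line and reads them back. The helper names append_record and read_records, the temporary file path, and the sample records are hypothetical and used only for demonstration.

```python
import json
import os
import tempfile

def append_record(path, record):
    # 'a' appends to the file; ensure_ascii=False writes non-ASCII characters as-is
    with open(path, 'a', encoding='utf-8') as f:
        f.write(json.dumps(record, ensure_ascii=False) + '\n')

def read_records(path):
    # Each non-empty line is an independent JSON document
    with open(path, encoding='utf-8') as f:
        return [json.loads(line) for line in f if line.strip()]

# Made-up sample data written to a throwaway temp directory
path = os.path.join(tempfile.mkdtemp(), 'records.txt')
append_record(path, {'name': '索尼 A7', 'price': '1000'})
append_record(path, {'name': 'Camera B', 'price': '2000'})

records = read_records(path)
with open(path, encoding='utf-8') as f:
    raw_text = f.read()
print(raw_text)
```

Because the file is opened in append mode, running the scraper twice adds the same records again, so delete the output file before a fresh run if duplicates matter.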

def main():
    for i in range(0, 1):
        # The target URL is empty in the source; fill in the ranking page URL before running.
        response = get_one_page('')
        html_text = get_information(response)
        for m in html_text:
            recording(m)
        print('Scraping page ' + str(i))
    print('Scraping finished!')

main()
