Learning Python web scraping: fetching weather information from a web page

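Today I learned how to write a crawler in Python and used it to scrape the weather forecast for Hangzhou from the China Weather website (中國天氣網). The program uses the urllib library and bs4. bs4 provides parsing features designed specifically for HTML, which makes it much more convenient than using re.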
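To illustrate why bs4 is more convenient than re for this kind of task, here is a minimal sketch that parses a small hand-written HTML fragment without touching the network. The fragment, its tag layout, and the helper name `parse_days` are made up for illustration and are not the real page's markup; only `BeautifulSoup`, `find`, `find_all`, and `.string` are actual bs4 features.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment invented for this example (not the real page markup).
SAMPLE_HTML = """
<div>
  <ul>
    <li><h1>Day 1</h1><p>light rain</p><p><span>15</span>/<i>12</i></p></li>
    <li><h1>Day 2</h1><p>cloudy</p><p><i>10</i></p></li>
  </ul>
</div>
"""


def parse_days(html_text):
    """Return one [date, weather, high, low] row per <li> element."""
    soup = BeautifulSoup(html_text, "html.parser")
    rows = []
    for li in soup.find_all("li"):
        paragraphs = li.find_all("p")
        # The <span> holding the high value may be absent for some days.
        span = paragraphs[1].find("span")
        high = span.string if span is not None else None
        low = paragraphs[1].find("i").string
        rows.append([li.find("h1").string, paragraphs[0].string, high, low])
    return rows


rows = parse_days(SAMPLE_HTML)
print(rows)
```

Each lookup names the tag it wants, so a missing element shows up as `None` and can be handled explicitly, which is harder to do cleanly with a single regular expression.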

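# coding: utf-8
import csv
import urllib.request

from bs4 import BeautifulSoup


def get_html(url):
    """Download the page at url and return its raw content."""
    html = urllib.request.urlopen(url)
    return html.read()


def get_data(html_text):
    """Extract one [date, weather, highest, lowest] row per forecast day."""
    final = []
    bs = BeautifulSoup(html_text, "html.parser")
    body = bs.body
    data = body.find('div', )  # the attribute filter for this div is missing in the source
    ul = data.find('ul')
    li = ul.find_all('li')
    for day in li:
        # Rows follow the column order of the CSV output:
        # date, weather, highest temperature, lowest temperature
        temp = []
        date = day.find('h1').string
        temp.append(date)
        inf = day.find_all('p')
        temp.append(inf[0].string)  # weather description
        if inf[1].find('span') is None:
            # No highest temperature is listed for this day
            temperature_highest = None
        else:
            temperature_highest = inf[1].find('span').string
            temperature_highest = temperature_highest.replace('c', '')
        temperature_lowest = inf[1].find('i').string
        temperature_lowest = temperature_lowest.replace('c', '')
        temp.append(temperature_highest)
        temp.append(temperature_lowest)
        final.append(temp)
    return final


def write_data(data, name):
    """Append the rows to the CSV file called name."""
    file_name = name
    with open(file_name, 'a', newline='', encoding='utf-8') as f:
        f_csv = csv.writer(f)
        f_csv.writerows(data)


if __name__ == '__main__':
    html_doc = get_html('')  # the page URL is missing in the source
    result = get_data(html_doc)
    write_data(result, 'weather.csv')
    print(result)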

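The results are saved to a CSV file, one row per day (date, weather, highest temperature, lowest temperature), as follows: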

28日（今天）,小雨,,13℃
29日（明天）,小雨轉陰,15℃,12℃
30日（後天）,多雲,19℃,14℃
31日（周一）,小雨,16℃,14℃
1日（周二）,陰轉多雲,16℃,10℃
2日（週三）,多雲轉晴,17℃,10℃
3日（周四）,多雲轉晴,18℃,11℃
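Once the rows are in a CSV file, they can be read back for further processing. The sketch below is one possible way to do that with the standard csv module; the helper names `to_celsius` and `load_rows` are hypothetical, and the two sample rows are copied from the output above so the example runs without the file. With the real file, `open('weather.csv', newline='', encoding='utf-8')` would take the place of the in-memory string.

```python
import csv
import io

# Two rows copied from the output above, used here instead of weather.csv.
SAMPLE_CSV = "28日（今天）,小雨,,13℃\n29日（明天）,小雨轉陰,15℃,12℃\n"


def to_celsius(text):
    """Convert a string such as '13℃' to an int; an empty cell becomes None."""
    text = text.strip().replace("℃", "")
    return int(text) if text else None


def load_rows(f):
    """Yield (date, weather, high, low) tuples from a CSV file object."""
    for date, weather, high, low in csv.reader(f):
        yield date, weather, to_celsius(high), to_celsius(low)


rows = list(load_rows(io.StringIO(SAMPLE_CSV)))
print(rows)
```

The empty third field in the first row reflects the case where no highest temperature was listed for that day, so it is mapped to `None` rather than to zero.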
