Weather Scraper

2021-10-04 08:11:18

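Below is a program that scrapes weather data from 911 Weather (911天氣).

The scraped fields are time,, weather, temperature, humidity, wind force, wind level, precipitation, apparent temperature, and cloud cover. Recently 911 stopped providing data, so I switched to scraping ** weather instead; the corresponding article: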

import requests
from bs4 import BeautifulSoup
from collections import defaultdict
from dateutil.relativedelta import relativedelta
from datetime import datetime

class weather_data:
    def __init__(self, city, start_year, end_year, start_month=1, end_month=12):
        """
        :param city: full pinyin name of the city to scrape
        :param start_year: first year to scrape
        :param end_year: last year to scrape
        :param start_month: first month to scrape
        :param end_month: last month to scrape
        """
        self.city = city
        self.start_time = datetime.strptime(f"{start_year}-{start_month}", '%Y-%m')
        self.end_time = datetime.strptime(f"{end_year}-{end_month}", '%Y-%m')

    def _get_original_html(self):
        """
        Fetch the page
        """
        url = f""
        print(url)
        header = {}  # fill in the request headers from your own browser
        response = requests.get(url, headers=header)
        return response.content.decode("utf-8")

    def _parse_data(self):
        # parse one month at a time
        soup = BeautifulSoup(self.html, "html.parser")
        data = defaultdict(dict)
        for n, tr in enumerate(soup.find_all("tr")):
            if n == 0:
                # skip the header row
                continue
            if n % 2 != 0:
                date = tr.find("a").get_text()
                # build the per-date dictionary
                # [time,, weather, temperature, humidity, wind force, wind level, precipitation, apparent temperature, cloud cover]
                data[date]["day"] =
            else:
                data[date]["night"] =
        return data

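    def main(self):
        data = []
        # fetch and parse one month per iteration until the end month is reached
        while self.start_time <= self.end_time:
            self.html = self._get_original_html()
            data.append(self._parse_data())
            self.start_time += relativedelta(months=1)
        return data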

if __name__ == "__main__":
    t = weather_data(city="jinan", start_year=2017, end_year=2020, start_month=1, end_month=2)
    with open('weather_dict.txt', 'w', encoding='utf-8') as f:
        for line in t.main():
            f.writelines(str(line))
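The parsing step above relies on the table alternating between a daytime row and a nighttime row for each date, after one header row. The following is only a rough sketch of that idea under stated assumptions: the sample markup, the column layout, and the names `SAMPLE_HTML`, `RowCollector`, and `parse_month` are all hypothetical and not taken from the real site. It also swaps BeautifulSoup for the standard library's `html.parser`, so it runs without third-party packages.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical sample markup (an assumption, not the real page):
# one header row, then a day row and a night row for each date.
SAMPLE_HTML = """
<table>
  <tr><th>time</th><th>weather</th><th>temperature</th></tr>
  <tr><td><a href="#">2017-01-01</a></td><td>sunny</td><td>5</td></tr>
  <tr><td>night</td><td>cloudy</td><td>-2</td></tr>
  <tr><td><a href="#">2017-01-02</a></td><td>rain</td><td>4</td></tr>
  <tr><td>night</td><td>overcast</td><td>0</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list of cell strings per <tr>
        self._cell = None  # text fragments of the cell being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def parse_month(html):
    """Return {date: {"day": [...], "night": [...]}} from alternating rows."""
    collector = RowCollector()
    collector.feed(html)
    data = defaultdict(dict)
    date = None
    for n, cells in enumerate(collector.rows):
        if n == 0:
            continue  # skip the header row
        if n % 2 != 0:
            date = cells[0]  # odd rows carry the date and the daytime values
            data[date]["day"] = cells[1:]
        else:
            data[date]["night"] = cells[1:]  # even rows reuse the last date
    return dict(data)


result = parse_month(SAMPLE_HTML)
print(result["2017-01-01"])
```

Because the nighttime row reuses the `date` found in the row before it, the sketch depends on the rows arriving strictly in day/night pairs; a page with a missing row would shift every later entry.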
