Python Web Scraper: Scraping Historical Weather Data

The source code comes first.

This time I use BeautifulSoup to parse the HTML, which is very convenient.

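import datetime
import re
import time

import pandas as pd
import requests
from bs4 import BeautifulSoup

# The header values were omitted in the original post; fill in your own
# request headers (for example a browser User-Agent) here.
headers = {}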
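def get_html(url):
    # The first request to this site is usually blocked by its anti-scraping
    # measures, so keep retrying until the response contains a table.
    while True:
        r = requests.get(url, headers=headers)
        print('Fetching data from', url)
        if 'table' in r.text:
            print('Data fetched successfully')
            return r.content
        else:
            print('Request blocked, no data returned; retrying')
            time.sleep(1)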
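def parse_html(page_content):
    soup = BeautifulSoup(page_content, features='lxml')
    table = soup.find('table')
    item_list = table.find_all('tr')
    month = list()
    # Skip the header row, then turn each remaining row into one record.
    for i in range(1, len(item_list)):
        td = item_list[i].find_all('td')
        day = list()
        # NOTE: the statements that fill `day` were lost from the original
        # listing. The appends below are a reconstruction, and the column
        # indexes 0, 1 and 3 are assumptions to check against the real page.
        # Date
        day.append(parse_date(td[0].get_text()))
        # High and low temperature
        nums = re.findall(r'-?\d+', td[2].get_text())
        day.append(nums[0])
        day.append(nums[1])
        # Weather and wind direction
        pattern = re.compile(r'\s+')
        day.append(re.sub(pattern, '', td[1].get_text()))
        day.append(re.sub(pattern, '', td[3].get_text()))
        month.append(day)
    return month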
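def parse_date(text):
    # Locate the 年 (year), 月 (month) and 日 (day) markers and slice out the
    # digits in front of each; this assumes zero-padded months and days.
    y = text.find('年')
    m = text.find('月')
    d = text.find('日')
    return datetime.date(int(text[y - 4: y]), int(text[m - 2: m]), int(text[d - 2: d]))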
def main():
    data = list()
    for year in [2016, 2017]:
        for month in range(1, 13):
            print(f'Scraping weather data for year {year}, month {month}')
            month_str = '0' + str(month) if month < 10 else str(month)
            # The base URL was left blank in the original post; prepend the
            # address of the site being scraped.
            url = '' + str(year) + month_str + '.html'
            h = get_html(url)
            data.extend(parse_html(h))
    frame = pd.DataFrame(data, columns=['date', 'low_tp', 'high_tp', 'weather', 'wind'])
    frame.to_csv('weather.csv', index=False)


if __name__ == '__main__':
    main()
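
The `while True` loop in `get_html` never gives up, so a permanently blocked or mistyped URL would hang the script. One possible variation, sketched below with only the standard library, caps the number of attempts and waits a little longer after each failure. The names `fetch_with_retry`, `fetch`, `max_attempts`, `base_delay` and `sleep` are hypothetical and not part of the original script; `fetch` stands in for any callable that returns the page text.

```python
import time


def fetch_with_retry(fetch, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) until the returned text contains 'table'.

    `fetch` is any callable that takes a URL and returns the page text
    (an assumption made so the sketch does not depend on the network).
    Raises RuntimeError after `max_attempts` unsuccessful tries.
    """
    for attempt in range(1, max_attempts + 1):
        text = fetch(url)
        if 'table' in text:
            return text
        if attempt < max_attempts:
            # Wait longer after each failure: base_delay, 2 * base_delay, ...
            sleep(base_delay * attempt)
    raise RuntimeError(f'no table found at {url} after {max_attempts} attempts')
```

With the script's own names it could be called as `fetch_with_retry(lambda u: requests.get(u, headers=headers).text, url)`. Passing `sleep` as a parameter is only a convenience for testing without real delays.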

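`parse_date` slices a fixed two characters in front of 月 and 日, which only works when months and days are zero-padded (for example 2016年01月05日). If the page ever prints 2016年1月5日, the slice picks up a non-digit character and `int` raises a `ValueError`. A regular expression is one way to accept both forms. The sketch below is an assumption about what a more tolerant parser might look like, and `DATE_PATTERN` and `parse_date_regex` are hypothetical names.

```python
import datetime
import re

# Year, month and day digits separated by the 年 / 月 / 日 markers.
DATE_PATTERN = re.compile(r'(\d{4})年(\d{1,2})月(\d{1,2})日')


def parse_date_regex(text):
    """Return the first 年/月/日 date found in text as a datetime.date."""
    match = DATE_PATTERN.search(text)
    if match is None:
        raise ValueError(f'no date found in {text!r}')
    year, month, day = (int(part) for part in match.groups())
    return datetime.date(year, month, day)
```

Because it searches rather than slices, it also tolerates extra text around the date, such as a weekday name in the same table cell.
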
