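Python web crawler: scraping new notices from an official site and sending them by email

Introduction

After the postgraduate entrance exam, I had to keep an eye on the admission notices published on the school's official site, which meant checking the site several times a day. So I wrote a web crawler to automate this: it checks whether a new notice was published today and, if so, sends it to my personal mailbox.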

The crawler consists of four main parts:

Page requests

Page parsing

Date check

Sending email

Page requests

Pages are fetched with the standard requests library:

import requests

def get_response(self, url):
    print(url)
    # Fetch the page and return the raw bytes; decoding is left to the parser
    response = requests.get(url)
    data = response.content
    return data
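Because the script is meant to run unattended, a single failed request would silently skip that day's check. The following is a minimal sketch of a retry wrapper, not part of the original code: `fetch_with_retry`, its `attempts` and `delay` parameters, and the idea of passing the fetch function in as a callable are all assumptions made for illustration.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, delay=2.0):
    """Call fetch(url) up to `attempts` times, pausing `delay` seconds between tries.

    `fetch` is any callable that returns the page bytes or raises on failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as exc:  # deliberately broad: this is only a sketch
            last_error = exc
            print(f"attempt {attempt} for {url} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)
    # Every attempt failed: re-raise the last error so the caller notices
    raise last_error
```

With requests, the callable could be `lambda u: requests.get(u, timeout=10).content`; the `timeout` argument keeps a hung connection from blocking the script indefinitely.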

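Page parsing

Start by inspecting the structure of the page to be scraped.

The first target is the title of the latest notice, which can be located through the a tag under the element with id='content'. The parsing code for this part is: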

from bs4 import BeautifulSoup

def parse_data(self, data):
    soup = BeautifulSoup(data, 'html.parser', from_encoding='gb18030')
    # The first <a> under id="content" is the most recent notice
    content = soup.find(id="content")
    new_url = self.url + content.a['href']
    print(content.a.text)
    self.notice = content.a.text
    # print(new_url)
    return new_url
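The parsing code builds the notice URL by concatenating the site URL with the link's href, which only works when the href has exactly the expected relative form. As a hedged alternative, the standard library's `urljoin` resolves relative, root-relative, and absolute hrefs alike. The helper name `build_notice_url` and the example.edu addresses below are made up for illustration.

```python
from urllib.parse import urljoin


def build_notice_url(base_url, href):
    """Resolve a link's href against the URL of the page it was found on."""
    return urljoin(base_url, href)


# Relative href: appended to the base path
print(build_notice_url("https://example.edu/notices/", "info/1001.htm"))
# Root-relative href: replaces the base path
print(build_notice_url("https://example.edu/notices/", "/info/1001.htm"))
# Absolute href: returned unchanged
print(build_notice_url("https://example.edu/notices/", "https://other.example.edu/a.htm"))
```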

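Date check

Each notice has its own HTML page, and that page carries the publication date.

If that date matches the computer's current date, the notice was published today. The date is located via class_="info-article", followed by some string slicing to isolate it. The current system time is obtained with the time library. Note that I also send an email when there is no new notice; if you don't need that, just delete the send call in the no-new-notice branch.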

import time

def get_content(self, new_url):
    data = requests.get(new_url).content
    soup = BeautifulSoup(data, 'html.parser', from_encoding='gb18030')
    # The 10-character slice is where the publication date sits in the info block
    tim = soup.find('div', class_="info-article").text[6:16]
    # print(tim)
    # %Y gives a four-digit year, matching the 10-character slice above
    current_time = time.strftime('%Y-%m-%d', time.localtime(time.time()))
    # print(current_time)
    if tim.strip() == current_time.strip():
        print("New notice today")
        self.notice = self.notice + ": new notice today"
        self.send_email(self.notice)
    else:
        print("No new notice today")
        self.notice = "No new notice today"
        # Delete the next line if no email is wanted on quiet days
        self.send_email(self.notice)
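The fixed slice `[6:16]` depends on the date always starting at the same character offset. A slightly more tolerant sketch, assuming the date appears somewhere in the text in YYYY-MM-DD form, searches for it with a regular expression and compares real date objects. The function names and the sample text are hypothetical; the actual layout of the info-article block is not shown in this post.

```python
import re
from datetime import date, datetime


def extract_date(info_text):
    """Return the first YYYY-MM-DD date found in the text, or None."""
    match = re.search(r"\d{4}-\d{2}-\d{2}", info_text)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(), "%Y-%m-%d").date()
    except ValueError:
        # Digits in the right shape but not a real calendar date
        return None


def is_published_today(info_text, today=None):
    """True when the date in the text equals today (or the supplied date)."""
    today = today or date.today()
    return extract_date(info_text) == today


sample = "Posted: 2020-03-15  Views: 120"  # made-up example text
print(extract_date(sample))
print(is_published_today(sample, today=date(2020, 3, 15)))
```

Passing `today` explicitly makes the check easy to try out without waiting for a notice to actually be published.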

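Sending email

Email is sent with the smtplib and email modules. Both ship with Python's standard library, so nothing extra needs to be installed. One point worth stressing: password is not the mailbox login password but the mailbox authorization code; for how to get one, see the guide on obtaining an authorization code.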

import smtplib
from email.header import Header
from email.mime.text import MIMEText

def send_email(self, email_body):
    from_addr = '***@qq.com'
    password = '***'  # not the login password; this is the mailbox authorization code
    # Recipient address
    to_addr = '***@qq.com'
    # Outgoing mail server
    smtp_server = 'smtp.qq.com'
    # Message body: the first argument is the content, the second the format
    # (plain means plain text), the third the encoding
    msg = MIMEText(email_body, 'plain', 'utf-8')
    # Message headers
    msg['From'] = Header(from_addr)
    msg['To'] = Header(to_addr)
    msg['Subject'] = Header('newest notice')
    # Connect to the mail server over SSL on port 465
    server = smtplib.SMTP_SSL(smtp_server, 465)
    # Log in to the sending mailbox
    server.login(from_addr, password)
    # Send the message
    server.sendmail(from_addr, to_addr, msg.as_string())
    # Close the connection
    server.quit()
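As a hedged variation on the same step, the sketch below uses `email.message.EmailMessage` and a `with` block, so the connection is closed even when login or sending fails. Splitting the work into `build_message` and `send_message` is an assumption made here to keep message construction separate from the network call; it is not how the original script is organized.

```python
import smtplib
from email.message import EmailMessage


def build_message(from_addr, to_addr, subject, body):
    """Assemble a plain-text message; non-ASCII body text is encoded as UTF-8."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, smtp_server, password, port=465):
    """Send the message over SSL; `password` is the mailbox authorization code."""
    # The with-block closes the connection even if login or sending fails
    with smtplib.SMTP_SSL(smtp_server, port) as server:
        server.login(str(msg["From"]), password)
        server.send_message(msg)
```

A call would look like `send_message(build_message(from_addr, to_addr, 'newest notice', notice), 'smtp.qq.com', password)`, with the addresses and authorization code filled in as in the function above.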

This was put together rather hastily, so please forgive any shortcomings. The complete code is available on GitHub.
