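Scraping Qiushibaike and reading the jokes aloud

With some free time on my hands I got sidetracked again: I wrote a crawler so I could listen to jokes.

Well, this relies on the speech synthesis built into the Mac. I don't know about Windows, so anyone interested can look into that themselves.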

Environment

Python 2.7

macOS 10.12

Using text-to-speech

from subprocess import call
call(['say', 'hello pengge'])

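Of course, it doesn't sound all that pleasant, but after listening for a while I got used to it. It feels a bit like Baozou Manhua (rage comics) = =!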
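If the default voice grates, `say` also accepts `-v` to pick a voice and `-r` to set the speaking rate in words per minute. Below is a minimal sketch of a wrapper around it, written for Python 3 (not the 2.7 used in this post). The helper names `build_say_command` and `speak` are my own, and the voice has to be one that is installed on your machine.

```python
import shutil
import subprocess


def build_say_command(text, voice=None, rate=None):
    """Build the argument list for the macOS `say` command."""
    cmd = ['say']
    if voice:
        cmd += ['-v', voice]      # voice name, must be installed locally
    if rate:
        cmd += ['-r', str(rate)]  # speaking rate in words per minute
    cmd.append(text)
    return cmd


def speak(text, voice=None, rate=None):
    """Read text aloud; return False when `say` is missing or fails."""
    if shutil.which('say') is None:
        return False              # not on macOS, nothing to do
    return subprocess.call(build_say_command(text, voice, rate)) == 0
```

Calling `speak('hello pengge', rate=180)` should behave like the two-line version above, only a little slower or faster depending on the rate.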

Fetching the data

I used urllib2; adding a header is all it takes.

import urllib2

# Fetch all the data for one page
try:
    url = '' + str(pageindex)
    user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0)'
    # Send the User-Agent defined above as the request header
    headers = {'User-Agent': user_agent}
    request = urllib2.Request(url, headers=headers)
    response = urllib2.urlopen(request)
    return response.read().decode('utf-8')
except urllib2.URLError, e:
    if hasattr(e, "reason"):
        print '--------------------------'
        print u"Failed to connect to Qiushibaike, reason:\n", e.reason
        print '--------------------------'
        return None
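For anyone on Python 3, where urllib2 no longer exists, the same idea can be sketched with `urllib.request`. This is only a rough equivalent under my own assumptions: `build_request` and `fetch_page` are hypothetical names, and the base URL is passed in by the caller because the snippet above does not show one.

```python
import urllib.error
import urllib.request

USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0)'


def build_request(base_url, pageindex):
    """Create a request for one page with a browser-like User-Agent."""
    url = base_url + str(pageindex)
    return urllib.request.Request(url, headers={'User-Agent': USER_AGENT})


def fetch_page(base_url, pageindex):
    """Return the decoded page, or None when the connection fails."""
    request = build_request(base_url, pageindex)
    try:
        with urllib.request.urlopen(request) as response:
            return response.read().decode('utf-8')
    except urllib.error.URLError as e:
        # Same behaviour as the original: report the reason and give up
        print('Failed to connect, reason:', getattr(e, 'reason', e))
        return None
```

Keeping the request construction separate from the network call makes it easy to check the header without touching the network.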

Parsing the data

pattern = re.compile('clearfix">.*?href.*?(.*?)h2>.*?.*?(.*?)span>.*?a>(.*?)class="stats.*?class="number">(.*?)i>', re.S)
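The pattern relies on two things: non-greedy `.*?` groups, and the `re.S` flag so that `.` also matches newlines inside a joke. Here is a small sketch of that technique in Python 3. The HTML in `SAMPLE_HTML` is made up for illustration and is not the real page markup, and `STORY_PATTERN` and `parse_stories` are hypothetical names.

```python
import re

# Made-up markup, only to show how the groups are captured
SAMPLE_HTML = """
<div class="article clearfix">
  <h2>alice</h2>
  <div class="content"><span>First joke,
spanning two lines.</span></div>
  <div class="stats"><i class="number">120</i></div>
</div>
<div class="article clearfix">
  <h2>bob</h2>
  <div class="content"><span>Second joke.</span></div>
  <div class="stats"><i class="number">45</i></div>
</div>
"""

# re.S lets "." match newlines, so one joke may span several lines
STORY_PATTERN = re.compile(
    r'clearfix">.*?<h2>(.*?)</h2>'
    r'.*?<span>(.*?)</span>'
    r'.*?class="number">(.*?)</i>',
    re.S,
)


def parse_stories(html):
    """Return a list of (author, text, votes) tuples found in html."""
    stories = []
    for author, text, votes in STORY_PATTERN.findall(html):
        stories.append((author.strip(), text.strip(), int(votes)))
    return stories
```

Without `re.S` the first joke in the sample would not match, because its text contains a line break.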

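Choosing an option at startup

You choose between display mode and reading mode. In display mode each key press shows one joke; in reading mode the jokes are read aloud automatically, one after another without end. Fortunately the speech call is synchronous, which saved me a lot of time.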

print 'Scraping Qiushibaike, press q to quit'
print 'Read aloud automatically? y/n'
while True:
    # Keep asking until the answer is y or n
    answer = raw_input()
    if answer == "y" or answer == "Y":
        self.voice = True
        print 'Automatic reading selected, press Ctrl + Z to quit'
        break
    elif answer == "n" or answer == "N":
        self.voice = False
        break
    else:
        print 'Please enter y or n'
self.stop = False
self.getnewpage()
self.showstory()
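The post does not show what happens after the choice is made, so here is a rough sketch of how the two modes could be driven, in Python 3. It is a guess at the shape of the loop, not the actual `showstory` method: the function `present` and its parameters are hypothetical, and the speech and key-press functions are passed in so the loop itself stays simple.

```python
def present(stories, voice, speak, wait_for_key, show=print):
    """Present stories one by one and return how many were presented.

    voice=True  : each story is read aloud via speak(story)
    voice=False : each story is shown, then a key press is awaited
    """
    presented = 0
    for story in stories:
        if voice:
            # speak() is expected to block until the speech has finished,
            # so no extra pacing or threading is needed here
            speak(story)
        else:
            show(story)
        presented += 1
        # In display mode, entering q stops the loop
        if not voice and wait_for_key().strip().lower() == 'q':
            break
    return presented
```

On a Mac, `speak` could be something like `lambda text: subprocess.call(['say', text])`, and `wait_for_key` could simply be `input`. Because the speech call is synchronous, the reading mode is nothing more than a plain `for` loop.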

Closing remarks
