A simple crawler that scrapes 100 chapters of Douluo Dalu 3

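It took me a whole evening to finally get this crawler working... Getting the tag for the next chapter was not easy: I used the select selector, and pulling the href attribute out of the a tag took a lot of effort. In a test run, scraping 100 chapters took about half a minute.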
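To make the select-then-href step concrete, here is a minimal sketch run against a made-up HTML fragment. The fragment, its link texts, and its paths are assumptions for illustration only; the `div.bottem2` class, the `content` id, and the index 3 follow the script in this post. It is written for Python 3.

```python
from bs4 import BeautifulSoup

# Made-up HTML imitating the bottom navigation bar of a chapter page (assumption).
html = """
<div id="content">Chapter text goes here.</div>
<div class="bottem2">
  <a href="/bookcase">Add to shelf</a>
  <a href="/book/1/100.html">Previous</a>
  <a href="/book/1/">Contents</a>
  <a href="/book/1/102.html">Next</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector and returns a list of matching tags;
# "div.bottem2 > a" means: <a> tags that are direct children of <div class="bottem2">.
nav_links = soup.select("div.bottem2 > a")

# A tag's attributes are read like dictionary entries.
next_href = nav_links[3]["href"]

# The chapter text sits in the element whose id is "content".
chapter_text = soup.find(id="content").get_text()

print(next_href)  # /book/1/102.html
```

Because `select()` already returns tag objects, the `href` can be read from them directly without parsing the result a second time.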

Code:

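# coding=utf-8
# Python 2 script (urllib2 and reload(sys) do not exist in Python 3).
import sys
reload(sys)
sys.setdefaultencoding('utf-8')  # so the unicode chapter text can be written without an explicit encode
import urllib2
from bs4 import BeautifulSoup

f = open('text.txt', 'wb')
url = ""  # URL of the first chapter
r = urllib2.urlopen(url).read()
soup = BeautifulSoup(r, "html.parser")
links = soup.find_all(id='content')  # element(s) holding the chapter text
page = 1
while page <= 100:  # 100 chapters
    for link in links:
        d = link.text
        f.write(d + '\n')
    # Links in the bar at the bottom of the page; the fourth one is used as the next chapter.
    lis = soup.select('div.bottem2 > a')
    url = "" + lis[3]['href']  # prefix + href of the next chapter
    r = urllib2.urlopen(url).read()
    soup = BeautifulSoup(r, "html.parser")
    links = soup.find_all(id='content')
    page = page + 1
f.close()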
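
The script builds the next URL by gluing a prefix onto the href. As a hedged alternative (not what the script above does), the standard library's `urljoin` can resolve the href against the current page's URL, which also copes with relative and absolute hrefs. The sketch below is Python 3; in Python 2 the same function lives in the `urlparse` module. The `example.com` URLs and the helper name `next_url` are hypothetical.

```python
from urllib.parse import urljoin


def next_url(current_url, href):
    """Resolve the href of a 'next chapter' link against the page it was found on."""
    return urljoin(current_url, href)


# Hypothetical chapter URL, used only for illustration.
current = "https://example.com/book/1/101.html"

print(next_url(current, "/book/1/102.html"))  # https://example.com/book/1/102.html
print(next_url(current, "102.html"))          # https://example.com/book/1/102.html
```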

