1. Create a new project kwscrapyspider in PyCharm.
2. File -> Settings -> Python Interpreter, and install scrapy.
Open the Terminal and run:
scrapy startproject kwspider    # create the Scrapy project
cd kwspider
scrapy genspider kuwo kuwo.cn   # generate a spider (the domain limits the crawl scope)
(venv) e:\work\python\pycharmprojects\kwscrapyspider>scrapy startproject kwspider
New Scrapy project 'kwspider', using template directory 'e:\work\python\pycharmprojects\kwscrapyspider\venv\lib\site-packages\scrapy\templates\project', created in:
    e:\work\python\pycharmprojects\kwscrapyspider\kwspider

You can start your first spider with:
    cd kwspider
    scrapy genspider example example.com

(venv) e:\work\python\pycharmprojects\kwscrapyspider>cd kwspider
(venv) e:\work\python\pycharmprojects\kwscrapyspider\kwspider>scrapy genspider kuwo kuwo.cn
Created spider 'kuwo' using template 'basic' in module:
(venv) e:\work\python\pycharmprojects\kwscrapyspider>
After running these commands, the project directory structure is generated.
Open kuwo.py.
Before modification:
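The original screenshot of the directory tree is not reproduced here; a project generated by scrapy startproject plus genspider typically looks like this:

```text
kwspider/
├── scrapy.cfg
└── kwspider/
    ├── __init__.py
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── settings.py
    └── spiders/
        ├── __init__.py
        └── kuwo.py
```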
import scrapy


class KuwoSpider(scrapy.Spider):
    name = 'kuwo'
    allowed_domains = ['kuwo.cn']
    start_urls = ['']

    def parse(self, response):
        pass
After modification, parse extracts the song names from the recommendation list:

def parse(self, response):
    div_list = response.xpath("//div[@class='rec_list']//div[@class='item']")
    for item in div_list:
        # Song title
        name = item.xpath(".//p[@class='name']/span/text()").extract_first()
        print(name)
Run the project:
(venv) e:\work\python\pycharmprojects\kwscrapyspider\kwspider>scrapy crawl kuwo
每日最新單曲推薦
我買了兩本幾公尺的漫畫,另一本,將來送給你
當你放開手,遺忘在昨天
【德雲社】德雲女孩必修曲兒
【最新廣場舞】春暖花開,廣場舞跳起來
(venv) e:\work\python\pycharmprojects\kwscrapyspider\kwspider>
To also grab the cover image URL, add inside the same loop:

pic_out = item.xpath(".//div[@class='pic_out']//img/@src").extract_first()
print(pic_out)
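The two XPath expressions above can be exercised without Scrapy. Below is a minimal sketch of the same extraction logic using only the standard library's ElementTree, run against a made-up HTML fragment; the structure of the real kuwo.cn page is an assumption here, not verified.

```python
# Hypothetical fragment mimicking the assumed kuwo.cn markup.
import xml.etree.ElementTree as ET

SAMPLE = """
<div class="rec_list">
  <div class="item">
    <p class="name"><span>Song A</span></p>
    <div class="pic_out"><img src="http://example.com/a.jpg"/></div>
  </div>
  <div class="item">
    <p class="name"><span>Song B</span></p>
    <div class="pic_out"><img src="http://example.com/b.jpg"/></div>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)
results = []
# Mirrors //div[@class='rec_list']//div[@class='item']
for item in root.iter("div"):
    if item.get("class") != "item":
        continue
    # Mirrors .//p[@class='name']/span/text()
    name = item.find(".//p[@class='name']/span").text
    # Mirrors .//div[@class='pic_out']//img/@src (img is a direct child here)
    pic = item.find(".//div[@class='pic_out']/img").get("src")
    results.append((name, pic))

for name, pic in results:
    print(name, pic)
```

In the real spider, response.xpath handles this against the live page; the sketch only shows what the selectors are matching.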
Result after the modification: