Scrapy command-line tool

① startproject - global command

Creates a Scrapy project named myproject under the current path.

Syntax: scrapy startproject myproject
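The commands that follow are marked as either needing a project or not. As far as I know, Scrapy decides this by looking for a scrapy.cfg file in the current directory or one of its parents; startproject writes that file at the project root. The sketch below is a rough stand-in for that lookup using only the standard library. The function name find_project_root is hypothetical and is not part of Scrapy.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that holds scrapy.cfg.

    Hypothetical helper: it only imitates the idea of locating a project,
    it is not Scrapy's own implementation.
    """
    start = start.resolve()
    for directory in (start, *start.parents):
        if (directory / "scrapy.cfg").is_file():
            return directory
    return None  # no scrapy.cfg found: not inside a Scrapy project


if __name__ == "__main__":
    root = find_project_root(Path.cwd())
    print(root if root else "not inside a Scrapy project")
```

Run from anywhere inside myproject, this should print the project root; elsewhere it reports that no project was found, which is the situation in which project-only commands such as crawl are unavailable.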

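② genspider - requires a project

Creates a spider in the current project. This is only a shortcut for creating spiders: it generates the spider from a predefined template, and a spider file can just as well be written by hand. (Newer Scrapy releases also allow genspider to run outside a project.)

Syntax: scrapy genspider name domain.com

Here domain.com is the domain name, which sets the scope to crawl, and name is the name of the spider.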
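To make the template idea concrete, here is a minimal sketch of how a spider file could be produced from a template with name and domain filled in. The template text is my approximation of a basic spider skeleton, not a copy of Scrapy's own template file, and render_spider with its class-naming rule is a hypothetical helper.

```python
from string import Template

# Approximation of a basic spider skeleton; NOT copied from Scrapy's source.
SPIDER_TEMPLATE = Template('''import scrapy


class ${classname}(scrapy.Spider):
    name = "${name}"
    allowed_domains = ["${domain}"]
    start_urls = ["https://${domain}"]

    def parse(self, response):
        pass
''')


def render_spider(name: str, domain: str) -> str:
    """Fill the template with a spider name and a domain (hypothetical helper)."""
    # Assumed naming rule: quotes -> QuotesSpider, my_site -> MySiteSpider.
    classname = "".join(part.capitalize() for part in name.split("_")) + "Spider"
    return SPIDER_TEMPLATE.substitute(classname=classname, name=name, domain=domain)


if __name__ == "__main__":
    print(render_spider("name", "domain.com"))
```

The printed text has the same shape as a hand-written spider: a class with a name, an allowed_domains list that limits the crawl, and an empty parse method to fill in.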

③ crawl - requires a project

Starts crawling with the named spider.

Syntax: scrapy crawl name

④ check - requires a project

Runs contract checks on the project's spiders and reports any errors.

Syntax: scrapy check

⑤ list - requires a project

Lists all available spiders in the current project, printing one spider per line.

Syntax: scrapy list
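Because list prints one spider name per line, its output is easy to consume from a script. The sketch below assumes Scrapy is installed and that project_dir points at a project; both function names are hypothetical.

```python
import subprocess
from typing import List


def parse_spider_list(output: str) -> List[str]:
    """Split `scrapy list` output (one spider name per line) into a list."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_spiders(project_dir: str) -> List[str]:
    """Run `scrapy list` inside project_dir and return the spider names.

    Assumes the `scrapy` executable is on PATH and project_dir is a project.
    """
    result = subprocess.run(
        ["scrapy", "list"],
        cwd=project_dir,       # the command needs to run inside the project
        capture_output=True,
        text=True,
        check=True,            # raise if scrapy exits with an error
    )
    return parse_spider_list(result.stdout)
```

Each returned name can then be passed to scrapy crawl.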

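⑥ edit - requires a project

Opens the given spider in the editor defined by the EDITOR setting. (In practice, most people write and debug spiders in another tool, such as IDLE.)

Syntax: scrapy edit name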

⑧ view - does not require a project

Requests the URL, saves the page source to a file, and opens that file in a browser.

Syntax: scrapy view url

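⑨ shell - does not require a project

Syntax: scrapy shell url

Opens an interactive session for the given URL so that you can try out, test, or debug scraping code without starting a spider. The shell automatically creates a response object and a selector object, along with a sel shortcut (present in older Scrapy releases; newer ones use response.xpath() instead). You can then call, for example, response.body or sel.xpath().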

⑩ parse - requires a project

Fetches the given URL and parses it with the spider that handles it.

Syntax: scrapy parse url

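⑪ runspider - does not require a project

Runs a spider written in a standalone Python file, without creating a project. The difference from crawl is that runspider takes the file name including its extension, whereas crawl takes the spider name.

Syntax: scrapy runspider filename.py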

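⑫ version - does not require a project

Prints the Scrapy version. With -v it also prints the Python, Twisted, and platform information, which is useful when filing bug reports.

Syntax: scrapy version

Syntax: scrapy version -v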
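If you need similar details from inside a script, for instance to attach to a bug report, something like the following could collect them. This is only an approximation built on the standard library, not what scrapy version -v does internally; the names package_version and environment_report are hypothetical.

```python
import platform
import sys
from importlib import metadata
from typing import Dict


def package_version(name: str) -> str:
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report() -> Dict[str, str]:
    """Collect roughly the kind of information `scrapy version -v` reports."""
    return {
        "Scrapy": package_version("scrapy"),
        "Twisted": package_version("twisted"),
        "Python": sys.version.split()[0],
        "Platform": platform.platform(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

On a machine without Scrapy the first two entries read "not installed", which is itself worth knowing before filing a report.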
