Scrapy: running multiple spiders in the same project

At first we usually assume a project has only one spider, so the usual way to run it is to create a .py file in the project:

from scrapy import cmdline
cmdline.execute('scrapy crawl spider_name'.split())

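Running several spiders is where it gets awkward. This post is just a note to help me remember the approach, taken from the original blog post.

The steps are as follows: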

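1. Create a directory with any name at the same level as spiders, for example commands. Adding an empty __init__.py is the usual way to make it importable as a package.

2. Inside it, create a crawlall.py file (the file name becomes the name of the custom command).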
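The two steps above can be sketched as a small scaffolding script. This is only an illustration: scaffold_command and its parameters are hypothetical names, and it assumes project_pkg_dir is the directory that already contains spiders.

```python
from pathlib import Path


def scaffold_command(project_pkg_dir, directory="commands", command="crawlall"):
    """Create <project_pkg_dir>/<directory>/ with __init__.py and <command>.py.

    Hypothetical helper: project_pkg_dir is assumed to be the directory
    that sits at the same level as (i.e. contains) spiders/.
    """
    commands_dir = Path(project_pkg_dir) / directory
    commands_dir.mkdir(parents=True, exist_ok=True)
    # An empty __init__.py makes the directory a regular Python package.
    (commands_dir / "__init__.py").touch()
    # The file name without .py is what gets typed after "scrapy".
    command_file = commands_dir / f"{command}.py"
    command_file.touch()  # does not overwrite an existing file
    return command_file
```

Calling scaffold_command with the project package path only creates empty files; the command class itself still has to be written into crawlall.py.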

crawlall.py

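from scrapy.commands import ScrapyCommand


class Command(ScrapyCommand):
    # The command only makes sense inside a Scrapy project.
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # spider_loader replaces the older crawler_process.spiders attribute.
        spider_list = self.crawler_process.spider_loader.list()
        for name in spider_list:
            # Schedule each spider; the options are forwarded as spider arguments.
            self.crawler_process.crawl(name, **opts.__dict__)
        # Start the reactor once, after every spider has been scheduled.
        self.crawler_process.start()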
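As an aside, the same "schedule every spider, then start once" loop can also be written as a plain script instead of a custom command. The sketch below assumes it is run from inside the project so that get_project_settings can locate the settings; run_all_spiders is a hypothetical name, not something Scrapy provides.

```python
def run_all_spiders():
    """Schedule every spider in the current project, then block until all finish."""
    # Imported here so the file can still be imported where Scrapy is absent.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())
    for name in process.spider_loader.list():
        # crawl() only schedules the spider; nothing runs yet.
        process.crawl(name)
    # Starts the Twisted reactor and returns once all spiders are done.
    process.start()
```

Because the Twisted reactor cannot be restarted, a script like this should call run_all_spiders() only once per process.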

The custom command is not finished yet: the settings.py configuration file needs one more entry.

COMMANDS_MODULE = 'project_name.directory_name'

For example, with a project named zhihuuser and a directory named commands:

COMMANDS_MODULE = 'zhihuuser.commands'

That is almost everything. To run it, cd into the project in cmd and run scrapy crawlall. Alternatively, create a .py file in the project that calls scrapy.cmdline, or call os.system('scrapy crawlall').
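The os.system option can also be written with subprocess, which returns the exit code and avoids going through a shell. This is a minimal sketch: build_command and run_custom_command are hypothetical helper names, and it assumes the scrapy executable is on PATH and that project_dir is the project root.

```python
import subprocess


def build_command(command_name="crawlall"):
    """Return the argument list for a Scrapy command (hypothetical helper)."""
    return ["scrapy", command_name]


def run_custom_command(project_dir, command_name="crawlall"):
    """Run the command from project_dir and return its exit code."""
    # cwd plays the role of "cd into the project" before running the command.
    completed = subprocess.run(build_command(command_name), cwd=project_dir)
    return completed.returncode
```

A nonzero return value from run_custom_command suggests the command failed, for example because COMMANDS_MODULE was not set and Scrapy does not recognise crawlall.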
