Scrapy Study Notes: settings

Some configuration options in settings

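# Name of your crawler project
BOT_NAME
# List of modules where Scrapy looks for the spiders you have written
SPIDER_MODULES
# Module in which newly generated spiders are created
NEWSPIDER_MODULE
# Whether to obey the site's robots.txt when crawling
ROBOTSTXT_OBEY
# User-Agent string that identifies your requests
USER_AGENT
# Whether to enable cookies
COOKIES_ENABLED
# Whether to enable the telnet console
TELNETCONSOLE_ENABLED
# Default request headers
DEFAULT_REQUEST_HEADERS =
# Conditions that close the spider: number of items scraped, number of pages crawled,
# elapsed time in seconds, and number of errors
CLOSESPIDER_ITEMCOUNT = 10
CLOSESPIDER_PAGECOUNT = 10
CLOSESPIDER_TIMEOUT = 10
CLOSESPIDER_ERRORCOUNT = 10
# Maximum crawl depth
DEPTH_LIMIT = 3
# Spider and downloader middlewares; each is a dict, and the lower the value, the higher the priority
SPIDER_MIDDLEWARES =
DOWNLOADER_MIDDLEWARES =
# Extensions
EXTENSIONS =
# Item pipelines; priorities work the same way as for middlewares
ITEM_PIPELINES =
FILES_STORE = 'files'
IMAGES_STORE = 'images'
# Expiration time, in days
FILES_EXPIRES = 90
IMAGES_EXPIRES = 30
# Thumbnails to generate
IMAGES_THUMBS =
# Filter: skip images smaller than these dimensions
IMAGES_MIN_HEIGHT = 110
IMAGES_MIN_WIDTH = 110
# Scrapy's AutoThrottle extension
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5
AUTOTHROTTLE_MAX_DELAY = 60
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
AUTOTHROTTLE_DEBUG = False
# Maximum number of concurrent requests Scrapy performs; the default is 16
CONCURRENT_REQUESTS = 16
# Download timeout, in seconds
DOWNLOAD_TIMEOUT = 10
# Delay between requests to the same site, in seconds
DOWNLOAD_DELAY = 1.5
# Maximum number of concurrent requests to a single site. Only one of the two is used:
# when the per-IP value is non-zero, the per-domain value has no effect
CONCURRENT_REQUESTS_PER_DOMAIN = 16
CONCURRENT_REQUESTS_PER_IP = 16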
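
The priority rule for middlewares and pipelines, and the per-IP versus per-domain rule described in the comments above, can be sketched in plain Python. This is only an illustration, not Scrapy's own implementation: the `myproject` paths and the helper functions `execution_order` and `effective_concurrency_limit` are hypothetical names made up for this example. The sketch also assumes that assigning `None` to a component disables it, which is how Scrapy's documentation describes turning off a built-in middleware.

```python
# Hypothetical dict-valued settings. "myproject" and its class paths are
# placeholders invented for this sketch; they are not from the original notes.
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.ProxyMiddleware": 543,
    "myproject.middlewares.RetryLoggingMiddleware": 120,
    # Assigning None disables a component (here a built-in middleware).
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
}

ITEM_PIPELINES = {
    "myproject.pipelines.StorePipeline": 800,
    "myproject.pipelines.CleanPipeline": 300,
}


def execution_order(components):
    """Return the enabled component paths, lowest order value first."""
    enabled = {path: order for path, order in components.items() if order is not None}
    return sorted(enabled, key=enabled.get)


def effective_concurrency_limit(per_domain, per_ip):
    """Apply the rule from the notes: a non-zero per-IP limit overrides per-domain."""
    if per_ip:
        return ("ip", per_ip)
    return ("domain", per_domain)


if __name__ == "__main__":
    print(execution_order(DOWNLOADER_MIDDLEWARES))
    print(execution_order(ITEM_PIPELINES))
    print(effective_concurrency_limit(per_domain=16, per_ip=16))
    print(effective_concurrency_limit(per_domain=16, per_ip=0))
```

With the values from the listing (both limits set to 16), the per-IP limit is the one that applies. Setting `CONCURRENT_REQUESTS_PER_IP` to 0 would make the per-domain limit apply instead.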
