Why does the file extension of the links my crawler fetches change?

from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import Rule

# Class attributes of the CrawlSpider subclass
name = 'bizhixiazai'
allowed_domains = ['netbian.com']
start_urls = ['']
rules = (
    Rule(LinkExtractor(allow=r'/index.+htm', restrict_xpaths=['//div[@class="page"]//a']), follow=True),
    Rule(LinkExtractor(allow=r'.+htm', restrict_xpaths=['//div[@class="list"]//a']), callback='parse_detail', follow=False),
)

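I have already restricted the crawl scope, so why do the crawled paths end in .html instead of .htm? The links shown on the page are all .htm links, and I cannot find any .html links there. I'm a beginner, please help.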
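One thing worth checking is whether the allow patterns really exclude .html URLs. As far as I know, LinkExtractor applies allow patterns as an unanchored regex search, so a pattern ending in htm also matches a URL ending in html. The sketch below only uses Python's re module to show that effect. The two URLs and the stricter pattern are assumptions made up for illustration, not values taken from the site, and this does not by itself explain where the .html links come from.

```python
import re

# Hypothetical URLs for illustration only; they are not from the crawl log.
urls = [
    "http://www.netbian.com/index_2.htm",
    "http://www.netbian.com/index_2.html",
]

loose = re.compile(r'/index.+htm')         # pattern used in the first rule
strict = re.compile(r'/index[^/]*\.htm$')  # assumed stricter variant, anchored at the end

def matching(pattern, candidates):
    """Return the candidates that the pattern matches anywhere (re.search)."""
    return [u for u in candidates if pattern.search(u)]

loose_hits = matching(loose, urls)    # both URLs pass the unanchored pattern
strict_hits = matching(strict, urls)  # only the .htm URL passes

print(loose_hits)
print(strict_hits)
```

If only .htm pages should be followed, escaping the dot and anchoring the pattern with $ is one way to say so explicitly.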

Here is the crawl output:

2020-12-03 14:55:06 [scrapy.core.engine] DEBUG: Crawled (200) (referer: )
2020-12-03 14:55:07 [scrapy.core.engine] DEBUG: Crawled (200) (referer: )
2020-12-03 14:55:09 [scrapy.core.engine] DEBUG: Crawled (200) (referer: )
2020-12-03 14:55:10 [scrapy.core.engine] DEBUG: Crawled (200) (referer: )
2020-12-03 14:55:12 [scrapy.core.engine] DEBUG: Crawled (200) (referer: )

This is a screenshot of the page source:

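As the screenshot shows, there are no .html links under class="page", yet the crawled .htm links were automatically turned into .html links. Why does this happen?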
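To confirm what the raw HTML contains, as opposed to what the browser shows, the hrefs inside the pagination div can be listed directly. Below is a minimal sketch using only the standard library. The class name PageLinkCollector and the sample markup are hypothetical, and it assumes reasonably well-formed HTML where non-void tags are closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class PageLinkCollector(HTMLParser):
    """Collect href values of <a> tags nested inside <div class="page">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # nesting depth inside the target div (0 means outside)
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            if tag not in VOID_TAGS:
                self.depth += 1
            if tag == "a" and attrs.get("href"):
                self.hrefs.append(attrs["href"])
        elif tag == "div" and attrs.get("class") == "page":
            # Exact class match, mirroring the XPath @class="page"
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

# Hypothetical markup for illustration; not copied from the real site.
sample = (
    '<div class="list"><a href="/desk/1.htm">x</a></div>'
    '<div class="page"><a href="/index_2.htm">2</a>'
    '<span><a href="/index_3.htm">3</a></span></div>'
    '<a href="/other.html">o</a>'
)

collector = PageLinkCollector()
collector.feed(sample)
print(collector.hrefs)
```

Inside a Scrapy callback, the same collector could be fed response.text, which makes it possible to compare the hrefs in the downloaded page with the URLs that appear in the log.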
