Learning to use the requests library, a staple of web scraping


import requests

response = requests.get('')
print(type(response))
print(response.status_code)
print(type(response.text))
print(response.text)
print(response.cookies)
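The request URLs did not survive in the archived post, but the JSON bodies printed further down match the format returned by httpbin.org, so that service is assumed as the test endpoint in the runnable sketch below:

import requests

# Assumption: httpbin.org/get as the test endpoint; the original URL was stripped.
response = requests.get('http://httpbin.org/get')

print(type(response))        # <class 'requests.models.Response'>
print(response.status_code)  # 200 on success
print(type(response.text))   # <class 'str'>
print(response.text)         # the raw response body (JSON echoed by httpbin)
print(response.cookies)      # a RequestsCookieJar (empty for this endpoint)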

import requests

response = requests.get('')
print(response.text)

out:
{..., "headers": {...}, "origin": "120.243.219.224", "url": ""}

import requests

data = {'name': 'germey', 'age': 22}
response = requests.get('', params=data)
print(response.text)

out:
{..., "headers": {...}, "origin": "120.243.219.224", "url": "?name=germey&age=22"}

import requests

response = requests.get('?name=germey&age=22')
print(response.text)

out:
{..., "headers": {...}, "origin": "120.243.219.224", "url": "?name=germey&age=22"}
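Both forms send the same request; passing a dict via params just saves building the query string by hand. A small sketch, again assuming httpbin.org/get as the endpoint:

import requests

# Assumption: httpbin.org/get as the endpoint (the original URL was stripped).
data = {'name': 'germey', 'age': 22}
r1 = requests.get('http://httpbin.org/get', params=data)
r2 = requests.get('http://httpbin.org/get?name=germey&age=22')

# httpbin echoes the parsed query string back under "args".
print(r1.json()['args'])  # e.g. {'name': 'germey', 'age': '22'} (values arrive as strings)
print(r2.json()['args'])  # same parameters as the line above
print(r1.url)             # http://httpbin.org/get?name=germey&age=22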

# Getting cookies

_xsrf=tii9fy2iowantycfwepw1d1jtr6sickh
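The _xsrf value above is a single cookie printed as key=value. Iterating over response.cookies is how such lines are produced; a minimal sketch, where https://www.baidu.com is used purely as a placeholder for any site that sets cookies:

import requests

# Assumption: any cookie-setting site works here; baidu.com is just a placeholder.
response = requests.get('https://www.baidu.com')

print(response.cookies)  # a RequestsCookieJar
for key, value in response.cookies.items():
    print(key + '=' + value)  # prints lines like "_xsrf=..." when the site sets such a cookie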

Simulated login:

import requests

requests.get('')
response = requests.get('')
print(response.text)
# The two calls behave like requests from two different browsers, so the cookie set by the first request cannot be retrieved here
out:
{...}

import requests

# Simulating the login flow with session maintenance
s = requests.Session()
s.get('')
response = s.get('')
print(response.text)
# requests.Session() keeps the session alive across requests, so the cookie set by the previous request can be retrieved
out:
{...}
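A runnable sketch of both behaviours, assuming httpbin.org's cookie endpoints (the original URLs were stripped): /cookies/set/... stores a cookie, and /cookies echoes back whatever cookies the client sent.

import requests

# Assumption: httpbin.org endpoints; the post's own URLs did not survive.

# Without a Session: the second request carries no cookies, so "cookies" comes back empty.
requests.get('http://httpbin.org/cookies/set/number/123456789')
response = requests.get('http://httpbin.org/cookies')
print(response.text)  # {"cookies": {}}

# With a Session: the cookie set by the first request is sent along with the second one.
s = requests.Session()
s.get('http://httpbin.org/cookies/set/number/123456789')
response = s.get('http://httpbin.org/cookies')
print(response.text)  # {"cookies": {"number": "123456789"}}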

Certificate verification:

import requests

response = requests.get('')
print(response.status_code)

out:
200

import requests
from requests.packages import urllib3

urllib3.disable_warnings()  # suppress the warning messages
response = requests.get('', verify=False)  # the verify parameter controls whether the certificate is checked
print(response.status_code)

out:
200
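A sketch of the verify=False pattern against a host whose certificate fails normal validation; https://self-signed.badssl.com/ is assumed here as a public TLS test host, since the original URL was stripped:

import requests
from requests.packages import urllib3

# Assumption: https://self-signed.badssl.com/ as a host with an untrusted certificate.
url = 'https://self-signed.badssl.com/'

# With verification on (the default) this raises requests.exceptions.SSLError.
try:
    requests.get(url)
except requests.exceptions.SSLError as e:
    print('SSL error:', e)

# Turning verification off lets the request through; disable_warnings() silences
# the InsecureRequestWarning that would otherwise be printed.
urllib3.disable_warnings()
response = requests.get(url, verify=False)
print(response.status_code)  # 200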

Proxy settings:

Method 1:

import requests

proxies = {...}  # typical proxy dictionaries are shown in the sketch after Method 3
response = requests.get('', proxies=proxies)
print(response.status_code)

Method 2:

import requests

proxies = {...}
response = requests.get('', proxies=proxies)
print(response.status_code)

Method 3:

pip install requests[socks]

import requests

proxies = {...}
response = requests.get('', proxies=proxies)
print(response.status_code)
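The proxy dictionaries themselves did not survive in the post. As a sketch, the three variants are usually a plain HTTP/HTTPS proxy, a proxy that needs a username and password, and a SOCKS5 proxy (hence the requests[socks] install in Method 3); the post does not say how Methods 1 and 2 differed, and every address below is a hypothetical placeholder.

import requests

# All proxy addresses below are hypothetical placeholders.

# 1) Plain HTTP/HTTPS proxy
proxies = {
    'http': 'http://127.0.0.1:9743',
    'https': 'http://127.0.0.1:9743',
}

# 2) Proxy that requires a username and password
proxies = {
    'http': 'http://user:password@127.0.0.1:9743',
    'https': 'http://user:password@127.0.0.1:9743',
}

# 3) SOCKS5 proxy (requires: pip install requests[socks])
proxies = {
    'http': 'socks5://127.0.0.1:9742',
    'https': 'socks5://127.0.0.1:9742',
}

response = requests.get('http://httpbin.org/get', proxies=proxies)  # httpbin.org assumed as the test URL
print(response.status_code)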

Timeout setting:

import requests

response = requests.get('', timeout=1)
print(response.status_code)

out:
200
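The timeout is given in seconds and applies to both the connect and the read phase; it can also be a (connect, read) tuple when the two phases should be limited separately. A sketch, with httpbin.org assumed as the test URL:

import requests

# Assumption: httpbin.org/get as the test URL.
# A single number limits both the connect and the read phase.
response = requests.get('http://httpbin.org/get', timeout=1)
print(response.status_code)  # 200

# A (connect, read) tuple limits each phase separately.
response = requests.get('http://httpbin.org/get', timeout=(0.5, 1))
print(response.status_code)  # 200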

Exception handling:

import requests
from requests.exceptions import ReadTimeout, ConnectionError, RequestException

try:
    response = requests.get('', timeout=0.5)
    print(response.status_code)
except ReadTimeout:
    print('timeout')
except ConnectionError:
    print('connection error')
except RequestException:
    print('error')

out:
connection error
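RequestException is the base class of ReadTimeout and ConnectionError, so the more specific handlers must come first, and a single except RequestException would catch anything requests raises. A sketch that makes the timeout branch easy to trigger, assuming httpbin.org's /delay endpoint (which waits the given number of seconds before responding):

import requests
from requests.exceptions import ReadTimeout, ConnectionError, RequestException

# Assumption: httpbin.org/delay/5 as the test URL; the server waits 5 s,
# so a 1 s read timeout reliably raises ReadTimeout.
try:
    response = requests.get('http://httpbin.org/delay/5', timeout=1)
    print(response.status_code)
except ReadTimeout:
    print('timeout')            # the server took longer than 1 s to respond
except ConnectionError:
    print('connection error')   # DNS failure, refused connection, connect timeout, ...
except RequestException:
    print('error')              # any other requests-level failure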
