Web Scraping Basics and a Summary of Common requests Methods

I. What the browser's Disable cache and Preserve log options do

II. Garbled characters in a copied URL

from urllib.parse import urlencode
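
A URL copied from the address bar often looks garbled because non-ASCII characters are percent-encoded. The following is a minimal sketch of converting in both directions with urllib.parse; the keyword and the parameter names wd and page are made up for illustration and do not come from the original post.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode

# Hypothetical query parameters; "wd" and "page" are illustrative names.
params = {"wd": "爬蟲", "page": 1}

# urlencode percent-encodes each value (UTF-8 by default) and joins pairs with "&".
query = urlencode(params)
print(query)

# quote/unquote handle a single component rather than a whole mapping.
encoded = quote("爬蟲")
decoded = unquote(encoded)
print(encoded, decoded)

# parse_qs reverses urlencode; every value comes back as a list of strings.
recovered = parse_qs(query)
print(recovered)
```

Running unquote on a copied URL is usually enough to make it readable again.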

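III. Making a request with requests

import requests

res = requests.get(url)
print(res)              # a Response object
print(res.text)         # the body decoded as text
print(res.content)      # the body as raw bytes
res.cookies             # returns a cookies object
res.cookies.get_dict()  # returns the cookies as a dict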

IV. Browser error codes

404: resource not found (400 means a bad request); 500: server error; 200: success
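
As a quick reference, the standard library can look up the official phrase for a status code. This is a small sketch; the helper name describe is an assumption made for illustration.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return 'code phrase' for a standard HTTP status code."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


for code in (200, 400, 404, 500):
    print(describe(code))
```

With requests, the same number is available as res.status_code on the response object.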

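V. Parameters of requests.get / requests.post

requests.get(url, headers=..., params=..., json=..., data=..., cookies=..., allow_redirects=True, cert=("/path/server.crt", "/path/key"))

url: the address to request
headers: the request headers
params: query-string parameters
json: a body that requests serializes to JSON for you
data: form data; to send JSON through data, serialize it with json.dumps first
cookies: the cookie data
allow_redirects=True: whether the request may follow redirects
cert: the client certificate information, given as ("/path/server.crt", "/path/key"); the file name is your own choice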
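
To see what each parameter does to the outgoing request, one option is to prepare a request without sending it. This is a sketch only: the endpoint, header value, and payload below are assumptions made for illustration, and nothing touches the network.

```python
import json

import requests

# Hypothetical endpoint; nothing is sent because the requests are only prepared.
URL = "https://example.com/api"

# params are appended to the URL as a query string.
get_req = requests.Request(
    "GET", URL, params={"q": "python"}, headers={"User-Agent": "demo-client"}
).prepare()
print(get_req.url)

# data given as a dict becomes a form-encoded body.
form_req = requests.Request("POST", URL, data={"name": "alice"}).prepare()
print(form_req.headers["Content-Type"], form_req.body)

# json is serialized for you and sets the JSON content type.
json_req = requests.Request("POST", URL, json={"name": "alice"}).prepare()
print(json_req.headers["Content-Type"], json_req.body)

# A pre-serialized string passed through data is sent as-is,
# so the Content-Type header has to be set by hand.
raw_req = requests.Request(
    "POST",
    URL,
    data=json.dumps({"name": "alice"}),
    headers={"Content-Type": "application/json"},
).prepare()
print(raw_req.body)
```

The last two requests carry the same JSON payload, which is why json= is usually the shorter route when the server expects JSON.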

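VII. Handling HTTPS SSL errors when sending requests

Method 1

import requests
response = requests.get("", verify=False)  # verify=False skips certificate verification
print(response.text)

Drawback: a warning is still printed.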

Method 2 (improved)

import urllib3
import requests
urllib3.disable_warnings()  # silence the warning raised for unverified HTTPS requests
response = requests.get("", verify=False)
print(response.text)

VIII. Using a proxy IP with requests

1. Sending HTTP/HTTPS requests through a proxy IP

import requests
res=requests.get(url,proxies=)
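
The proxies argument takes a dict that maps a URL scheme to a proxy URL. The sketch below assumes a proxy listening on 127.0.0.1:8888, an address invented for illustration; fetch_via_proxy is likewise a hypothetical helper and is not called here, so no network access happens.

```python
import requests
from requests.utils import select_proxy

# Hypothetical proxy address; replace it with a proxy you actually control.
PROXIES = {
    "http": "http://127.0.0.1:8888",
    "https": "http://127.0.0.1:8888",
}


def fetch_via_proxy(url: str, timeout: float = 10.0) -> requests.Response:
    """Send a GET request through the proxy that matches the URL scheme."""
    return requests.get(url, proxies=PROXIES, timeout=timeout)


# select_proxy shows which entry requests would pick, without any network I/O.
chosen = select_proxy("https://example.com/", PROXIES)
print(chosen)
```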

2. Sending requests over SOCKS proxies

import requests
res=requests.get(url,proxies=)
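
SOCKS support in requests depends on the optional PySocks package (installed with pip install "requests[socks]"). The proxies dict then uses a socks5:// or socks5h:// URL. A minimal sketch, where the helper socks_proxies and the address 127.0.0.1:1080 are assumptions made for illustration:

```python
def socks_proxies(host: str, port: int, remote_dns: bool = True) -> dict:
    """Build a proxies mapping that routes both schemes through a SOCKS5 proxy."""
    # socks5h lets the proxy resolve hostnames; socks5 resolves them locally.
    scheme = "socks5h" if remote_dns else "socks5"
    proxy_url = f"{scheme}://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical local SOCKS5 proxy.
proxies = socks_proxies("127.0.0.1", 1080)
print(proxies)
# With PySocks installed: requests.get(url, proxies=proxies)
```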

IX. Using requests.auth

import requests
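
The original snippet stops at the import, so here is a hedged sketch of one common case, HTTP Basic authentication with requests.auth.HTTPBasicAuth. The URL and credentials are invented for illustration, and the request is only prepared, never sent.

```python
import requests
from requests.auth import HTTPBasicAuth

# Hypothetical URL and credentials; the request is prepared but never sent.
req = requests.Request(
    "GET", "https://example.com/private", auth=HTTPBasicAuth("user", "secret")
).prepare()

# HTTPBasicAuth adds an Authorization header: "Basic <base64 of user:password>".
header = req.headers["Authorization"]
print(header)
```

Passing a plain tuple, auth=("user", "secret"), is shorthand for the same Basic scheme.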

X. Uploading files with requests

import requests
files=
response=requests.post(url,files=files)
print(response.status_code)
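
The files argument maps a form field name to a file object, or to a (filename, file object, content type) tuple. The sketch below shows one possible shape; the field name, file name, and upload URL are all assumptions, and an in-memory file stands in for a real one so that nothing is read from disk or sent.

```python
import io

import requests

# Hypothetical field name, file name, and upload URL; io.BytesIO stands in
# for a file opened in binary mode.
files = {"file": ("hello.txt", io.BytesIO(b"hello world"), "text/plain")}

# Preparing the request shows the multipart body without sending anything.
req = requests.Request("POST", "https://example.com/upload", files=files).prepare()
print(req.headers["Content-Type"])
print(len(req.body), "bytes in the multipart body")
```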
