URL encoding and decoding in Python 3

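When doing web development in Python, a URL that contains Chinese characters reaches the backend server in percent-encoded form. With Python 3 we can turn the encoded text back into something readable, as follows: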

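import urllib.parse  # a bare "import urllib" does not reliably load the parse submodule

test_str = "哈哈哈"
print(test_str)
# Percent-encode the text (UTF-8 is the default encoding)
new = urllib.parse.quote(test_str)
print(new)
# Decode the percent-encoded text back to the original string
old = urllib.parse.unquote(new)
print(old)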

Running the script produces the following output:

哈哈哈
%E5%93%88%E5%93%88%E5%93%88
哈哈哈

The middle line, %E5%93%88%E5%93%88%E5%93%88, is the URL-encoded content.
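To make the connection between the string and its encoded form more concrete, here is a minimal sketch that builds the percent-encoding by hand from the UTF-8 bytes and compares it with what quote returns. The variable names (utf8_bytes, manual) are only illustrative and are not part of urllib.

```python
from urllib.parse import quote, unquote

text = "哈哈哈"

# Encode the text to UTF-8 bytes, then write each byte as %XX by hand.
# This shortcut only matches quote() here because every byte is non-ASCII;
# quote() leaves unreserved ASCII characters (letters, digits, "_.-~") alone.
utf8_bytes = text.encode("utf-8")
manual = "".join(f"%{b:02X}" for b in utf8_bytes)

print(utf8_bytes)              # the raw bytes behind the string
print(manual)                  # hand-built percent-encoding
print(manual == quote(text))   # compare with urllib's result
print(unquote(manual))         # decode back to the original text
```

Each %XX group is simply one byte of the encoded text written in hexadecimal.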

Using the PyCharm IDE, we can look at how the quote encoding function and the unquote decoding function are put together.

The quote function:

def quote(string, safe='/', encoding=None, errors=None):
    if isinstance(string, str):
        if not string:
            return string
        if encoding is None:
            encoding = 'utf-8'
        if errors is None:
            errors = 'strict'
        string = string.encode(encoding, errors)
    else:
        if encoding is not None:
            raise TypeError("quote() doesn't support 'encoding' for bytes")
        if errors is not None:
            raise TypeError("quote() doesn't support 'errors' for bytes")
    return quote_from_bytes(string, safe)
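The source above suggests two behaviours worth trying out: the safe parameter controls which characters are left unescaped, and bytes input cannot be combined with the encoding argument. The following is a small sketch of both; the sample path "/哈哈/a b" is made up for illustration.

```python
from urllib.parse import quote

path = "/哈哈/a b"  # hypothetical sample path

# By default '/' is in the safe set, so path separators survive.
default_quoted = quote(path)

# With safe="" the '/' characters are escaped as well.
strict_quoted = quote(path, safe="")

# Bytes input is quoted as-is, without any text encoding step.
bytes_quoted = quote("哈".encode("gbk"))

# Passing 'encoding' together with bytes raises TypeError, as in the source.
try:
    quote(b"abc", encoding="utf-8")
except TypeError as exc:
    error_message = str(exc)

print(default_quoted)
print(strict_quoted)
print(bytes_quoted)
print(error_message)
```

Setting safe="" is usually what you want when the value is going into a single query parameter rather than a path.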

We can see that text is encoded as UTF-8 by default. Let's switch to a different encoding and look at the output:

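import urllib.parse

test_str = "哈哈哈"
print(test_str)
# Percent-encode using GBK instead of the default UTF-8
new = urllib.parse.quote(test_str, encoding="gbk")
print(new)
# The same encoding must be given when decoding
old = urllib.parse.unquote(new, encoding="gbk")
print(old)

Output:

哈哈哈
%B9%FE%B9%FE%B9%FE
哈哈哈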

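The results also show that with UTF-8 each of these Chinese characters is encoded as three percent-prefixed bytes (%XX%XX%XX), while with GBK each one is encoded as two (%XX%XX). unquote works on the same principle, so it is not covered separately here.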
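As a rough check of the byte counts, and to see what happens when the decoder assumes a different encoding from the encoder, consider the sketch below. It relies on unquote defaulting to encoding='utf-8' and errors='replace'; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

text = "哈哈哈"

# Compare how many bytes each codec uses per character in this string.
for codec in ("utf-8", "gbk"):
    encoded = quote(text, encoding=codec)
    bytes_per_char = len(text.encode(codec)) // len(text)
    print(codec, bytes_per_char, encoded)

# Decoding with the wrong codec does not raise by default, because
# unquote() uses errors="replace"; the result is just not the original text.
gbk_encoded = quote(text, encoding="gbk")
wrong = unquote(gbk_encoded)                  # decoded as UTF-8 by default
right = unquote(gbk_encoded, encoding="gbk")  # decoded with the matching codec

print(wrong == text)
print(right == text)
```

In practice this means the backend has to know, or agree on, which encoding the client used before percent-encoding the URL.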
