Python encoding problem 2

The code first:

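# -*- coding: utf-8 -*-
import re
import sys
import urllib2

import chardet  # third-party encoding detector

# Default encoding Python 2 uses for implicit str/unicode conversion
print sys.getdefaultencoding()

html = ''     # URL of the page to fetch (left empty in the original post)
keyword = ''  # search pattern; its definition is missing from the original post
src = urllib2.urlopen(html).read()  # raw bytes of the page, not decoded

# chardet only guesses an encoding from the bytes it is given
print chardet.detect(keyword)
print chardet.detect(src)

# Byte-level regex search for the keyword in the page
match = re.compile(keyword)
matches = match.findall(src)  # renamed from `list` to avoid shadowing the builtin
for line in matches:
    print line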

On Win7 the output is:

ascii

手機看新聞

In IDLE (Python GUI) on Windows, the default encoding is ascii (the first line of the output);
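
The ascii default matters in Python 2 because mixing a byte string with a unicode string triggers an implicit decode using that default. Below is a minimal sketch of that failure. It is written in Python 3 syntax (an assumption, so that it runs on current interpreters), with the implicit step spelled out as an explicit decode; `decode_with_default` is a hypothetical helper, not something from the original post.

```python
# Bytes for the sample string from the output above, encoded as GBK (cp936).
raw = u"手機看新聞".encode("gbk")


def decode_with_default(data, default="ascii"):
    """Hypothetical helper: mimic the implicit decode Python 2 performs."""
    try:
        return data.decode(default)
    except UnicodeDecodeError as exc:
        return "failed: " + exc.reason


# Non-ASCII bytes cannot be decoded with the ascii default.
result_ascii = decode_with_default(raw)
# Naming the codec that produced the bytes recovers the text.
result_gbk = decode_with_default(raw, "gbk")

print(result_ascii)
print(result_gbk)
```

The point of the sketch is that the default encoding only comes into play when a decode happens; reading a page and searching it as raw bytes never touches it.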

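cp936 -> cp1252, why?

The encoding of the fetched page seems to be taken from the page itself.

Why can a string detected as cp1252 find a match inside a gb2312-encoded string?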
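
One plausible explanation is that a Python 2 `str` is just a sequence of bytes, and `re` compares bytes without consulting any encoding label. The label chardet reports is only a guess, and a short keyword gives it very little to go on. The sketch below uses Python 3 syntax with explicit `bytes` (an assumption) and made-up page content, since the original post does not show the page or the keyword definition.

```python
import re

# Hypothetical sample data; only the matched text appears in the original post.
keyword_text = u"手機看新聞"
page_text = u"<a href='#'>手機看新聞</a>"

# Encode both as GBK (cp936), roughly what the Python 2 byte strings would hold.
keyword_bytes = keyword_text.encode("gbk")
page_bytes = page_text.encode("gbk")

# The regex works on raw bytes: a match needs identical byte sequences, nothing more.
matches = re.findall(re.escape(keyword_bytes), page_bytes)

# The very same bytes are also valid input for a single-byte codec such as latin-1,
# which is why a detector can plausibly label them as a Western encoding.
mojibake = keyword_bytes.decode("latin-1")

print(len(matches))
print(len(keyword_bytes), len(mojibake))
```

Under these assumptions the keyword matches because both sides contain the same bytes; what a detector calls those bytes has no effect on the search.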

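Answer

Python 2 encoding problems

Everything below refers to Python 2.x. The input and output we see are characters, which a program cannot process directly; they have to be converted into bytes, because a program can only handle byte data. Files, network transfers and so on all deal with bytes, that is, binary numbers. An isolated byte is meaningless, so...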
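
The character/byte distinction can be made concrete with a small sketch. It uses Python 3 syntax (an assumption), where `str` holds characters and `bytes` holds encoded data, and the sample character is chosen only for illustration.

```python
text = u"中"

# The same character becomes different byte sequences under different encodings.
utf8_bytes = text.encode("utf-8")  # b'\xe4\xb8\xad', three bytes
gbk_bytes = text.encode("gbk")     # b'\xd6\xd0', two bytes

# Decoding recovers the character only when the codec matches the bytes.
roundtrip_utf8 = utf8_bytes.decode("utf-8")
roundtrip_gbk = gbk_bytes.decode("gbk")

# A single byte taken out of a multi-byte sequence cannot be decoded on its own.
try:
    utf8_bytes[:1].decode("utf-8")
    lone_byte_ok = True
except UnicodeDecodeError:
    lone_byte_ok = False

print(utf8_bytes, gbk_bytes, lone_byte_ok)
```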


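Python 2 encoding problems

# -*- coding: utf-8 -*-, import sys, reload(sys), sys.setdefaultencoding('utf8'): the first line tells the interpreter to parse the source file as UTF-8, and the next three lines make the Python interpreter use utf8 whenever it decodes. That way every string is treated as utf8; if you run into a non-utf8 string you can use deco...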
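
The excerpt is cut off just as it turns to strings that are not utf8. As a different technique from changing the interpreter-wide default, input can be decoded explicitly where it enters the program. The following is a rough sketch in Python 3 syntax (an assumption); `to_text` and its list of candidate encodings are hypothetical and do not come from the excerpt.

```python
def to_text(data, encodings=("utf-8", "gbk")):
    """Hypothetical helper: decode bytes by trying each codec in order."""
    if isinstance(data, str):
        return data  # already text, nothing to decode
    for name in encodings:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")


from_utf8 = to_text(u"中文".encode("utf-8"))
from_gbk = to_text(u"中文".encode("gbk"))
print(from_utf8 == from_gbk)
```

UTF-8 is tried first because it is strict and rejects most byte sequences produced by other encodings, whereas GBK accepts many UTF-8 sequences and silently turns them into the wrong characters.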
