String encoding issues in Python 2.7

【How to compute the length of a Chinese string】

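s = u"我的"

len(s)  # 2

print(s.encode('utf-8'))  # the length of a unicode string is counted in characters; before printing, encode it to a non-unicode (byte) string unless the output encoding is known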
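The difference between counting characters and counting bytes can be checked directly. The following is a minimal sketch, written so that it should run on both Python 2.7 and Python 3; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
s = u"我的"                      # unicode string holding 2 characters
encoded = s.encode("utf-8")      # byte string; each of these characters takes 3 bytes in UTF-8

char_count = len(s)              # length of the unicode string, in characters
byte_count = len(encoded)        # length of the encoded string, in bytes

# Decoding the bytes with the same codec gives back the original unicode string.
round_trip = encoded.decode("utf-8")
```

So len() only matches the number of visible characters when it is applied to the unicode form, not to the encoded bytes.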

【Unicode characters】

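unichr(11)  # converts the decimal code point 11 to the unicode character u'\x0b'; in the debugger it shows up as blank space when it is a dict value, and as u'\x0b' when it is a dict key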
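A short sketch of why this character looks like a blank: code point 11 is the vertical tab, a whitespace control character. The fallback from unichr to chr is an addition here so that the snippet also runs on Python 3; the dict is purely illustrative.

```python
# -*- coding: utf-8 -*-
try:
    unichr            # built in on Python 2
except NameError:
    unichr = chr      # on Python 3, chr() already returns a unicode character

ch = unichr(11)       # vertical tab, written u'\x0b' in source code
lookup = {ch: "value stored under the control character"}

is_blank = ch.isspace()          # vertical tab counts as whitespace
found = lookup[u"\x0b"]          # the escaped literal refers to the same key
```

The raw character prints as whitespace, while repr() of the key shows the escaped form, which matches what the debugger displays.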

【Printing Chinese】

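print str(u'你們')  # error: str() implicitly encodes unicode as ASCII and raises UnicodeEncodeError; str() is meant for non-unicode strings, e.g. str('你們')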

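print u'你們'  # fails outside a console (for example when output is piped or redirected): print falls back to ASCII to encode unicode, so encode as UTF-8 first, i.e. print(u'你們'.encode('utf8'))

>>> print u'你們'  # works in a console: when stdout is a terminal, Python takes sys.stdout.encoding from the terminal's locale (UTF-8 here) and print uses it to encode the text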
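The console versus non-console behaviour can be sketched as follows. The helper to_output_bytes and the class PipeLike are hypothetical names introduced for illustration; the assumption is that a stream without a usable encoding attribute should get UTF-8 output.

```python
# -*- coding: utf-8 -*-
import sys

def to_output_bytes(text, stream=None, fallback="utf-8"):
    """Hypothetical helper: encode unicode text with the stream's encoding,
    or with the fallback codec when the stream does not report one."""
    stream = sys.stdout if stream is None else stream
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding, "replace")

class PipeLike(object):
    # Stand-in for sys.stdout under Python 2 when output is piped: no encoding set.
    encoding = None

data = to_output_bytes(u"你們", PipeLike())

# Encoding with ASCII, which Python 2 falls back to in that situation, fails.
try:
    u"你們".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Under Python 2 the resulting bytes could then be passed to print or sys.stdout.write without triggering the implicit ASCII encode.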

【How to print unicode directly in **】

print u'你們' + unicode('我的', encoding='utf8')  # a Chinese byte string must be converted to unicode before it is concatenated with unicode

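Alternatively:

import sys

reload(sys)  # restores sys.setdefaultencoding, which is removed from sys at startup

sys.setdefaultencoding("utf-8")  # Python 2 encodes and decodes implicitly with ASCII by default, regardless of the "coding: utf-8" header

print u'你們' + '我的'  # mixed concatenation now works: Python implicitly decodes '我的' as UTF-8 into unicode, then encodes the combined result for output

So in Python 2.7, setdefaultencoding() can avoid many encoding errors. Bear in mind that it changes implicit conversions for the whole process and is generally discouraged; explicit decode() and encode() calls are the safer route.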
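The explicit alternative to changing the default encoding can be sketched like this. The helper to_unicode is a hypothetical name, and UTF-8 is assumed as the encoding of incoming byte strings; the snippet is written to run on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
def to_unicode(value, encoding="utf-8"):
    """Hypothetical helper: return value as unicode, decoding byte strings
    with the given encoding and passing unicode through unchanged."""
    if isinstance(value, bytes):      # bytes is an alias of str on Python 2
        return value.decode(encoding)
    return value

raw = u"我的".encode("utf-8")         # stands in for a byte string such as '我的' in Python 2
joined = u"你們" + to_unicode(raw)    # both operands are unicode, so no implicit ASCII decode
```

Decoding at the point where byte strings enter the program keeps every later concatenation in unicode, without touching the process-wide default.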

【The "coding: utf-8" file header】

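Without the file header, a=u'我' displayed garbled text while print '我' looked normal, so for now the header seems to matter mainly for unicode: it tells Python which codec to use when decoding the bytes of a u'...' literal, whereas a plain '...' literal keeps its raw bytes. The result may also depend on the operating system and on how the code is run. Note that Python 2.7 refuses to run a source file that contains non-ASCII bytes without such a header (SyntaxError, see PEP 263), so this observation applies to environments such as an interactive session.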
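The garbling can be reproduced by decoding the same bytes with the wrong codec. This is only a sketch: latin-1 is chosen here as an assumed "wrong" codec for illustration, not necessarily the one involved in the observation above.

```python
# -*- coding: utf-8 -*-
source_bytes = u"我".encode("utf-8")          # the 3 bytes that a UTF-8 source file contains

correct = source_bytes.decode("utf-8")        # decoded with the declared codec: one character
garbled = source_bytes.decode("latin-1")      # decoded with a wrong codec: three unrelated characters

# Nothing is lost yet: re-encoding with the wrong codec and decoding correctly repairs the text.
repaired = garbled.encode("latin-1").decode("utf-8")
```

A byte string literal never goes through this decoding step, which is consistent with print '我' looking normal on a UTF-8 terminal.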
