Converting File Encodings in Python

Method 1

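from itertools import islice
from multiprocessing import Process, Queue

# Assumption: a module-level queue shared by reader and writer (not defined in
# the original). The child must inherit it, which holds for the fork start method.
_queue = Queue()


def _read(from_filename, from_encode):
    # Read the source file in batches of up to 1,000,000 lines.
    with open(from_filename, "r", encoding=from_encode) as f:
        for lines in iter(lambda: tuple(islice(f, 1000000)), ()):
            _queue.put(lines)
    # None is the sentinel telling the writer that no more batches will arrive.
    _queue.put(None)


def convert_file_to_utf8(p_task, **kwargs):
    """
    :param p_task: id of the task whose XCom value is the source file path
    :param kwargs: task context; kwargs['ti'] must provide xcom_pull
                   (presumably an Airflow task instance)
    :return: None
    """
    local_file = kwargs['ti'].xcom_pull(task_ids=p_task)
    # Replace the last three characters (the extension) with "csv".
    convert_file = local_file[0:len(local_file) - 3] + "csv"
    th = Process(target=_read, args=(local_file, "gb18030"))
    th.start()
    with open(convert_file, "w", encoding="utf-8") as f:
        while True:
            lines = _queue.get()
            if lines is None:
                break
            f.write(''.join(lines))
    th.join()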
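
The reader/writer split used in Method 1 can be tried outside of a scheduler. The following is a minimal sketch, not the original author's code: it swaps the separate process for a `threading.Thread` and passes a `queue.Queue` explicitly, so it does not depend on the multiprocessing start method. The names `convert_encoding`, `_reader`, and `batch_lines` are hypothetical.

```python
import queue
import threading
from itertools import islice


def _reader(path, encoding, out_queue, batch_lines):
    """Read `path` in batches of lines and push each batch onto the queue."""
    try:
        with open(path, "r", encoding=encoding, newline="") as f:
            # islice(f, n) takes up to n lines; an empty tuple ends the loop at EOF.
            for lines in iter(lambda: tuple(islice(f, batch_lines)), ()):
                out_queue.put(lines)
    finally:
        # Always send the sentinel so the writer cannot wait forever.
        out_queue.put(None)


def convert_encoding(src, dst, src_encoding="gb18030", batch_lines=100_000):
    """Rewrite `src` (encoded as `src_encoding`) as UTF-8 at `dst`."""
    # Bounded queue: the reader blocks instead of buffering the whole file.
    batches = queue.Queue(maxsize=4)
    worker = threading.Thread(
        target=_reader, args=(src, src_encoding, batches, batch_lines)
    )
    worker.start()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        while True:
            lines = batches.get()
            if lines is None:
                break
            f.write("".join(lines))
    worker.join()
```

Opening both files with `newline=""` keeps the original line endings unchanged. The bounded queue is a design choice of this sketch; the listing above uses an unbounded one.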

Method 2

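# filename is the path of the GB18030-encoded source file.
with open(filename, 'r', encoding="gb18030") as f:
    i = 0  # number of rows processed
    for row_ in f:
        # The file object has already decoded the row from GB18030 to str;
        # the encode/decode round trip yields the same text.
        row = row_.strip().encode("utf-8").decode("utf-8")
        i += 1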
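
Method 2 as listed only reads and counts rows. A minimal sketch of a complete line-by-line conversion that also writes the result might look like the following, assuming the output goes to a second file. `convert_line_by_line` and its parameters are hypothetical names, and unlike the listing above it does not strip whitespace from each row.

```python
def convert_line_by_line(src, dst, src_encoding="gb18030"):
    """Copy `src` to `dst` as UTF-8, one line at a time; return the line count."""
    count = 0
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
            open(dst, "w", encoding="utf-8", newline="") as fout:
        for row in fin:
            # `row` is already a str; the output file encodes it as UTF-8 on write.
            fout.write(row)
            count += 1
    return count
```

Because only one line is held in memory at a time, this variant suits files too large to read at once.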

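Performance comparison: in a test on a file of about 1 MB, Method 1 took 44,893 microseconds and Method 2 took 49,015 microseconds. In this test, Method 1 took about 8% less time than Method 2 to convert the file.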
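
Timings like these depend on the machine, the disk cache, and the file contents, so they are worth reproducing locally. One way to take such a measurement is sketched below with `time.perf_counter_ns`. `elapsed_microseconds` is a hypothetical helper, and the source does not say how its figures were obtained.

```python
import time


def elapsed_microseconds(func, *args, repeat=5, **kwargs):
    """Call `func` `repeat` times and return the fastest run in microseconds."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter_ns()
        func(*args, **kwargs)
        elapsed = (time.perf_counter_ns() - start) / 1000  # ns -> microseconds
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    # Stand-in workload: transcode an in-memory GB18030 buffer to UTF-8.
    data = ("测试数据,123\n" * 50_000).encode("gb18030")
    print(elapsed_microseconds(lambda: data.decode("gb18030").encode("utf-8")))
```

Taking the fastest of several runs reduces the influence of one-off delays such as a cold disk cache. To compare the two methods, pass each conversion function and its file paths to the helper in place of the stand-in workload.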

