Converting Files Between ANSI and UTF-8

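A UTF-8 file can carry a distinctive header marker, the byte order mark (BOM): its first three bytes are 0xEF, 0xBB, 0xBF. The BOM is optional, so a UTF-8 file may also begin directly with its content.

An ANSI file has no header; its content starts at the first byte.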
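The header check described above can be sketched in a few lines of portable C++. This is a minimal illustration, not part of the original code: the names `has_utf8_bom` and `file_has_utf8_bom` are hypothetical. Because the BOM is optional, a negative result does not prove that a file is ANSI.

```cpp
#include <cstddef>
#include <cstdio>

// Returns true when the buffer starts with the UTF-8 BOM (EF BB BF).
bool has_utf8_bom(const unsigned char* data, std::size_t size)
{
    return size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
}

// Hypothetical helper: opens the file in binary mode and checks its first
// three bytes. Returns false if the file cannot be opened or is too short.
bool file_has_utf8_bom(const char* path)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr)
        return false;
    unsigned char head[3] = {0, 0, 0};
    std::size_t n = std::fread(head, 1, sizeof(head), fp);
    std::fclose(fp);
    return has_utf8_bom(head, n);
}
```

Reading in binary mode keeps the runtime from altering the bytes before they are compared.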

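Key point 2 for conversion:

When converting between UTF-8 and ANSI, Unicode must serve as the intermediate representation in both directions: UTF-8 is first converted to Unicode, and that Unicode text is then converted to ANSI. The reverse direction works the same way.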
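To make the two-step idea concrete, here is a portable sketch that goes through Unicode code points explicitly. It rests on some assumptions: Latin-1 (ISO-8859-1) stands in for the "ANSI" code page, because its byte values equal the code points, and all function names are hypothetical. The decoder is simplified: malformed sequences become U+FFFD, and overlong forms or surrogates are not rejected.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Step 1: decode UTF-8 bytes into Unicode code points.
std::vector<char32_t> utf8_to_codepoints(const std::string& in)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        int extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // invalid lead byte
        ++i;
        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated or malformed sequence
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : static_cast<char32_t>(0xFFFD));
    }
    return out;
}

// Step 2: encode code points into the single-byte target code page.
// Characters that Latin-1 cannot represent are replaced with '?'.
std::string codepoints_to_latin1(const std::vector<char32_t>& cps)
{
    std::string out;
    for (char32_t cp : cps)
        out.push_back(cp <= 0xFF
                          ? static_cast<char>(static_cast<unsigned char>(cp))
                          : '?');
    return out;
}

// Reverse direction: Latin-1 byte -> code point -> UTF-8 bytes.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    for (char c : in) {
        unsigned char b = static_cast<unsigned char>(c);  // byte value == code point
        if (b < 0x80) {
            out.push_back(static_cast<char>(b));
        } else {
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

A real ANSI code page such as GBK needs a lookup table for the second step, which is why code of this kind usually delegates both steps to the operating system's conversion functions.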

The following code converts an ANSI-encoded file into a UTF-8-encoded file:

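#include <windows.h>
#include <cstdio>
#include <cstring>
// Converts an ANSI file to UTF-8 (with BOM), written to newutf8.txt. Lines over 511 bytes may split a multibyte character.
int fileansi2utf8(char* strfilename)
{
    FILE *fp = fopen(strfilename, "rb");
    if (fp == NULL) return -1;
    BYTE head[3] = {0};
    fread(head, 1, 3, fp);
    // Check whether the file already carries a UTF-8 BOM
    if (head[0] == 0xef && head[1] == 0xbb && head[2] == 0xbf)
    { fclose(fp); return 0; }   // assumed: already UTF-8, nothing to convert
    fseek(fp, 0, SEEK_SET);
    char linebuffer[512] = {0};
    char linenew[1536] = {0};   // up to 3 UTF-8 bytes per ANSI byte
    wchar_t linewide[512] = {0};
    FILE *fpnew = fopen("newutf8.txt", "wb");
    if (fpnew == NULL)
    { fclose(fp); return -1; }
    const unsigned char arybom[] = {0xef, 0xbb, 0xbf};
    fwrite(arybom, sizeof(arybom), 1, fpnew);
    while (fgets(linebuffer, 512, fp))
    {
        // Assumed reconstruction (the original loop body was lost): ANSI -> Unicode (UTF-16) -> UTF-8
        MultiByteToWideChar(CP_ACP, 0, linebuffer, -1, linewide, 512);
        WideCharToMultiByte(CP_UTF8, 0, linewide, -1, linenew, sizeof(linenew), NULL, NULL);
        fwrite(linenew, 1, strlen(linenew), fpnew);
    }
    fclose(fp);
    fclose(fpnew);
    return 0;
}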

The following code converts a UTF-8-encoded file into an ANSI-encoded file:

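// Converts a UTF-8 file (with BOM) to ANSI, written to newansi.txt. Needs the same includes as fileansi2utf8.
int fileutf82ansi(char* strfilename)
{
    FILE *fp = fopen(strfilename, "rb");
    if (fp == NULL) return -1;
    BYTE head[3] = {0};
    fread(head, 1, 3, fp);
    // Check whether the file carries a UTF-8 BOM
    if (head[0] != 0xef || head[1] != 0xbb || head[2] != 0xbf)
    { fclose(fp); return -1; }  // assumed: no BOM, so the file is not treated as UTF-8
    fseek(fp, 3, SEEK_SET);     // skip the BOM
    char linebuffer[512] = {0};
    char linenew[1024] = {0};
    wchar_t linewide[512] = {0};
    FILE *fpnew = fopen("newansi.txt", "wb");
    if (fpnew == NULL)
    { fclose(fp); return -1; }
    // No BOM is written: an ANSI file has no header
    while (fgets(linebuffer, 512, fp))
    {
        // Assumed reconstruction (the original loop body was lost): UTF-8 -> Unicode (UTF-16) -> ANSI
        MultiByteToWideChar(CP_UTF8, 0, linebuffer, -1, linewide, 512);
        WideCharToMultiByte(CP_ACP, 0, linewide, -1, linenew, sizeof(linenew), NULL, NULL);
        fwrite(linenew, 1, strlen(linenew), fpnew);
    }
    fclose(fp);
    fclose(fpnew);
    return 0;
}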
