A static class for detecting a text file's encoding in C#

1. Define a static class, TextEncodingType, that detects a text file's encoding:

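using System;
using System.IO;
using System.Text;

namespace demo_menu
{
    public static class TextEncodingType
    {
        /// <summary>Determines a file's encoding from the given file stream.</summary>
        /// <param name="fs">The file stream.</param>
        /// <returns>The file's encoding.</returns>
        public static Encoding GetType(FileStream fs)
        {
            Encoding reVal = Encoding.Default;
            BinaryReader r = new BinaryReader(fs, Encoding.Default);
            byte[] ss = r.ReadBytes((int)fs.Length);
            r.Close(); // note: this also closes fs
            // Check the length before indexing; a UTF-16 BOM is only two bytes, so the third byte is not tested.
            if (IsUTF8Bytes(ss) || (ss.Length >= 3 && ss[0] == 0xEF && ss[1] == 0xBB && ss[2] == 0xBF))
            {
                reVal = Encoding.UTF8; // UTF-8, with or without a BOM
            }
            else if (ss.Length >= 2 && ss[0] == 0xFE && ss[1] == 0xFF)
            {
                reVal = Encoding.BigEndianUnicode;
            }
            else if (ss.Length >= 2 && ss[0] == 0xFF && ss[1] == 0xFE)
            {
                reVal = Encoding.Unicode;
            }
            return reVal;
        }

        /// <summary>Checks whether the data is UTF-8 without a BOM.</summary>
        private static bool IsUTF8Bytes(byte[] data)
        {
            int charByteCounter = 1; // bytes still expected for the character being analyzed
            for (int i = 0; i < data.Length; i++)
            {
                byte curByte = data[i];
                if (charByteCounter == 1)
                {
                    if (curByte >= 0x80)
                    {
                        // Count the leading 1 bits of the lead byte.
                        while (((curByte <<= 1) & 0x80) != 0)
                        {
                            charByteCounter++;
                        }
                        // A non-ASCII lead byte starts with at least two 1 bits, e.g. 110xxxxx ... 1111110x
                        if (charByteCounter == 1 || charByteCounter > 6)
                        {
                            return false;
                        }
                    }
                }
                else
                {
                    // In UTF-8, a continuation byte must have the form 10xxxxxx.
                    if ((curByte & 0xC0) != 0x80)
                    {
                        return false;
                    }
                    charByteCounter--;
                }
            }
            // A multi-byte sequence cut off at the end of the data is treated as not UTF-8.
            return charByteCounter == 1;
        }
    }
}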
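
To make the lead-byte rule in the UTF-8 check more concrete, here is a small sketch that counts the leading 1 bits of each byte of one encoded character. The names `Utf8LeadByteDemo` and `LeadingOnes` are made up for this illustration and are not part of the class above.

```csharp
using System;
using System.Text;

public static class Utf8LeadByteDemo
{
    // Counts the leading 1 bits of a byte: 0 for an ASCII byte, 1 for a
    // continuation byte (10xxxxxx), n >= 2 for the lead byte of an n-byte sequence.
    public static int LeadingOnes(byte b)
    {
        int count = 0;
        while ((b & 0x80) != 0)
        {
            count++;
            b <<= 1; // shift the next bit into the top position
        }
        return count;
    }

    public static void Main()
    {
        // U+4E2D encodes to three bytes in UTF-8: E4 B8 AD.
        byte[] bytes = Encoding.UTF8.GetBytes("\u4E2D");
        foreach (byte b in bytes)
        {
            Console.WriteLine("0x{0:X2}: {1} leading 1 bit(s)", b, LeadingOnes(b));
        }
    }
}
```

The first byte reports three leading 1 bits (a three-byte sequence) and the two that follow report one each, which is the pattern the detector walks through byte by byte.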

2. Call it wherever it is needed, for example:

private void mi_openfile_click(object sender, EventArgs e)
{
    // ...
}
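
As a rough idea of what a caller might do with the detected encoding, the console sketch below reads a file's bytes, picks an encoding and decodes the text. It is an assumption-laden stand-in, not the original handler: `OpenFileSketch`, `DetectByBom` and `ReadText` are hypothetical names, and the detector here only looks at the BOM so that the example stays self-contained.

```csharp
using System;
using System.IO;
using System.Text;

public static class OpenFileSketch
{
    // Hypothetical stand-in for the detector: BOM check only, falling back
    // to the caller-supplied encoding when no BOM is present.
    public static Encoding DetectByBom(byte[] head, Encoding fallback)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return Encoding.UTF8;
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return Encoding.BigEndianUnicode;
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return Encoding.Unicode;
        return fallback;
    }

    // Reads the whole file and decodes it with the detected encoding.
    public static string ReadText(string path, Encoding fallback)
    {
        byte[] bytes = File.ReadAllBytes(path);
        Encoding enc = DetectByBom(bytes, fallback);
        // GetString does not strip a BOM, so skip the preamble when the data starts with it.
        byte[] bom = enc.GetPreamble();
        int skip = StartsWith(bytes, bom) ? bom.Length : 0;
        return enc.GetString(bytes, skip, bytes.Length - skip);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        // Write a small UTF-16 LE file (with BOM), then read it back.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello", Encoding.Unicode);
        Console.WriteLine(ReadText(path, Encoding.Default));
        File.Delete(path);
    }
}
```

In a WinForms handler, the same `ReadText` call could supply the text for a text box once the user has chosen a file.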
