ICU: a string encoding detection and conversion example

Compile: g++ -o x x.cpp -licuuc -licui18n

Make sure the ICU library and its development headers are installed before compiling.

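#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unicode/ucnv.h>
#include <unicode/ucsdet.h>
#define BUF_MAX 4096

/*
 * data      in:  the bytes whose encoding should be detected
 * len       in:  number of bytes in data
 * detected  out: name of the most likely charset; the caller must free() it
 */
bool detectTextEncoding(const char *data, int32_t len, char **detected)
{
    UErrorCode status = U_ZERO_ERROR;
    int32_t matchCount = 0;
    UCharsetDetector *csd = ucsdet_open(&status);
    if (U_FAILURE(status)) return false;
    ucsdet_setText(csd, data, len, &status);
    const UCharsetMatch **csm = ucsdet_detectAll(csd, &matchCount, &status);
    const char *name = NULL;
    if (U_SUCCESS(status) && matchCount > 0)
        name = ucsdet_getName(csm[0], &status); // matches are sorted by confidence
    bool ok = U_SUCCESS(status) && name != NULL;
    if (ok) {
        *detected = strdup(name); // copy the name before the detector is closed
        printf("charset = %s\n", *detected);
    }
    ucsdet_close(csd);
    return ok;
}

/*
 * toConverterName    charset to convert to
 * fromConverterName  charset to convert from
 * target             out: buffer that receives the converted string
 * targetCapacity     size of target, in bytes
 * source             the string to convert
 * sourceLength       size of source, in bytes
 */
int convert(const char *toConverterName, const char *fromConverterName,
            char *target, int32_t targetCapacity, const char *source, int32_t sourceLength)
{
    UErrorCode error = U_ZERO_ERROR;
    ucnv_convert(toConverterName, fromConverterName, target, targetCapacity, source, sourceLength, &error);
    return error;
}

int main(int argc, char **argv)
{
    const char *filename = (argc > 1) ? argv[1] : ""; // an empty name makes fopen fail
    FILE *file = fopen(filename, "rb");
    if (file == NULL) {
        printf("Cannot open file \"%s\"\n", filename);
        return -1;
    }
    char *detected = NULL;
    char *buffer = new char[BUF_MAX];
    char *target = new char[BUF_MAX * 2];
    while (true) {
        memset(target, 0, BUF_MAX * 2);
        int len = (int)fread(buffer, sizeof(char), BUF_MAX, file);
        // Detect once, on the first chunk, and reuse the result afterwards.
        if (detected == NULL && !detectTextEncoding(buffer, len, &detected)) break;
        // Convert to UTF-8. Any status other than U_ZERO_ERROR (warnings included) stops the loop.
        if (convert("UTF-8", detected, target, BUF_MAX * 2, (const char *)buffer, len) != U_ZERO_ERROR) {
            printf("ucnv_convert error\n");
            break;
        }
        printf("%s", target); // print the converted text of the file
        if (len < BUF_MAX) break;
    }
    delete[] buffer;
    delete[] target;
    free(detected); // allocated by strdup()
    fclose(file);
    return 0;
}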
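
The detection step in the listing above yields a best guess. When a file happens to begin with a Unicode byte-order mark, the mark alone names the encoding, so a cheap check could run before the detector is consulted. The following is only a sketch under that assumption: the helper name detectBom and the charset labels it returns are hypothetical and not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not in the original listing): returns a charset
// label if data starts with a Unicode byte-order mark, NULL otherwise.
const char *detectBom(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    static const unsigned char utf32be[] = {0x00, 0x00, 0xFE, 0xFF};
    static const unsigned char utf32le[] = {0xFF, 0xFE, 0x00, 0x00};
    static const unsigned char utf8[]    = {0xEF, 0xBB, 0xBF};
    static const unsigned char utf16be[] = {0xFE, 0xFF};
    static const unsigned char utf16le[] = {0xFF, 0xFE};

    // UTF-32 is tested before UTF-16 because the UTF-32LE mark
    // (FF FE 00 00) begins with the UTF-16LE mark (FF FE). A UTF-16LE
    // text whose first character is U+0000 would be misread as UTF-32LE.
    if (len >= 4 && memcmp(data, utf32be, 4) == 0) return "UTF-32BE";
    if (len >= 4 && memcmp(data, utf32le, 4) == 0) return "UTF-32LE";
    if (len >= 3 && memcmp(data, utf8, 3) == 0)    return "UTF-8";
    if (len >= 2 && memcmp(data, utf16be, 2) == 0) return "UTF-16BE";
    if (len >= 2 && memcmp(data, utf16le, 2) == 0) return "UTF-16LE";
    return NULL; // no mark found: fall back to statistical detection
}
```

In main, such a check could be applied to the first chunk only, falling back to the detector when it returns NULL. Because the caller releases detected at the end, the returned string literal would have to be copied first, for example with strdup.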

