ICU4C charset detection and conversion, C++ version

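1. Introduction to ICU4C

ICU4C is the C/C++ edition of ICU. ICU (International Components for Unicode) is an open-source project, based on the "IBM Public License" and developed in cooperation with open-source organizations, that supports software internationalization. ICU4C gives the C/C++ platform strong internationalization capabilities, and developers can use it to solve almost any internationalization problem. Following the customs and language conventions of each locale, it formats and parses numbers, currencies, times, dates and messages, and it provides case conversion, collation, searching and sorting for strings. It is also worth mentioning that ICU4C ships a powerful BiDi algorithm, which gives thorough support for bidirectional languages such as Arabic.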

2. Build and install

tar -zxvf icu4c-49_1_2-src.tgz
cd icu/source
./configure
make
make install

3. Code

3.1 myicu.h

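#ifndef myicu_h
#define myicu_h

#include "unicode/utypes.h"
#include "unicode/ucsdet.h"
#include "unicode/ucnv.h"
#include <iostream>   // header name lost in the source; <iostream> assumed because cout is used

using namespace std;

#define buf_max 4096

// The class body is missing from the source; the members below are
// inferred from the definitions that survive in myicu.cpp.
class myicu
{
public:
    myicu(const char* filename);
    ~myicu();
    bool detecttextencoding();   // detect the encoding of the file
private:
    const char* m_filename;      // member type is an assumption
};

#endif //myicu_h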


3.2 myicu.cpp

#include "myicu.h"

const int buffsize = 8192;

myicu::myicu(const char* filename) : m_filename(filename)
{
    // ...
}

myicu::~myicu()
{
    // ...
}

bool myicu::detecttextencoding()
{
    // ...
    if (matchcount > 0)
    {
        cout << "charset = "; // ...
    }
    // ...
}


3.3 main.cpp

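#include "myicu.h"
#include <iostream>

#define buf_max 4096

int main(){
    const char* filename = "123.txt";
    myicu myicu(filename);
    //char* buff = new char[126];
    bool flag = myicu.detecttextencoding();
    if(!flag){
        std::cout<<"Parse error!"<<std::endl;
        // ...
    }
    // ...
}

If the ICU shared libraries are not found when the program runs, add the /usr/local/ directory (a default install places the libraries in /usr/local/lib) to the dynamic linker's search path, then run

ldconfig

and that is all it takes.

You can try it with files you have prepared yourself.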
