Inverted Index Example (Part 1)

(1) The first test uses three text files:

File contents:

The data in a.txt, b.txt, and c.txt:

(2) Expected output of the first pass:

inverted a.txt 3

inverted b.txt 1

inverted c.txt 3

mapreduce a.txt 2

mapreduce b.txt 2

mapreduce c.txt 3

hadoop a.txt 1

hadoop b.txt 1

hadoop c.txt 2

hdfs a.txt 1

hdfs b.txt 1

hdfs c.txt 2

index a.txt 4

index b.txt 1

index c.txt 3

map a.txt 1

map b.txt 1

map c.txt 1

reduce a.txt 1

reduce b.txt 1

reduce c.txt 1

倒排索引 a.txt 1

倒排索引 b.txt 1

倒排索引 c.txt 1

大資料 a.txt 1

大資料 b.txt 1

大資料 c.txt 1

Mapper business logic:

1. Get the file object ---> 2. Get the file's name from it --->

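import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
/**
 * Expected output of the first pass (key: word--fileName, value: count)
 * itstar--a.txt	3
 * itstar--b.txt	2
 * itstar--c.txt	2
 * pingping--a.txt	1
 * pingping--b.txt	3
 * pingping--c.txt	1
 * ss--a.txt	2
 * ss--b.txt	1
 * ss--c.txt	1
 */
class IndexMapperOne extends Mapper<LongWritable, Text, Text, IntWritable> { // class name is an assumption
    String name;
    Text t = new Text();
    IntWritable v = new IntWritable(1);
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Get the file split for this task, then the name of the file it belongs to
        name = ((FileSplit) context.getInputSplit()).getPath().getName();
    }
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        for (String word : value.toString().split("\\s+")) { // assumes whitespace-separated words
            if (word.isEmpty()) continue;
            t.set(word + "--" + name); // composite key: word--fileName
            context.write(t, v);       // emit a count of 1 per occurrence
        }
    }
}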
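
The map step can also be tried without a cluster. The following is a minimal plain-Java sketch of the same idea, assuming words are separated by whitespace; `MapStepSketch`, `emitKeys`, and the sample line are made up for illustration and are not part of the original job.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the map step; no Hadoop classes are involved.
class MapStepSketch {
    // Hypothetical helper: turns one line of a file into "word--fileName" keys.
    // Assumes words are separated by whitespace.
    static List<String> emitKeys(String fileName, String line) {
        List<String> keys = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // In the real mapper each key would be written with a count of 1.
                keys.add(word + "--" + fileName);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Made-up sample line, not the contents of the original a.txt.
        System.out.println(emitKeys("a.txt", "hadoop hdfs hadoop"));
        // prints [hadoop--a.txt, hdfs--a.txt, hadoop--a.txt]
    }
}
```

Putting the file name into the key means that identical words from different files land in different groups, so the reducer can count per word and per file.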

Reducer business logic:

1. Define a counter ---> 2. Iterate over the values to total how many times the word occurs ---> 3. Write the result

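/** Reduce: total the occurrences of each word in each file. */
// Needs imports: java.io.IOException, org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Reducer
class IndexReduceDrive extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int count = 0;                              // 1. the counter
        for (IntWritable value : values) {          // 2. iterate over the values for this key
            count += value.get();
        }
        context.write(key, new IntWritable(count)); // 3. write word--fileName and its total
    }
}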
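
The three reducer steps can be sketched in plain Java as well. This is only an illustration under assumptions: `ReduceStepSketch`, `group`, and `sum` are hypothetical names, `group` stands in for the shuffle that Hadoop performs between map and reduce, and the sample keys are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of shuffle + reduce; no Hadoop classes are involved.
class ReduceStepSketch {
    // Mirrors the reducer: 1. define a counter, 2. iterate over the values, 3. return the total.
    static int sum(Iterable<Integer> values) {
        int count = 0;
        for (int v : values) {
            count += v;
        }
        return count;
    }

    // Hypothetical stand-in for the shuffle: collects a 1 per occurrence under each key,
    // with keys kept in sorted order.
    static Map<String, List<Integer>> group(List<String> keys) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : keys) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Made-up map output, not taken from the original files.
        List<String> keys = Arrays.asList("hadoop--a.txt", "hdfs--a.txt", "hadoop--a.txt", "hadoop--b.txt");
        for (Map.Entry<String, List<Integer>> e : group(keys).entrySet()) {
            System.out.println(e.getKey() + "\t" + sum(e.getValue()));
        }
    }
}
```

With the sample keys above, `hadoop--a.txt` is grouped with two values and therefore totals 2, while the other two keys total 1 each.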

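Driver business logic:

1. Get the configuration ---> 2. Define the job ---> 3. Configure the job's main class, map class, and reduce class ---> 4. Set the map class's input and output ---> 5. Set the reducer class ---> 6. Check whether the output path already exists ---> 7. Set the input path ---> 8. Set the output path ---> 9. Finally, wait for the job to finish and check whether it succeeded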

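import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IndexDriveOne {
    public static void main(String[] args) throws Exception {
        // Get the configuration
        Configuration conf = new Configuration();
        // Get the job
        Job job1 = Job.getInstance(conf);
        // Set the jar by the driver class
        job1.setJarByClass(IndexDriveOne.class);
        // Set the map class and its output types
        // (IndexMapperOne is an assumed name for the mapper class shown earlier)
        job1.setMapperClass(IndexMapperOne.class);
        job1.setMapOutputKeyClass(Text.class);
        job1.setMapOutputValueClass(IntWritable.class);
        // Set the reducer class and the final output types
        job1.setReducerClass(IndexReduceDrive.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(IntWritable.class);
        // Check whether the output path already exists
        Path path = new Path(args[1]);
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(path)) {
            // Assumption: an existing output directory is deleted so the job can be rerun
            fs.delete(path, true);
        }
        // Set the input path
        FileInputFormat.setInputPaths(job1, new Path(args[0]));
        // Set the output path
        FileOutputFormat.setOutputPath(job1, path);
        // Exit with 0 if the job succeeded, 1 otherwise
        System.exit(job1.waitForCompletion(true) ? 0 : 1);
    }
}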
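
The driver reads `args[0]` and `args[1]` directly, so running it without two arguments would fail with an `ArrayIndexOutOfBoundsException`. A small guard like the one below could be called at the top of `main`. It is only a sketch: `DriverArgsSketch`, `parseArgs`, and the sample paths are hypothetical and do not appear in the original driver.

```java
// Plain-Java sketch of an argument check for the driver; no Hadoop classes are involved.
class DriverArgsSketch {
    // Hypothetical helper: returns {inputPath, outputPath} or throws with a usage message.
    static String[] parseArgs(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException("Usage: IndexDriveOne <input path> <output path>");
        }
        return new String[] { args[0], args[1] };
    }

    public static void main(String[] args) {
        // Made-up paths for illustration only.
        String[] paths = parseArgs(new String[] { "/input", "/output" });
        System.out.println("input=" + paths[0] + ", output=" + paths[1]);
        // prints input=/input, output=/output
    }
}
```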