The core idea of MapReduce is divide and conquer: split the input into pieces, process each piece independently (map), then merge the partial results (reduce).
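The idea can be seen without Hadoop at all. The sketch below is plain Java with illustrative names (it is not the Hadoop API): the inner loop plays the role of the map step, emitting a (word, 1) pair per word, and `merge` plays the role of the shuffle/reduce step, summing the pairs that share a key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MiniMapReduce {
    // In-memory map + shuffle/reduce: count words across all lines
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {                  // divide: one line at a time
            for (String word : line.split(" ")) {    // map: emit (word, 1)
                counts.merge(word, 1, Integer::sum); // reduce: sum per key
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("hello world", "hello mapreduce")));
    }
}
```

The real job below does exactly this, except the map and reduce halves run as separate distributed tasks and the grouping happens in Hadoop's shuffle.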
The examples below are meant for basic learning on small amounts of data, so there is no need to connect to a virtual machine. For each example, the result can be viewed in the part-r-00000 file inside the output folder the job creates.

Word count: given a file containing a number of words, count how many times each word appears.
package qfnu;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Mapper component: emit (word, 1) for every word in the input line
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String word : value.toString().split(" "))
            context.write(new Text(word), new IntWritable(1));
    }
}

// Reducer component: sum the 1s collected for each word
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        for (IntWritable v : values) count += v.get();
        context.write(key, new IntWritable(count));
    }
}

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));  // input folder
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output folder, e.g. D:/hadooptest/ansofcount
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
We can see the result in D:/hadooptest/ansofcount/part-r-00000.
An inverted index exists to make searching more convenient: instead of scanning every document for a word, you look the word up and immediately get the list of documents that contain it.
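Before the MapReduce version, it helps to see the shape of the result: a mapping from each word to the documents that contain it, here with per-document counts. A minimal in-memory sketch (the file names are made up for illustration):

```java
import java.util.Map;
import java.util.TreeMap;

public class MiniInvertedIndex {
    // Build word -> {docName -> count} from a docName -> text map
    static Map<String, Map<String, Integer>> build(Map<String, String> docs) {
        Map<String, Map<String, Integer>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (String word : doc.getValue().split(" ")) {
                index.computeIfAbsent(word, w -> new TreeMap<>())
                     .merge(doc.getKey(), 1, Integer::sum);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new TreeMap<>();
        docs.put("a.txt", "hello world hello");
        docs.put("b.txt", "hello hadoop");
        System.out.println(build(docs).get("hello")); // prints {a.txt=2, b.txt=1}
    }
}
```

The MapReduce version below produces the same structure, using "word:filename" composite keys so that the combiner can sum counts before the reducer joins the file lists.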
package qfnu;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Mapper: key is "word:filename", value is "1"
class InvertedIndexMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Text keyInfo = new Text();
    private final Text valueInfo = new Text("1");
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String fileName = ((FileSplit) context.getInputSplit()).getPath().getName();
        for (String word : value.toString().split(" ")) {
            keyInfo.set(word + ":" + fileName);
            context.write(keyInfo, valueInfo);
        }
    }
}

// Combiner: sum the counts per "word:filename", then re-key by word alone
class InvertedIndexCombiner extends Reducer<Text, Text, Text, Text> {
    private final Text keyInfo = new Text();
    private final Text valueInfo = new Text();
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        for (Text v : values) count += Integer.parseInt(v.toString());
        int splitIndex = key.toString().indexOf(":");
        keyInfo.set(key.toString().substring(0, splitIndex));
        valueInfo.set(key.toString().substring(splitIndex + 1) + ":" + count);
        context.write(keyInfo, valueInfo);
    }
}

// Reducer: join all "filename:count" entries for each word
class InvertedIndexReducer extends Reducer<Text, Text, Text, Text> {
    private final Text valueInfo = new Text();
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        StringBuilder fileList = new StringBuilder();
        for (Text v : values) fileList.append(v.toString()).append(";");
        valueInfo.set(fileList.toString());
        context.write(key, valueInfo);
    }
}

public class InvertedIndexDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(InvertedIndexDriver.class);
        job.setMapperClass(InvertedIndexMapper.class);
        job.setCombinerClass(InvertedIndexCombiner.class);
        job.setReducerClass(InvertedIndexReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Deduplication: the shuffle stage groups identical keys together, so emitting every line as a key (with an empty value) removes duplicate records for free.

package qfnu;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Mapper: emit each line as the key; duplicates collapse in the shuffle
class DedupMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(value, NullWritable.get());
    }
}

// Reducer: each distinct line arrives exactly once as a key
class DedupReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
    @Override
    protected void reduce(Text key, Iterable<NullWritable> values, Context context)
            throws IOException, InterruptedException {
        context.write(key, NullWritable.get());
    }
}

public class DedupDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(DedupDriver.class);
        job.setMapperClass(DedupMapper.class);
        job.setReducerClass(DedupReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
TopN: find the N largest numbers in the input (N = 5 is assumed here as an example). Each mapper keeps its own local top N in a descending TreeMap and emits it in cleanup(); a single reducer then merges the candidates into the global top N.

package qfnu;

import java.io.IOException;
import java.util.Comparator;
import java.util.TreeMap;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Mapper: keep a local top N in a descending TreeMap, emit it in cleanup()
class TopNMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final int N = 5; // assumed N
    private final TreeMap<Integer, Integer> treeMap = new TreeMap<>(new Comparator<Integer>() {
        @Override
        public int compare(Integer a, Integer b) { return b - a; } // largest first
    });
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String num : value.toString().split(" ")) {
            treeMap.put(Integer.parseInt(num), 1);
            if (treeMap.size() > N) treeMap.remove(treeMap.lastKey()); // drop the smallest
        }
    }
    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        for (Integer i : treeMap.keySet())
            context.write(new Text("top"), new IntWritable(i)); // one constant key -> one reduce group
    }
}

// Reducer: merge the per-mapper candidates and keep the global top N
class TopNReducer extends Reducer<Text, IntWritable, IntWritable, NullWritable> {
    private static final int N = 5;
    private final TreeMap<Integer, Integer> treeMap = new TreeMap<>(new Comparator<Integer>() {
        @Override
        public int compare(Integer a, Integer b) { return b - a; }
    });
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        for (IntWritable v : values) {
            treeMap.put(v.get(), 1);
            if (treeMap.size() > N) treeMap.remove(treeMap.lastKey());
        }
        for (Integer i : treeMap.keySet())
            context.write(new IntWritable(i), NullWritable.get());
    }
}

public class TopNDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(TopNDriver.class);
        job.setMapperClass(TopNMapper.class);
        job.setReducerClass(TopNReducer.class);
        job.setNumReduceTasks(1); // all candidates must meet in one reducer
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}