HBase and MapReduce Integration, with an Example

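1. Write the Mapper

Text word = new Text();
IntWritable one = new IntWritable(1);

/**
 * map function: processes the input line by line, one call per line.
 * For example, if test1.txt contains 4 lines, map() is called 4 times.
 */
@Override
protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {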

}
}

2. Write the Reducer

public class MyReduce extends TableReducer
result.set(sum);
// Row key: MD5 hex digest of the word
Put put = new Put(Bytes.toBytes(MD5Hash.getMD5AsHex(Bytes.toBytes(k2.toString()))));
// Column family f1 stores the word itself and its total count
put.addColumn(Bytes.toBytes("f1"), Bytes.toBytes("word"), Bytes.toBytes(k2.toString()));
put.addColumn(Bytes.toBytes("f1"), Bytes.toBytes("sum"), Bytes.toBytes(result.toString()));
context.write(new Text(MD5Hash.getMD5AsHex(Bytes.toBytes(k2.toString()))), put);
}
}
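The mapper and reducer fragments above do not show their method bodies, so the data flow may be easier to follow in a plain-Java sketch that has no Hadoop or HBase dependency. Everything named here (`WordCountSketch`, `countWords`, `rowKey`) is hypothetical, and splitting each line on whitespace is an assumption, since the original mapper body is not shown. `rowKey` is assumed to produce the same kind of value as `MD5Hash.getMD5AsHex`: a lowercase hex MD5 digest of the word's bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the job's data flow; this is not the Hadoop/HBase API.
final class WordCountSketch {

    // Map + shuffle + reduce collapsed into one pass: each line is handled once,
    // and every word contributes a 1 that is summed per word.
    // Assumption: words are separated by whitespace.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> sums = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    sums.merge(word, 1, Integer::sum);
                }
            }
        }
        return sums;
    }

    // Lowercase hex MD5 digest of the word's UTF-8 bytes, used as the row key.
    static String rowKey(String word) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(word.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = List.of("hello world", "hello hbase");
        // One output row per word: row key, then the f1:word and f1:sum cells.
        countWords(lines).forEach((word, sum) ->
                System.out.println(rowKey(word) + "  f1:word=" + word + "  f1:sum=" + sum));
    }
}
```

Hashing the word into the row key spreads rows across the key space instead of clustering them alphabetically. The word is still stored in `f1:word`, because the hash alone cannot be reversed to recover it.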

3. Write the Driver

public class WordCount {
}

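Notes:

1. Copy the Hadoop configuration files into the project's src directory.

The configuration files are:

core-site.xml

hdfs-site.xml

mapred-site.xml

yarn-site.xml

2. Add the matching Hadoop JARs to the classpath.

3. Run it!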
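For note 1, the files are copied into src so that they end up on the runtime classpath, which is where Hadoop's `Configuration` looks for `core-site.xml` and the other files. A small check like the following can confirm this before submitting the job. It is only a sketch: the class and method names (`ConfigCheck`, `missingFromClasspath`) are hypothetical, and it assumes the build copies files from src to the output directory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports which configuration files are not visible on the classpath.
final class ConfigCheck {

    // File names taken from the list in note 1.
    static final String[] REQUIRED = {
        "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "yarn-site.xml"
    };

    // Returns the subset of names that cannot be found as classpath resources.
    static List<String> missingFromClasspath(String... names) {
        ClassLoader loader = ConfigCheck.class.getClassLoader();
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingFromClasspath(REQUIRED);
        if (missing.isEmpty()) {
            System.out.println("All Hadoop configuration files are on the classpath.");
        } else {
            System.out.println("Missing from the classpath: " + missing);
        }
    }
}
```

If a file is reported missing, the job will likely fall back to default settings (for example the local file system instead of HDFS), so it is worth fixing before step 3.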
