Classic WordCount problem: counting words with Hive

Prepare the data

Create the database and tables in Hive, then load the data

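-- Create the database and switch to it
create database interview;
use interview;
-- Create the source table; each row holds one line of the input file as a string
create table wordcount(line string);
-- Load the local file into the table, replacing any existing data
load data local inpath '/home/data/wordcount' overwrite into table wordcount;
-- Create a table that stores one word per row
create table wordsingle(word string);
-- Split each line of wordcount on spaces and store the words one per row
insert into table wordsingle select explode(split(line, ' ')) as word from wordcount;
-- Count the occurrences of each word
select word, count(word) as con from wordsingle group by word;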
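
To make the two steps concrete, here is a minimal Python sketch of what the queries above compute: split every line on a single space, flatten the result into one word per element (the role of explode), then count per word (the role of group by). It is not Hive code; the function name word_count and the sample lines are made up for illustration.

```python
from collections import Counter

def word_count(lines):
    """Mimic explode(split(line, ' ')) followed by group by word."""
    # Step 1: split each line on a single space and flatten the pieces
    # into one word per element (what the wordsingle table holds).
    words = [word for line in lines for word in line.split(" ")]
    # Step 2: count how many times each word occurs.
    return dict(Counter(words))

if __name__ == "__main__":
    # Hypothetical sample input, one string per line of the file.
    sample = ["hello hive", "hello storm"]
    for word, con in sorted(word_count(sample).items()):
        print(word, con)
```

Like split(line, ' ') in Hive, line.split(" ") yields empty strings when a line contains consecutive spaces, so those are counted as a "word" too unless they are filtered out.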

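Implementing WordCount with Storm

More bolts may be needed here: one spout pushes the data, one bolt splits the sentences into words, and another bolt does the counting. The key point is that identical words must always be handled by the same bolt, so the grouping strategy has to be chosen carefully: group by the emitted word, that is, fields grouping. This class is only responsible for sending the prepared data downstream and does nothing else. public cla...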
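
The reason for grouping by word can be sketched in plain Python. This is not the Storm API and not how Storm implements fields grouping internally; it only illustrates the idea of hashing on the word so that every occurrence reaches the same counter. The names NUM_COUNTERS, task_for and run are hypothetical.

```python
import zlib
from collections import Counter

NUM_COUNTERS = 3  # hypothetical number of counting-bolt tasks

def task_for(word, num_tasks=NUM_COUNTERS):
    # A stable hash of the word picks the task, so the same word
    # is always routed to the same counter.
    return zlib.crc32(word.encode("utf-8")) % num_tasks

def run(sentences, num_tasks=NUM_COUNTERS):
    # One Counter per simulated counting task.
    counters = [Counter() for _ in range(num_tasks)]
    for sentence in sentences:            # role of the spout: emit sentences
        for word in sentence.split():     # role of the split bolt: emit words
            counters[task_for(word, num_tasks)][word] += 1  # counting bolt
    return counters

if __name__ == "__main__":
    for i, counter in enumerate(run(["a b a", "b a"])):
        print(i, dict(counter))
```

If words were distributed at random instead, the count for one word would be spread over several tasks and no single task would hold the full total.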

WordCount in Scala

1. Define the list: val stringlist list hello aythna hello kirito hello liliya hello luluxiu hello nana hello sandy 2. Flatten the list by calling the flatMap method, splitting on spaces: stringlist.flatmap x ...

Notes on WordCount with Hadoop

I had forgotten how to run the MapReduce wordcount I wrote a while ago: hadoop jar home dmc hadoop share hadoop tools lib hadoop streaming 2.6.0.jar reducer reducer1.py file reducer1.py inp...
