Small Hive Examples

Data

record_time: call time

imei: base station ID

cell: phone number

drop_num: dropped-call seconds

duration: total call duration in seconds

2011-07-13 00:00:00+08,356966,29448-37062,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,352024,29448-51331,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,353736,29448-51331,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,353736,29448-51333,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,351545,29448-51333,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,353736,29448-51343,1,0,0,8,0,g,0
2011-07-13 00:00:00+08,359681,29448-51462,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,354707,29448-51462,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,356137,29448-51470,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,352739,29448-51971,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,354154,29448-51971,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,127580,29448-51971,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,354264,29448-51973,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,354733,29448-51973,1,0,0,36,0,g,0
2011-07-13 00:00:00+08,356807,29448-51973,0,0,0,0,0,g,0
2011-07-13 00:00:00+08,125470,29448-51973,1,0,0,13,0,g,0
2011-07-13 00:00:00+08,353530,29448-52061,1,0,0,46,0,g,0
2011-07-13 00:00:00+08,352417,29448-5231,1,0,0,2,0,g,0
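To make the row layout concrete, here is a minimal Python sketch that maps one sample row onto named fields. The column order is an assumption: the names come from the field list and the cell_monitor table definition in this article, with cell placed third so that the names line up with the ten comma-separated values. The helper name parse_rows is hypothetical.

```python
import csv
from io import StringIO

# Assumed column order: names taken from the article, cell placed third
# so that ten names match the ten comma-separated values in each row.
COLUMNS = ["record_time", "imei", "cell", "ph_num", "call_num",
           "drop_num", "duration", "drop_rate", "net_type", "erl"]

# One row copied from the sample data.
sample = "2011-07-13 00:00:00+08,353736,29448-51343,1,0,0,8,0,g,0\n"

def parse_rows(text):
    """Parse comma-delimited text into a list of dicts keyed by column name."""
    reader = csv.reader(StringIO(text))
    return [dict(zip(COLUMNS, fields)) for fields in reader]

rows = parse_rows(sample)
print(rows[0]["imei"], rows[0]["cell"], rows[0]["duration"])
```

All values stay strings here; Hive performs the conversion to int and double when it reads the file through the table definition.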

Source table

create table cell_monitor(
  record_time string,
  imei string,
  cell string,
  ph_num int,
  call_num int,
  drop_num int,
  duration int,
  drop_rate double,
  net_type string,
  erl string
)
row format delimited fields terminated by ','
stored as textfile;

Create the result table

create table cell_drop_monitor(
  imei string,
  total_call_num int,
  total_drop_num int,
  d_rate double
)
row format delimited fields terminated by '\t'
stored as textfile;

Load the raw data

load data local inpath '/test/cdr_summ_imei_cell_info.csv' into table cell_monitor;

Aggregation SQL statement

from cell_monitor cm  
insert overwrite table cell_drop_monitor
select cm.imei,sum(cm.drop_num),sum(cm.duration),sum(cm.drop_num)/sum(cm.duration) d_rate
group by cm.imei
sort by d_rate desc;

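Aliases

cm is an alias for cell_monitor. The select takes the base station ID, the sum of dropped-call seconds, and the sum of call duration, then divides the first sum by the second to get the drop rate, aliased as d_rate. The insert is positional, so sum(cm.drop_num) lands in total_call_num and sum(cm.duration) lands in total_drop_num.

Rows are grouped by cm.imei.

sort by d_rate desc; sorts in descending order. Note that sort by only orders rows within each reducer; order by produces a single global ordering.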
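The group, sum, divide, and sort steps can be mirrored in a few lines of plain Python. This is only a sketch of the logic, not how Hive executes the query. The input rows are hypothetical values invented for illustration (the sample data above has drop_num equal to 0 in every row, so it would not show the sorting), and drop_rates is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (imei, drop_num, duration) rows, invented for illustration only.
rows = [
    ("A", 2, 10),
    ("B", 1, 20),
    ("A", 1, 20),
    ("C", 0, 0),
]

def drop_rates(rows):
    """Group by imei, sum drop_num and duration, and sort by the ratio, descending."""
    totals = defaultdict(lambda: [0, 0])  # imei -> [sum(drop_num), sum(duration)]
    for imei, drop_num, duration in rows:
        totals[imei][0] += drop_num
        totals[imei][1] += duration

    result = []
    for imei, (drops, dur) in totals.items():
        # Hive returns NULL for division by zero; None stands in for NULL here.
        rate = drops / dur if dur else None
        result.append((imei, drops, dur, rate))

    # Descending by rate; rows without a rate are placed last in this sketch.
    result.sort(key=lambda r: (r[3] is None, -(r[3] or 0.0)))
    return result

for row in drop_rates(rows):
    print(row)
```

With these made-up rows, A comes first (3 dropped seconds over 30), then B (1 over 20), and C last because its total duration is zero.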

Create the table (word count example)

create table docs(line string);

Load data into the table

load data local inpath '/test/wc.txt' into table docs;

Split each line on spaces to produce an array

select split(line,' ') from docs;

Run it:

hive> select split(line,' ') from docs;
OK
["from","cell_monitor","cm","",""]
["insert","overwrite","table","cell_drop_monitor"]
[""]
["from","cell_monitor","cm","",""]
["insert",""]

explode(array): a single array record holds multiple elements. explode splits them apart so that each element becomes its own row.

hive> select explode(split(line,' ')) from docs;
OK
from
cell_monitor
cm
insert
overwrite
table
cell_drop_monitor
from
cell_monitor
cm
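What split and explode do to the rows can be mimicked in plain Python. The sketch below is an illustration of the behavior, not Hive code. The two input lines are reconstructed from the arrays printed above, including the two trailing spaces on the first line, and the helper names split_line and explode are hypothetical.

```python
# Two lines reconstructed from the arrays printed above (note the trailing spaces).
lines = ["from cell_monitor cm  ", "insert overwrite table cell_drop_monitor"]

def split_line(line):
    # Like split(line, ' ') in the output above: trailing spaces yield empty strings.
    return line.split(" ")

def explode(arrays):
    # Like explode(array): one output row per array element.
    return [item for array in arrays for item in array]

arrays = [split_line(line) for line in lines]
words = explode(arrays)

print(arrays[0])
print(len(words))
```

The empty strings produced by the trailing spaces survive the flattening step, which is why an empty word shows up in the final counts.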

Create the result table

create table wc(word string,totalword int);

Aggregation SQL statement

from (select explode(split(line,' ')) as word from docs) w
insert into table wc
select word, count(1) as totalword
group by word
order by word;

Result

hive> select * from wc;
OK
	6
cell_drop_monitor	1
cell_monitor	2
cm	2
from	2
insert	2
overwrite	1
table	1
Time taken: 0.18 seconds, Fetched: 8 row(s)
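The first result row is the empty string, counted 6 times. A short Python sketch of the same pipeline shows where it comes from. The five input lines are reconstructed from the arrays that split printed earlier, so their exact whitespace is an assumption, and word_count with its skip_empty flag is a hypothetical helper, not something Hive provides.

```python
from collections import Counter

# Lines reconstructed from the split output shown earlier (whitespace is assumed).
lines = [
    "from cell_monitor cm  ",
    "insert overwrite table cell_drop_monitor",
    "",
    "from cell_monitor cm  ",
    "insert ",
]

def word_count(lines, skip_empty=False):
    """Split on single spaces, count each word, and return pairs sorted by word."""
    words = (word for line in lines for word in line.split(" "))
    if skip_empty:
        words = (word for word in words if word != "")
    return sorted(Counter(words).items())

for word, total in word_count(lines):
    print(repr(word), total)
```

Trailing spaces and the blank line each contribute empty strings, which add up to the count of 6. Passing skip_empty=True drops them in this sketch; in the Hive query the equivalent would be a filter on word before grouping.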
