07 Distributed Data Warehouse: Hive Functions

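Hive functions fall into two groups: built-in functions and user-defined functions.

There are more than 100 built-in functions, including basic functions (map side), aggregate functions (reduce side), collection functions (map side), and other functions.

User-defined functions include UDFs (map side) and UDAFs (reduce side).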

show functions;

desc function from_unixtime;

desc function extended from_unixtime;

1 Simple functions (executed on the map side)

Syntax: from_unixtime(bigint unixtime[, string format])

select from_unixtime(1323308943,'yyyyMMdd') from user;
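The format string follows Java SimpleDateFormat-style patterns, where the case of the letters matters: MM is the month and mm is the minute. A minimal sketch of the difference, assuming a Hive version that allows select without a from clause (otherwise append a from clause on any table); the result depends on the session time zone, so no output is shown:

```sql
-- MM = month of year, mm = minute of hour
select from_unixtime(1323308943, 'yyyy-MM-dd HH:mm:ss');

-- date only, month in the middle
select from_unixtime(1323308943, 'yyyyMMdd');
```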

select * from user where (name ='liguozhong' or name= 'fyl') and *** = 'b';

select cast(1.9 as int) from user;

select if(2<5,'one','two') from user;

select (case when *** = 'b' then 'box' when *** = 'g' then 'girl' else 'box girl' end) from user;

select get_json_object('','$.name') from user;

select parse_url('','HOST') from user;

select collect_list(name) from user;

select concat(name,***,'over') from user;

2 Aggregate functions (executed on the reduce side)

Syntax: count(*), count(expr), count(distinct expr[, expr...])

select count(*) from user;

select count(distinct t) from user;

select sum(money),count(1) from user;
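Aggregate functions are usually combined with group by, so that one result row is produced per group instead of one row for the whole table. A minimal sketch, assuming the user table has the name and money columns used in the queries above; the backquotes around the table name are a precaution, since user may be treated as a reserved word on newer Hive versions:

```sql
-- one row per name: number of rows and total money in that group
select name,
       count(*)   as row_cnt,
       sum(money) as total_money
from `user`
group by name;
```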

3 Collection functions (executed on the map side)

Syntax: a[n]

create table user as select array("tom","mary","tim") as t from student;

select t[0],t[1],t[2] from user;
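Besides indexing with a[n], Hive's built-in size and array_contains functions also work on array columns. A minimal sketch against the table created above, assuming it holds the array column t; the backquotes around the table name are a precaution for newer Hive versions:

```sql
-- size(t): number of elements in the array
-- array_contains(t, 'tom'): true if the array holds the value 'tom'
select size(t), array_contains(t, 'tom') from `user`;
```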

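4 Other functions, including window functions, analytic functions, mixed functions, and UDTFs

Window functions

Analytic functions

Mixed functions

UDTF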
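The categories above are listed without examples. As a rough illustration only: the first query below sketches a window/analytic function, assuming a user table with name and money columns, and the second sketches a UDTF (explode with lateral view), assuming the user table with the array column t from section 3. The aliases rn, name_total, x, and item are made up for this sketch.

```sql
-- window / analytic functions: rank rows within each name and
-- attach the per-name total without collapsing the rows
select name,
       money,
       row_number() over (partition by name order by money desc) as rn,
       sum(money)   over (partition by name)                     as name_total
from `user`;

-- UDTF: explode turns each array element into its own row
select item
from `user` lateral view explode(t) x as item;
```

Unlike a group by aggregate, the window query returns every input row; a UDTF can return more rows than it reads.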
