Advanced Hive: partitioned tables and the three complex data types

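One of the most common ideas in big data is divide and conquer: a large file is split into many small files, so that each operation only has to touch one small file and becomes much easier. Hive supports the same idea. A large data set can be split by day or by hour into small pieces, each stored as its own partition, and working on one of those pieces is far easier than working on the whole table.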

Syntax for creating a partitioned table:

create table order_partition(
order_no string,
event_time string
)partitioned by(event_month string)
row format delimited fields terminated by '\t';

Load data into a partition:

load data local inpath '/home/hadoop/data/order.txt' overwrite into table order_partition
partition  (event_month='2014-05');

Alter the table to add a partition manually:

alter table order_partition add if not exists partition (event_month='2014-06') ;
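
To check that the partitions were registered and to query just one of them, something like the following can be used. This is a minimal sketch that assumes the order_partition table and the two partitions created above. Filtering on the partition column lets Hive read only the data belonging to that partition.

```sql
-- List the partitions currently registered for the table.
show partitions order_partition;

-- Filter on the partition column so only the matching partition is read.
select order_no, event_time
from order_partition
where event_month = '2014-05';
```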

Table with multiple partition columns:

create table order_mulit_partition(
order_no string,
event_time string
)partitioned by(event_month string, step string)
row format delimited fields terminated by '\t';
load data local inpath '/home/hadoop/data/order.txt' overwrite into table order_mulit_partition
partition  (event_month='2014-05', step='1');

Dynamic partitions:

create table `emp_dynamic_partition`(
`empno` int,
`ename` string,
`job` string,
`mgr` int,
`hiredate` string,
`sal` double,
`comm` double
)
partitioned by(deptno int)
row format delimited fields terminated by '\t';

-- When inserting, the dynamic partition column must come last in the select list
insert into table emp_dynamic_partition partition (deptno)
select empno,ename,job,mgr,hiredate,sal,comm,deptno from emp;
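
A dynamic-partition insert like this one generally depends on two session settings. The sketch below assumes a configuration in which strict mode is still active. In strict mode Hive requires at least one static partition column, so an insert where every partition column is dynamic (as with deptno here) is rejected until the mode is switched to nonstrict. Run these before the insert statement:

```sql
-- Enable dynamic partitioning (already the default in many Hive versions).
set hive.exec.dynamic.partition=true;
-- Allow every partition column to be resolved dynamically.
set hive.exec.dynamic.partition.mode=nonstrict;
```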

array type:

create table hive_array(
name string,
work_locations array<string>
)
row format delimited fields terminated by '\t'
collection items terminated by ',';

load data local inpath '/home/hadoop/data/hive_array.txt'
overwrite into table hive_array;

-- first element of the array
select name,work_locations[0] from hive_array;
-- number of elements in the array
select name,size(work_locations) from hive_array;
-- rows whose array contains tianjin
select * from hive_array where array_contains(work_locations,'tianjin');
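
Beyond indexing and size checks, an array column can also be flattened into one row per element. The following is a small sketch, not part of the original walkthrough, that assumes the hive_array table above; the aliases t and location are illustrative names chosen here.

```sql
-- Produce one output row per (name, work location) pair.
select name, location
from hive_array
lateral view explode(work_locations) t as location;
```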

map type:

Sample value of the members column (entries separated by '#', key and value separated by ':'):

father:xiaoming#mother:xiaohuang#brother:xiaoxu

create table hive_map(
id int,
name string,
members map<string,string>,
age int
)
row format delimited fields terminated by ','
collection items terminated by '#'
map keys terminated by ':';

load data local inpath '/home/hadoop/data/hive_map.txt'
overwrite into table hive_map;

-- read the value stored under the key father
select id,name,age,members['father'] from hive_map;
-- return all keys of the map
select map_keys(members) from hive_map;

struct type:

-- access struct fields with dot notation
select userinfo.name,userinfo.age from hive_struct;
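
The definition of hive_struct does not appear in this text, so the following is only a hypothetical sketch of a table that the query above could run against. Only the table name and the userinfo.name and userinfo.age fields come from the query; the ip column, the field types, and both delimiters are assumptions made for illustration.

```sql
-- Hypothetical definition: a data line might look like 192.168.1.1#zhangsan:40
create table hive_struct(
ip string,                              -- assumed extra column
userinfo struct<name:string,age:int>    -- field types are assumed
)
row format delimited fields terminated by '#'
collection items terminated by ':';
```

With a struct, collection items terminated by sets the separator between the struct's fields, which is why name and age are split on ':' in this sketch.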