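2. Static partitions
1. The values of the partition column split the table into separate directories.
2. In queries, the partition column is used with the same syntax as a regular column.
3. When a query specifies a partition, Hive reads data only from that partition, which improves query efficiency.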
-- Create a single-level partitioned table
create table if not exists employee_partition(
  name string,
  work_place array<string>,
  sex_age struct<sex:string,age:int>,
  skills_score map<string,int>,
  depart_title map<string,array<string>>
)
partitioned by (month string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
stored as textfile;
-- Add a partition to the table
alter table employee_partition add partition (month='202012');
-- Load data into the partitioned table
load data local inpath '/root/employee.txt'
into table employee_partition partition (month='202012');
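The three points about querying a partitioned table can be sketched as follows. This is a minimal example, assuming the employee_partition table and the month='202012' partition created above; the selected columns are chosen only for illustration.

```sql
-- The partition column is filtered like an ordinary column.
-- Hive reads only the month=202012 directory instead of the whole table.
select name, work_place, skills_score
from employee_partition
where month = '202012';

-- The partition column is returned as the last column of the result.
select * from employee_partition where month = '202012' limit 5;

-- Show the metadata of one partition, including its directory location.
describe formatted employee_partition partition (month = '202012');
```

A query without a filter on month still works, but it has to scan every partition directory of the table.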
-- Multi-level partitions
-- date is a reserved keyword in Hive 1.2.0 and later, so it is quoted with backticks
create table if not exists employee_partition2(
  name string,
  work_place array<string>,
  sex_age struct<sex:string,age:int>,
  skills_score map<string,int>,
  depart_title map<string,array<string>>
)
partitioned by (month string, `date` string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
stored as textfile;
-- Add several partitions in one statement
alter table employee_partition2 add
partition (month='202011', `date`='01')
partition (month='202011', `date`='02')
partition (month='202012', `date`='01')
partition (month='202012', `date`='02');
-- Load data into the partitioned table
load data local inpath '/root/employee.txt'
into table employee_partition2 partition (month='202012', `date`='01');
-- Show the partitions of the table
show partitions employee_partition2;
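With multi-level partitions, each level becomes a nested directory under the table directory (for example month=202012/date=01). A short sketch of how the two levels might be used in queries, assuming the employee_partition2 table and the partitions added above:

```sql
-- List only the partitions below one month by giving a partial partition spec.
show partitions employee_partition2 partition (month = '202012');

-- Filter on both partition levels so that a single directory is read.
-- date is quoted with backticks because it is a reserved keyword in Hive 1.2.0 and later.
select name, work_place
from employee_partition2
where month = '202012' and `date` = '01';
```

Filtering on month alone is also valid; Hive then reads every date directory under that month.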
Dynamic partitioning requires the following properties to be set:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
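The CREATE TABLE statement for a dynamically partitioned table is the same as for a statically partitioned one.
Inserting data with dynamic partitions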
--set hive.exec.dynamic.partition=true;
--set hive.exec.dynamic.partition.mode=nonstrict;
-- Source table that holds the raw data
create table if not exists employee_hr(
  name string,
  id int,
  num string,
  time2 string
)
row format delimited
fields terminated by '|';
load data local inpath '/root/employee_hr.txt'
into table employee_hr;
-- Sample row: matias mcgrirl|1|945-639-8596|2011-11-24
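-- Target table, partitioned by month and date
-- date is a reserved keyword in Hive 1.2.0 and later, so it is quoted with backticks
create table if not exists employee_hr_partition(
  name string,
  id int,
  num string,
  time2 string
)
partitioned by (month string, `date` string)
row format delimited
fields terminated by '|';
-- The partition values come from the last two columns of the select
insert into table employee_hr_partition partition (month, `date`)
select name, id, num, time2,
  month(time2) as month,
  date(time2) as `date`
from employee_hr;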
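In a dynamic-partition insert, Hive takes the partition values from the trailing columns of the select, matched by position rather than by alias. The sketch below shows one way to check the result, assuming the insert above has completed. The two limit properties are real Hive settings, but the values shown are illustrative and are only needed if the insert creates more partitions than the configured limits allow.

```sql
-- Optional, set before the insert: raise the limits on dynamically created partitions.
-- The values here are illustrative assumptions, not recommendations.
set hive.exec.max.dynamic.partitions=2000;
set hive.exec.max.dynamic.partitions.pernode=500;

-- List the partitions that the insert created.
show partitions employee_hr_partition;

-- Count the rows that landed in each partition.
select month, `date`, count(*) as row_count
from employee_hr_partition
group by month, `date`;
```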
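II. Hive bucketing
1. Buckets correspond to files in HDFS (partitions correspond to directories). Bucketing provides:
more efficient query processing
more efficient sampling
Rows are generally assigned to buckets by a hash function applied to the "bucket column".
2. Bucketing is always dynamic: rows are assigned to buckets when they are inserted.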
set hive.enforce.bucketing = true;
3. Defining buckets
clustered by (employee_id) into 2 buckets
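The CLUSTERED BY clause belongs inside a CREATE TABLE statement. The following is a minimal sketch of a complete bucketed table, assuming a hypothetical table name employee_id_buckets and reusing the employee_hr table from the dynamic partition example as the data source. The hive.enforce.bucketing property applies to Hive versions before 2.0; it was removed in Hive 2.0, where bucketing is always enforced.

```sql
-- Needed before Hive 2.0 so that inserts honor the bucket definition.
set hive.enforce.bucketing = true;

-- Hypothetical bucketed table; only the CLUSTERED BY clause comes from the text above.
create table if not exists employee_id_buckets(
  name string,
  employee_id int
)
clustered by (employee_id) into 2 buckets
row format delimited
fields terminated by '|';

-- Buckets are filled by an insert from a query; each row is assigned
-- to one of the 2 bucket files by hashing employee_id.
insert overwrite table employee_id_buckets
select name, id from employee_hr;

-- Sampling: read only the first of the 2 buckets.
select * from employee_id_buckets
tablesample(bucket 1 out of 2 on employee_id);
```

Loading a file with load data would copy it as-is without splitting it into buckets, which is why the sketch populates the table with an insert.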