Hive Partitioning and Bucketing

2021-10-11 18:11:54

2. Static partitions

II. Hive bucketing

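1. The values of the partition column divide the table into separate directories, one per value.

2. In queries, a partition column is used with the same syntax as a regular column.

3. When a query filters on a partition column, Hive reads data only from the specified partitions, which improves query efficiency.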

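-- Create a single-level partitioned table
-- (the element types of the complex columns are assumed; adjust them to match employee.txt)
create table if not exists employee_partition(
    name string,
    work_place array<string>,
    sex_age struct<sex:string,age:int>,
    skills_score map<string,int>,
    depart_title map<string,array<string>>
)
partitioned by (month string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
stored as textfile;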

alter table employee_partition add partition (month='202012');

-- Load data into the partitioned table
load data local inpath '/root/employee.txt'
into table employee_partition partition (month='202012');
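Points 1 to 3 above can be checked against the table that was just loaded. The following is a minimal sketch, assuming the `employee_partition` table and its `month='202012'` partition exist as created above; the selected columns are only illustrative.

```sql
-- Filter on the partition column exactly like a regular column.
-- Hive then reads only the directory of the month=202012 partition.
SELECT name, work_place
FROM employee_partition
WHERE `month` = '202012';

-- EXPLAIN prints the query plan without running the query,
-- which is one way to inspect what Hive intends to scan.
EXPLAIN
SELECT count(*)
FROM employee_partition
WHERE `month` = '202012';
```

A query without a filter on `month` still works, but it has to scan every partition of the table.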

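-- Multi-level partitions
-- (the element types of the complex columns are assumed; adjust them to match employee.txt)
-- `date` is a reserved keyword in Hive 1.2 and later, so it is backtick-quoted
create table if not exists employee_partition2(
    name string,
    work_place array<string>,
    sex_age struct<sex:string,age:int>,
    skills_score map<string,int>,
    depart_title map<string,array<string>>
)
partitioned by (month string, `date` string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
stored as textfile;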

alter table employee_partition2 add
    partition (month="202011", `date`="01")
    partition (month="202011", `date`="02")
    partition (month="202012", `date`="01")
    partition (month="202012", `date`="02");

-- Load data into the partitioned table
load data local inpath '/root/employee.txt'
into table employee_partition2 partition (month='202012', `date`='01');

-- List the partitions of the table

show partitions employee_partition2;
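With two partition levels, both the listing and the queries can be narrowed to one branch of the partition tree. This is a short sketch, assuming the `employee_partition2` table and the partitions added above.

```sql
-- List only the partitions under month=202012
SHOW PARTITIONS employee_partition2 PARTITION (`month` = '202012');

-- Filter on both partition levels; only the directory for
-- month=202012/date=01 has to be read.
SELECT name
FROM employee_partition2
WHERE `month` = '202012'
  AND `date` = '01';
```

Filtering on only the first level (`month`) is also possible; Hive then reads every `date` partition beneath that month.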

Dynamic partitioning requires the following properties to be set:

set hive.exec.dynamic.partition=true;

set hive.exec.dynamic.partition.mode=nonstrict;

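The CREATE TABLE statement for a dynamically partitioned table is the same as for a statically partitioned one.

Inserting data with dynamic partitions: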

--set hive.exec.dynamic.partition=true;

--set hive.exec.dynamic.partition.mode=nonstrict;

-- Source table holding the raw rows
create table if not exists employee_hr(
    name string,
    id int,
    num string,
    time2 string
)
row format delimited
fields terminated by '|';

load data local inpath '/root/employee_hr.txt'
into table employee_hr;

-- Example row of employee_hr.txt:
--matias mcgrirl|1|945-639-8596|2011-11-24

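-- Target table, partitioned by month and date
-- `date` is a reserved keyword in Hive 1.2 and later, so it is backtick-quoted
create table if not exists employee_hr_partition(
    name string,
    id int,
    num string,
    time2 string
)
partitioned by (month string, `date` string)
row format delimited
fields terminated by '|';

-- The partition columns carry no values here; Hive takes them from the
-- last columns of the select list, in the order given in the partition clause
insert into table employee_hr_partition partition (month, `date`)
select name, id, num, time2,
       month(time2) as month,
       date(time2) as `date`
from employee_hr;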
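After a dynamic-partition insert it is worth checking which partitions Hive actually created. The sketch below assumes the `employee_hr_partition` table from above. The two limit properties are standard Hive settings, and the values shown are their usual defaults, included only as an illustration.

```sql
-- Upper bounds on the number of partitions one statement may create
-- (1000 in total and 100 per node are the usual defaults)
SET hive.exec.max.dynamic.partitions = 1000;
SET hive.exec.max.dynamic.partitions.pernode = 100;

-- List the partitions created by the dynamic insert
SHOW PARTITIONS employee_hr_partition;

-- Count the rows that landed in each partition
SELECT `month`, `date`, count(*) AS row_count
FROM employee_hr_partition
GROUP BY `month`, `date`;
```

If the source data contains more distinct partition values than these limits allow, the insert fails, so the limits may need to be raised for high-cardinality columns.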

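1. A bucket corresponds to a file in HDFS (whereas a partition corresponds to a directory).

Bucketing makes query processing more efficient.

It makes sampling more efficient.

Rows are usually assigned to buckets by applying a hash function to the bucketing column.

2. Bucketing is always dynamic; there is no static form. Enable it with: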

set hive.enforce.bucketing = true;

3. Define the buckets in the CREATE TABLE statement:

clustered by (employee_id) into 2 buckets
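The clause above can be placed in a complete statement as follows. This is a sketch under stated assumptions: the table `employee_bucket` and its `name` column are hypothetical, and the rows are taken from the `employee_hr` table created earlier. The `hive.enforce.bucketing` property applies to Hive 1.x; it was removed in Hive 2.0, where bucketing is always enforced.

```sql
-- Hypothetical bucketed table: rows are split into 2 files by employee_id
CREATE TABLE IF NOT EXISTS employee_bucket (
    employee_id INT,
    name        STRING
)
CLUSTERED BY (employee_id) INTO 2 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|';

-- Needed on Hive 1.x only
SET hive.enforce.bucketing = true;

-- Populate with INSERT ... SELECT so that Hive hashes each row into a bucket
INSERT OVERWRITE TABLE employee_bucket
SELECT id, name
FROM employee_hr;

-- Sampling: read only the first of the two buckets
SELECT *
FROM employee_bucket TABLESAMPLE (BUCKET 1 OUT OF 2 ON employee_id);

-- Illustration of the idea "hash of the bucketing column modulo bucket count";
-- the hash Hive uses internally can differ between versions
SELECT employee_id,
       pmod(hash(employee_id), 2) AS bucket_index
FROM employee_bucket;
```

`LOAD DATA` only moves files into place and does not hash rows, which is why the table is filled through `INSERT ... SELECT` here.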
