Hive Data Operations

Basic syntax

select [all | distinct] select_expr, select_expr, ... from tablename [where where_condition]

1. Running in the Hive command line

select * from lyz;

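2. Running from the Linux command line

hive -e "select * from lyz"

hive -S -e "select * from lyz"

hive -v -e "select * from lyz"

-S runs in silent mode and prints only the results; -v also echoes each statement as it runs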

3. Running HQL stored in a file

hive -f "/home/lyz.sql"

4. Running HQL from a shell script

#!/bin/bash

hive -e "select * from lyz"
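The file-based and script-based approaches above can be combined. The following is a minimal sketch, not a definitive setup: it writes the query to a temporary HQL file and runs it with hive -f only when a hive binary is on the PATH. The temporary file location is an assumption made for illustration; the table lyz comes from the examples above.

```shell
#!/bin/bash
# Sketch: write an HQL file, then run it with hive -f when Hive is installed.
set -eu

# Hypothetical scratch file; the text above uses /home/lyz.sql instead.
hql_file="$(mktemp)"

# The quoted heredoc keeps the statement exactly as written.
cat > "$hql_file" <<'EOF'
select * from lyz;
EOF

if command -v hive >/dev/null 2>&1; then
    # -S suppresses log output so only the query results reach stdout.
    hive -S -f "$hql_file"
else
    echo "hive not found; would run: hive -S -f $hql_file"
fi
```

Keeping the HQL in its own file avoids shell quoting problems once a query grows beyond one line.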

1. Configuration variables

set val='';
${hiveconf:val}

2. Environment variables

${env:HOME}; note: run env to list all environment variables
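As a rough illustration of configuration variables, the sketch below passes a value from the Linux shell with --hiveconf and references it in the query through the hiveconf namespace. The variable name_filter and the filter value are hypothetical, and the sketch assumes the table lyz has a name column, as in the later examples.

```shell
#!/bin/bash
# Sketch: pass a value into a query through a Hive configuration variable.
set -eu

# Hypothetical value to filter on.
name_filter='test'

# The backslash stops bash from expanding ${hiveconf:val} itself,
# so the reference reaches Hive unchanged and Hive substitutes it.
query="select * from lyz where name = '\${hiveconf:val}'"

if command -v hive >/dev/null 2>&1; then
    hive --hiveconf val="$name_filter" -e "$query"
else
    echo "hive not found; would run: hive --hiveconf val=$name_filter -e \"$query\""
fi
```

Without the backslash (or single quotes around the whole query), bash would treat the reference as its own parameter expansion, and Hive would never see it.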

1. Loading data into internal (managed) tables

1) Load while creating the table
> create table newtable as select col1, col2 from oldtable
2) Specify the data location when creating the table
> create table tablename() location ''
3) Load local data
> load data local inpath 'localpath' [overwrite] into table tablename
4) Load HDFS data
> load data inpath 'hdfspath' [overwrite] into table tablename
Note: this operation moves the data rather than copying it
5) Copy data to the table's location with hadoop commands (run from either the Hive shell or the Linux shell)
6) Load data from a query
insert [overwrite | into] table tablename
select col1, col2
from table
where ...
Example: insert overwrite table test_m
select name, address
from testtext
where name = 'test';
The from clause can also come first:
from table
insert [overwrite | into] table tablename
select col1, col2
where ...
Example: from testtext
insert overwrite table test_m
select name, address
where name = 'test';

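Notes

1) Columns are matched by position rather than by name, unlike in some relational databases

2) Running Linux shell commands from the Hive shell

> ! ls /home;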

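2. Loading data into external tables

1) Specify the data location when creating the table

create external table tablename() location ''

2) Insert from a query, same as for internal tables

3) Copy data to the table's location with hadoop commands (run from either the Hive shell or the Linux shell)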

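3. Loading data into partitioned tables

1) Internal and external partitioned tables

Internal partitioned tables are loaded in the same ways as internal tables

External partitioned tables are loaded in the same ways as external tables

Note: the directory hierarchy holding the data must match the table's partitions; if a partition has not been added to the table, its data cannot be queried even when files already exist under the target path

2) What is different

The load target is specified differently: a partition must be named

3) Load local data

load data local inpath 'localpath' [overwrite] into table tablename partition(pn = '')

4) Load HDFS data

load data inpath 'hdfspath' [overwrite] into table tablename partition(pn='')

5) Load data from a query

insert [overwrite | into] table tablename partition(pn='')
select col1, col2
from table
where ...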

Example:

-- create an internal partitioned table
create table test_p(
name string,
val string
)partitioned by (st string)
row format delimited fields terminated by '\t' lines terminated by '\n'
stored as textfile;
-- load local data
load data local inpath '/usr/local/src/data' into table test_p partition (st='20180602');
-- load HDFS data
load data inpath '/data/data' into table test_p partition(st='20180602');
-- load data from a query
insert into table test_p partition(st='20180602')
select name, address
from lyz
where name = 'lyz';
-- create an external partitioned table
create external table test_ep(
name string,
val string
)partitioned by (st string)
row format delimited fields terminated by '\t' lines terminated by '\n'
stored as textfile
location '/external/data';
-- from the Linux shell: copy the file into the partition directory
hadoop fs -mkdir /external/data/st=20180602
hadoop fs -copyFromLocal /usr/local/src/data /external/data/st=20180602
-- Note: after files are copied into the partition directory with hadoop commands,
-- the partition must be added with this statement before the data can be queried
alter table test_ep add partition(st='20180602');
show partitions test_ep;
select * from test_ep;
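The external-partition steps can be collected into one script. The sketch below is illustrative only: it stages a small tab-separated file in a local directory named after the partition, and it touches HDFS and Hive only when both commands are installed. The staging directory and the two sample rows are hypothetical; the table test_ep, the path /external/data and the partition value come from the example above.

```shell
#!/bin/bash
# Sketch: stage a file in a partition-shaped directory, then register the partition.
set -eu

# Hypothetical local staging directory standing in for /usr/local/src.
staging="$(mktemp -d)"
partition_dir="st=20180602"
mkdir -p "$staging/$partition_dir"

# Hypothetical rows: two tab-separated columns, matching
# "fields terminated by '\t'" in the test_ep definition.
printf 'lyz\tbeijing\ntest\tshanghai\n' > "$staging/$partition_dir/data"

if command -v hadoop >/dev/null 2>&1 && command -v hive >/dev/null 2>&1; then
    # The HDFS directory name must match the partition spec (st=20180602).
    hadoop fs -mkdir -p "/external/data/$partition_dir"
    hadoop fs -copyFromLocal "$staging/$partition_dir/data" "/external/data/$partition_dir"
    # Without this statement the copied files stay invisible to queries.
    hive -e "alter table test_ep add if not exists partition (st='20180602'); select * from test_ep;"
else
    echo "hadoop/hive not found; staged $staging/$partition_dir/data only"
fi
```

Using "if not exists" makes the script safe to rerun after the partition has already been added.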

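4. Pitfalls when loading data into Hive

1) Delimiters: by default a delimiter is a single character

Consider the following create-table statement:

create table test_p(
name string,
val string
)partitioned by (st string)
row format delimited fields terminated by '#\t' lines terminated by '\n'
stored as textfile;

Here Hive splits each row into columns on # alone; the \t is not treated as part of the delimiter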
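The effect of the single-character delimiter can be imitated outside Hive. The sketch below does not run Hive at all; it uses awk to split a hypothetical row on # only, which is the behavior described above, so that the leftover tab becomes visible in the second column.

```shell
#!/bin/bash
# Sketch: emulate splitting on '#' only for a row written with '#\t' as delimiter.
set -eu

# Hypothetical row whose columns were written with the two-character delimiter '#\t'.
row="$(printf 'lyz#\tbeijing')"

# Split on '#' alone, as the text describes for Hive.
first="$(printf '%s\n' "$row" | awk -F'#' '{print $1}')"
second="$(printf '%s\n' "$row" | awk -F'#' '{print $2}')"

# Brackets make the stray leading tab in the second column visible.
printf 'col1=[%s] col2=[%s]\n' "$first" "$second"
```

The second column keeps its leading tab, so a comparison such as val = 'beijing' would not match that row.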

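2) Data type mismatches

With load, a value that cannot be converted to the column type is returned as null by queries

With insert ... select, a value that cannot be converted to the column type is inserted as null

3) With insert ... select, the order of the selected values must match the column order of the target table; the names need not match

Hive does not validate data when it is loaded, only when it is queried

4) An external partitioned table needs its partitions added before the data is visible (important)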
