Hive Data Operations

vi tb_hive.txt

12 34 56

7 12 13

41 2 31

17 21 3

71 2 31

1 12 34

11 2 34

[root@namenode-82 hive_w]# hive shell

Create the table

hive>create table tb_hive (a int, b int, c int) row format delimited fields terminated by '\t';

Load the file

hive>load data local inpath '/work/wangliqin/hive_w/tb_hive.txt' overwrite into table tb_hive ;

Copying data from file:/work/wangliqin/hive_w/tb_hive.txt

Copying file: file:/work/wangliqin/hive_w/tb_hive.txt

Loading data to table default.tb_hive

Deleted hdfs://namenode-82:54310/user/hive/warehouse/tb_hive

OK

Time taken: 0.511 seconds

List tables

hive> show tables;

OK

t_hive

Time taken: 0.142 seconds

Drop a table

hive> drop table t_hive;

OK

Time taken: 2.356 seconds

[root@namenode-82 hive_w]# hadoop fs -cat /user/hive/warehouse/tb_hive/tb_hive.txt

16 2 3

61 12 13

41 2 31

17 21 3

71 2 31

1 12 34

11 2 34

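1. The Hive release used here does not support INSERT INTO statements, the date and datetime types, truncate table t_hive (emptying a table), delete from t_hive (deleting rows), or in (subquery). Later Hive releases added support for several of these.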

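2. To empty a table, run hive> dfs -rmr /user/hive/warehouse/<table name> to delete the data stored under that table; the table's metadata is kept. Alternatively, create table <new table> like <existing table> creates an empty table with the same schema.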
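A minimal sketch of both approaches from item 2, using the tb_hive table created above. The name tb_hive_empty is hypothetical, and the path assumes the default warehouse directory shown in the load output.

```sql
-- Approach 1: delete the data but keep the table definition.
-- Run inside the Hive CLI. On newer Hadoop releases, -rm -r is the
-- equivalent spelling of -rmr.
dfs -rmr /user/hive/warehouse/tb_hive;

-- The metadata is untouched, so the schema can still be inspected.
describe tb_hive;

-- Approach 2: create a new, empty table with the same schema.
-- tb_hive_empty is a hypothetical name chosen for this example.
create table tb_hive_empty like tb_hive;
```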

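3. To query across tables, use an inner join or a semi join: from table1 left semi join table2 on (table1.column = table2.column). table2 may appear only in the ON clause; it cannot be referenced in the SELECT list.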
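As an illustration of item 3, the query below expresses an IN-subquery filter as a left semi join. Only tb_hive comes from the session above; tb_filter is a hypothetical second table assumed to have an int column a.

```sql
-- The filter we want, written in the IN-subquery form that item 1
-- lists as unsupported:
--   select a, b, c from tb_hive where a in (select a from tb_filter);

-- The same filter as a left semi join.
-- tb_filter (hypothetical) is referenced only in the ON clause.
select t.a, t.b, t.c
from tb_hive t
left semi join tb_filter f on (t.a = f.a);

-- Not allowed: the right-hand table cannot appear in the select list.
--   select t.a, f.a from tb_hive t left semi join tb_filter f on (t.a = f.a);
```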

4. When a global ordering is not required, write sorts as distribute by table.column sort by table.column asc | desc and avoid order by where possible, because order by funnels the entire sort through a single reducer.
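The two forms from item 4 can be compared side by side on the tb_hive table from above. This is only a sketch; the choice of b as the distribution column is arbitrary.

```sql
-- Rows with the same value of b are sent to the same reducer, and each
-- reducer sorts its own rows by b ascending, then a descending.
-- The output is ordered within each reducer, not across reducers.
select a, b, c
from tb_hive
distribute by b
sort by b asc, a desc;

-- Global ordering: every row passes through a single reducer.
select a, b, c
from tb_hive
order by b asc;
```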

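5. Hive tables are either external tables or internal tables.

When data is loaded into an internal table, Hive places it under the path that the data warehouse points to. For an external table, Hive only records the path where the data lives and does not change the data's location in any way.

When a table is dropped, an internal table's metadata and data are deleted together, whereas dropping an external table deletes only the metadata and leaves the data in place.

External tables are therefore somewhat safer, allow the data to be organized more flexibly, and make it easier to share source data.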
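A short sketch of the difference on drop, assuming the same three-column layout as tb_hive. The table name tb_hive_ext and the HDFS path /data/tb_hive_ext are hypothetical.

```sql
-- External table: Hive records the location and leaves the files where they are.
-- tb_hive_ext and /data/tb_hive_ext are hypothetical names for this example.
create external table tb_hive_ext (a int, b int, c int)
row format delimited fields terminated by '\t'
location '/data/tb_hive_ext';

-- Dropping the external table removes only its metadata;
-- the files under /data/tb_hive_ext remain in HDFS.
drop table tb_hive_ext;

-- Dropping an internal table removes the metadata and the data
-- under /user/hive/warehouse/tb_hive.
drop table tb_hive;
```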
