Hive: no data returned after dropping a partition and re-inserting its files

[root@hadoop001 hivedata]# hadoop fs -get /user/hive/warehouse/emp_dept_partition/deptno=30/000000_0 emp_dept_partition-deptno30
[root@hadoop001 hivedata]# ls
dept emp_dept_partition-deptno30 emp.txt

hive> alter table emp_dept_partition drop if exists partition(deptno=30);
Dropped the partition deptno=30
OK
Time taken: 0.652 seconds
hive> select * from emp_dept_partition where deptno=30;
OK
Time taken: 0.507 seconds

The directory for the deptno=30 partition no longer exists either.

[root@hadoop001 hivedata]# hadoop fs -ls /user/hive/warehouse/emp_dept_partition/
Found 2 items
drwxr-xr-x - root supergroup 0 2018-01-08 20:59 /user/hive/warehouse/emp_dept_partition/deptno=10
drwxr-xr-x - root supergroup 0 2018-01-08 20:59 /user/hive/warehouse/emp_dept_partition/deptno=20

[root@hadoop001 hivedata]# hadoop fs -mkdir /user/hive/warehouse/emp_dept_partition/deptno=30
[root@hadoop001 hivedata]# hadoop fs -put emp_dept_partition-deptno30 /user/hive/warehouse/emp_dept_partition/deptno=30
[root@hadoop001 hivedata]# hadoop fs -cat /user/hive/warehouse/emp_dept_partition/deptno=30/emp_dept_partition-deptno30
7499 allen salesman 7698 1981/2/20 1600.0 300.0
7521 ward salesman 7698 1981/2/22 1250.0 500.0
7654 martin salesman 7698 1981/9/28 1250.0 1400.0
7698 blake manager 7839 1981/5/1 2850.0 \N
7844 turner salesman 7698 1981/9/8 1500.0 0.0
7900 james clerk 7698 1981/12/3 950.0 \N

hive> select * from emp_dept_partition where deptno=30;
OK
Time taken: 0.116 seconds

Syntax: alter table table_name add [if not exists] partition partition_spec [location 'location'][, partition partition_spec [location 'location'], ...];

hive> alter table emp_dept_partition add if not exists partition (deptno=30);
OK
Time taken: 0.209 seconds
hive> select * from emp_dept_partition where deptno=30;
OK
7499 allen salesman 7698 1981/2/20 1600.0 300.0 30
7521 ward salesman 7698 1981/2/22 1250.0 500.0 30
7654 martin salesman 7698 1981/9/28 1250.0 1400.0 30
7698 blake manager 7839 1981/5/1 2850.0 NULL 30
7844 turner salesman 7698 1981/9/8 1500.0 0.0 30
7900 james clerk 7698 1981/12/3 950.0 NULL 30
Time taken: 0.168 seconds, Fetched: 6 row(s)
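When more than one partition directory has been restored by hand, the same statement has to be issued once per directory. The following is a minimal shell sketch of generating those statements from directory names. gen_add_partitions is a hypothetical helper (not part of Hive or Hadoop), and it assumes single-level partitions with numeric values, as with deptno here; string values would need quoting.

```shell
# Sketch only: gen_add_partitions is a hypothetical helper, not a Hive tool.
# Usage: gen_add_partitions TABLE DIR [DIR...]
# Assumes single-level partitions with numeric values (e.g. deptno=30).
gen_add_partitions() {
  local table="$1"
  shift
  local dir name key value
  for dir in "$@"; do
    name="${dir##*/}"     # last path component, e.g. deptno=30
    key="${name%%=*}"     # text before the first '='
    value="${name#*=}"    # text after the first '='
    # Skip entries that are not of the form key=value (e.g. stray files)
    if [ "$key" = "$name" ] || [ -z "$key" ] || [ -z "$value" ]; then
      continue
    fi
    printf 'alter table %s add if not exists partition (%s=%s);\n' \
      "$table" "$key" "$value"
  done
}

gen_add_partitions emp_dept_partition \
  /user/hive/warehouse/emp_dept_partition/deptno=20 \
  /user/hive/warehouse/emp_dept_partition/deptno=30
# prints:
# alter table emp_dept_partition add if not exists partition (deptno=20);
# alter table emp_dept_partition add if not exists partition (deptno=30);
```

The generated statements can be saved to a file and run with hive -f, or pasted into the hive prompt.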

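We usually add Hive partitions with alter table add partition. Sometimes, however, partition directories are copied into the table directory with hdfs put/cp commands, and when there are many directories, running one alter statement per directory is tedious. Hive provides a "recover partition" feature for this case.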

The syntax is:

msck repair table table_name;

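The principle is simple: when the command runs, Hive looks for partitions that exist as directories under the table's HDFS directory but have no metadata in the metastore, and adds the missing partition metadata to the metastore.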
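The comparison can be pictured as a set difference between two lists. The sketch below is a toy model only, not Hive's actual implementation: missing_partitions is a hypothetical helper, and the two input files stand in for the directory names on HDFS and the partition names recorded in the metastore.

```shell
# Toy model of the check behind msck repair table; not Hive's real code.
# $1: file with partition directory names found on HDFS, one per line
# $2: file with partition names known to the metastore, one per line
missing_partitions() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file:
  # print every HDFS entry that the metastore list does not contain.
  # grep exits 1 when nothing is missing; treat that as success.
  grep -Fxv -f "$2" "$1" || true
}

work="$(mktemp -d)"
printf '%s\n' deptno=10 deptno=20 deptno=30 > "$work/hdfs.txt"
printf '%s\n' deptno=10 deptno=20 > "$work/metastore.txt"
missing_partitions "$work/hdfs.txt" "$work/metastore.txt"
# prints: deptno=30
rm -r "$work"
```

Each name printed corresponds to a partition that would have to be registered in the metastore before queries can see its data.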

Test

# No partition metadata exists yet
hive> show partitions cr_cdma_bsi_mscktest;
OK
Time taken: 0.104 seconds
# Create two partition directories
hive> dfs -mkdir /user/hive/warehouse/cr_cdma_bsi_mscktest/month=201603;
hive> dfs -mkdir /user/hive/warehouse/cr_cdma_bsi_mscktest/month=201604;
# Repair the partitions with msck
hive> msck repair table cr_cdma_bsi_mscktest;
OK
Partitions not in metastore:	cr_cdma_bsi_mscktest:month=201603
Partitions not in metastore:	cr_cdma_bsi_mscktest:month=201604
Repair: Added partition to metastore cr_cdma_bsi_mscktest:month=201603
Repair: Added partition to metastore cr_cdma_bsi_mscktest:month=201604
Time taken: 0.286 seconds, Fetched: 2 row(s)
# Check again: the metadata has been updated
hive> show partitions cr_cdma_bsi_mscktest;
OK
month=201603
month=201604
Time taken: 0.102 seconds, Fetched: 1 row(s)

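When a partition directory of a partitioned table is created directly and the data files are uploaded into it, everything has been added by hand. MySQL, the database that stores the metadata here, therefore holds no record of that partition, and Hive returns no rows for it.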

The partition information has to be refreshed in Hive (put plainly, the partition has to be added to the database that stores the metadata).
