Several ways to invoke the Hive client

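The hive command refers to $HIVE_HOME/bin/hive, a shell script used to run interactive queries and batch jobs. You can type hive on its own to enter interactive mode, use hive -e to run a short statement, or use hive -f to run a SQL script file. The usage given in the official documentation is as follows: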

To get help, run "hive -H" or "hive --help".

Usage (as it is in Hive 0.9.0):

usage: hive
 -d,--define <key=value>          Variable substitution to apply to Hive
                                  commands. e.g. -d A=B or --define A=B
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
 -h <hostname>                    Connecting to Hive Server on remote host
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to Hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -p <port>                        Connecting to Hive Server on port number
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

Version information

As of Hive 0.10.0 there is one additional command line option:

    --database <dbname>           Specify the database to use

Note: the variant "-hiveconf" is supported as well as "--hiveconf".

See Variable Substitution for examples of using the hiveconf option.

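The three usages are described in detail below:

Method 1: hive -f /root/shell/hive-script.sql (suited to a file made up of several SQL statements)

hive-script.sql works like any other script: just write the query statements in it.

For example: [root@cloud4 shell]# vi hive_script3.sql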

select * from t1;

select count(*) from t1;

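This runs a Hive script without entering interactive mode.

It can be combined with silent mode (-S) when Hive is called from a third-party program, which reads the result set from Hive's standard output.

$HIVE_HOME/bin/hive -S -f /home/my/hive-script.sql (the MapReduce progress is not displayed)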
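A calling program only needs to read standard output to get the rows. The following is a minimal sketch of that pattern, not a definitive implementation. The function names run_hive_script and count_rows and the HIVE_BIN override are hypothetical, introduced here for illustration; only the -S and -f options come from the usage text above.

```shell
#!/bin/bash
# HIVE_BIN is a hypothetical override so the functions can be pointed at a
# specific binary; by default the hive found on PATH is used.
HIVE_BIN="${HIVE_BIN:-hive}"

# run_hive_script SCRIPT
# Runs SCRIPT in silent mode; only the result rows reach standard output.
run_hive_script() {
    local script="$1"
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi
    "$HIVE_BIN" -S -f "$script"
}

# count_rows SCRIPT
# Counts the lines of the result set produced by SCRIPT.
count_rows() {
    run_hive_script "$1" | wc -l | tr -d ' '
}
```

A caller could then write rows="$(run_hive_script /home/my/hive-script.sql)" and parse the tab-separated lines itself.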

So the question is: how do you pass parameters?

A demo follows.

Contents of start_hql.sh:

#!/bin/bash
# -S (silent mode) suppresses the MapReduce log output
hive \
  --hivevar id=1 \
  --hivevar col2=2 \
  -S -f test.sql

Contents of test.sql:

-- database
use tmp;

-- table name
select *
from tmp_jzl_20140725_test11
where
id='${hivevar:id}' and col2='${hivevar:col2}';
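When the list of parameters varies from call to call, the --hivevar options can be built in a loop. This is a hedged sketch under the assumption that every parameter arrives as a key=value string; run_with_vars and HIVE_BIN are hypothetical names, not part of Hive. Inside the script, Hive substitutes each value where ${hivevar:key} appears.

```shell
#!/bin/bash
# HIVE_BIN is a hypothetical override; by default the hive on PATH is used.
HIVE_BIN="${HIVE_BIN:-hive}"

# run_with_vars SCRIPT KEY=VALUE...
# Passes each KEY=VALUE pair to Hive as a --hivevar option, then runs SCRIPT
# in silent mode.
run_with_vars() {
    local script="$1"
    shift
    local args=()
    local kv
    for kv in "$@"; do
        case "$kv" in
            *=*) args+=(--hivevar "$kv") ;;
            *)
                echo "expected key=value, got: $kv" >&2
                return 1
                ;;
        esac
    done
    "$HIVE_BIN" "${args[@]}" -S -f "$script"
}
```

For the demo above, the call would be run_with_vars test.sql id=1 col2=2. Using an array keeps values that contain spaces intact as single arguments.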

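Method 2: hive -e 'SQL statement' (suited to short statements)

This runs the SQL statement directly.

For example: [root@cloud4 shell]# hive -e 'select * from t1'

Silent mode:

[root@cloud4 shell]# hive -S -e 'select * from t1' (works the same way as silent mode in method 1; the MapReduce progress is not displayed)

Another useful feature is that this form can export data to a local Linux directory.

For example: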

[root@cloud4 shell]# hive -e 'select * from t1' > test.txt
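A plain redirect leaves a truncated or empty file behind if the query fails. One way to avoid that, sketched here under the assumption that hive exits with a non-zero status on failure, is to write to a temporary file first. export_query and HIVE_BIN are hypothetical names used only for this illustration.

```shell
#!/bin/bash
# HIVE_BIN is a hypothetical override; by default the hive on PATH is used.
HIVE_BIN="${HIVE_BIN:-hive}"

# export_query QUERY OUTFILE
# Runs QUERY in silent mode and writes the rows to OUTFILE. OUTFILE is only
# created or replaced when the query succeeds.
export_query() {
    local query="$1"
    local out="$2"
    local tmp
    tmp="$(mktemp)" || return 1
    if "$HIVE_BIN" -S -e "$query" > "$tmp"; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

The equivalent of the command above would be export_query 'select * from t1' test.txt.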

This is similar to exporting analysis results from Pig; both are convenient.

Method 3: hive (use Hive's interactive mode directly)


Here are some interesting usages:

1. SQL syntax

# hive   start Hive

hive> quit;   exit Hive

hive> show databases;   list the databases

hive> create database test;   create a database

hive> use default;   choose which database to use

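TEXTFILE

This is the default format. Data is not compressed, so both disk overhead and parsing overhead are high.

It can be combined with gzip or bzip2 (the system detects the compression automatically and decompresses when a query runs), but when used this way Hive does not split the data, so the data cannot be processed in parallel.

Example: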


> create table test1(str string)

> stored as textfile;

time taken: 0.786 seconds

# Write a script that generates a file of random strings, then load the file:

> load data local inpath '/home/work/data/test.txt' into table test1;

copying data from file:/home/work/data/test.txt

copying file: file:/home/work/data/test.txt

loading data to table default.test1

time taken: 0.243 seconds
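The original does not show the script that generates the random-string file. One possible version is sketched below; the function name gen_random_strings and the choice of alphanumeric characters are assumptions made for illustration.

```shell
#!/bin/bash
# gen_random_strings COUNT LENGTH
# Prints COUNT lines, each made of LENGTH random alphanumeric characters.
gen_random_strings() {
    local count="$1"
    local length="$2"
    # tr keeps only alphanumerics from the random byte stream, fold cuts the
    # stream into fixed-width lines, and head stops after COUNT lines.
    LC_ALL=C tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w "$length" | head -n "$count"
}
```

Running gen_random_strings 100000 32 > /home/work/data/test.txt would produce a file suitable for the load data statement above; the row count and width here are arbitrary.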

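SequenceFile is a binary file format provided by the Hadoop API. It is easy to use, splittable, and compressible.

SequenceFile supports three compression options: NONE, RECORD, and BLOCK. RECORD has a low compression ratio, so BLOCK compression is generally recommended.

Example: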


> create table test2(str string)

> stored as sequencefile;

time taken: 5.526 seconds

hive> set hive.exec.compress.output=true;

hive> set io.seqfile.compression.type=BLOCK;

hive> insert overwrite table test2 select * from test1;

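RCFile is a storage format that combines row and column storage. First, it partitions the data into row groups, which guarantees that a single record sits within one block and avoids reading several blocks to fetch one record. Second, the data inside each block is stored by column, which helps compression and gives fast column access.

Example: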

> create table test3(str string)

> stored as rcfile;

time taken: 0.184 seconds

> insert overwrite table test3 select * from test1;

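Summary: compared with TEXTFILE and SEQUENCEFILE, RCFile costs more at load time because of its columnar layout, but it gives a better compression ratio and better query response. A data warehouse is written once and read many times, so overall RCFile has a clear advantage over the other two formats.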
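The three examples follow one pattern: create a table in the chosen format, then fill it from test1. As a rough sketch, a shell function can print those statements for a given format so they can be handed to hive -e. The function name copy_table_sql and the target table names test_<format> are hypothetical; the settings for SEQUENCEFILE are the ones used in the example above.

```shell
#!/bin/bash
# copy_table_sql FORMAT
# Prints HiveQL that creates table test_<format> stored as FORMAT and fills
# it from test1. FORMAT is textfile, sequencefile, or rcfile (any case).
copy_table_sql() {
    local format
    format="$(printf '%s' "$1" | tr 'A-Z' 'a-z')"
    case "$format" in
        textfile|sequencefile|rcfile) ;;
        *)
            echo "unsupported format: $1" >&2
            return 1
            ;;
    esac
    local table="test_${format}"
    echo "create table ${table}(str string) stored as ${format};"
    if [ "$format" = "sequencefile" ]; then
        # Enable block compression for the SequenceFile output.
        echo "set hive.exec.compress.output=true;"
        echo "set io.seqfile.compression.type=BLOCK;"
    fi
    echo "insert overwrite table ${table} select * from test1;"
}
```

A possible call is hive -S -e "$(copy_table_sql rcfile)", after which the on-disk sizes of the tables can be compared.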

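hive> show tables;   list all tables in the current database

hive> show tables '*t*';   // wildcard matching is supported

hive> show partitions t1;   // list the partitions of a table

hive> drop table t1;   drop a table

In the Hive versions covered here, the data in a table cannot be modified, but the table structure can be altered without affecting the data.

Loading with local is noticeably slower than loading without it:

hive> load data inpath '/root/inner_table.dat' into table t1;   moves data already in HDFS into table t1

hive> load data local inpath '/root/inner_table.dat' into table t1;   uploads local data to HDFS

hive> !ls;   list the files in the current Linux directory

hive> dfs -ls /;   list the files under '/' in HDFS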
