Simple Sqoop usage examples

4.1 Importing data

In Sqoop, an "import" transfers data from a non-big-data system (an RDBMS) into the big-data cluster (HDFS, Hive, HBase). Imports use the import keyword.

4.1.1 RDBMS to HDFS

1) Make sure the MySQL service is running

2) Create a table in MySQL and insert some rows

$ mysql -uroot -p000000

mysql> create database company;

mysql> create table company.staff(id int(4) primary key not null auto_increment, name varchar(255), sex varchar(255));

mysql> insert into company.staff(name, sex) values('thomas', 'male');

mysql> insert into company.staff(name, sex) values('catalina', 'female');

3) Import the data

(1) Full import

$ bin/sqoop import \
--connect jdbc:mysql://hadoop104:3306/company \
--username root \
--password 000000 \
--table staff \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"

(2) Import with a query

$ bin/sqoop import \
--connect jdbc:mysql://hadoop104:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query 'select name,sex from staff where id <=3 and $CONDITIONS;'

Note: the WHERE clause of the query must contain $CONDITIONS, written in upper case.

Note: if the query is wrapped in double quotes, $CONDITIONS must be preceded by an escape character (\$CONDITIONS) so that the shell does not treat it as one of its own variables.
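The quoting rule in the note above can be checked in the shell alone, without running Sqoop. The following is a minimal sketch: the shell variable CONDITIONS is set here only to make the substitution visible, and the column name sex is assumed from the sample table.

```shell
# A shell variable that happens to share the name, set only for this demonstration
CONDITIONS='<expanded by the shell>'

# Single quotes: the shell leaves $CONDITIONS untouched
single='select name,sex from staff where id <= 3 and $CONDITIONS'

# Double quotes with a backslash: the shell also leaves it untouched
escaped="select name,sex from staff where id <= 3 and \$CONDITIONS"

# Double quotes without a backslash: the shell substitutes its own variable
unescaped="select name,sex from staff where id <= 3 and $CONDITIONS"

printf '%s\n' "$single" "$escaped" "$unescaped"
```

The first two lines printed still end in the literal $CONDITIONS placeholder that Sqoop expects; the third no longer contains it.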

(3) Import specific columns

$ bin/sqoop import \
--connect jdbc:mysql://hadoop104:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--columns id,sex \
--table staff

Note: when --columns lists several columns, separate them with commas and do not add spaces around the commas.
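One reason to avoid spaces in the column list is that the shell splits unquoted arguments on whitespace before Sqoop sees them. The sketch below illustrates this with a hypothetical helper, count_args, which only reports how many arguments it received; the column name sex is assumed from the sample table.

```shell
# Hypothetical helper: prints how many arguments the shell passes along
count_args() { echo "$#"; }

count_args --columns id,sex      # prints 2: the flag and one value
count_args --columns id, sex     # prints 3: the space splits the value in two
```

In the second call the column list arrives as "id," followed by a stray "sex" argument, which is not what the command intends.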

(4) Filter the imported rows with the Sqoop --where option

$ bin/sqoop import \
--connect jdbc:mysql://hadoop104:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--table staff \
--where "id=2"

Note: job properties can be passed to Sqoop in the form sqoop import -D property.name=property.value. To set several properties, repeat -D for each one, separated by spaces.
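As a sketch of the -D form described in the note, the command below adds two job properties to the --where import. Generic arguments such as -D go directly after the tool name and before tool-specific arguments such as --connect. The property values (the job name staff_import and the 1024 MB map memory) are illustrative assumptions, and the SQOOP variable and import_with_props function are hypothetical wrappers so that the command is only printed by default.

```shell
# Dry run by default: SQOOP expands to "echo bin/sqoop", so the command is printed.
# Set SQOOP=bin/sqoop before running to execute the import for real.
SQOOP=${SQOOP:-echo bin/sqoop}

import_with_props() {
  # $SQOOP is left unquoted on purpose so "echo bin/sqoop" splits into two words
  $SQOOP import \
    -D mapreduce.job.name=staff_import \
    -D mapreduce.map.memory.mb=1024 \
    --connect jdbc:mysql://hadoop104:3306/company \
    --username root \
    --password 000000 \
    --target-dir /user/company \
    --delete-target-dir \
    --num-mappers 1 \
    --fields-terminated-by "\t" \
    --table staff \
    --where "id=2"
}

import_with_props
```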

4.1.2 RDBMS to Hive

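$ bin/sqoop import \
--connect jdbc:mysql://hadoop104:3306/company \
--username root \
--password 000000 \
--table staff \
--num-mappers 1 \
--hive-import \
--fields-terminated-by "\t" \
--hive-overwrite \
--hive-table staff_hive

Note: this import runs in two steps. The first step imports the data into HDFS; the second step moves the imported data from HDFS into the Hive warehouse.

Note: the default staging directory for the first step is /user/admin/<table name>, that is, a directory named after the table under the HDFS home directory of the user running Sqoop (admin here).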

4.2 Exporting data

In Sqoop, an "export" transfers data from the big-data cluster (HDFS, Hive, HBase) to a non-big-data system (an RDBMS). Exports use the export keyword.

4.2.1 Hive/HDFS to RDBMS

$ bin/sqoop export \
--connect jdbc:mysql://hadoop104:3306/company \
--username root \
--password 000000 \
--export-dir /user/hive/warehouse/staff_hive \
--table staff \
--num-mappers 1 \
--input-fields-terminated-by "\t"

Note: if the table does not exist in MySQL, Sqoop does not create it automatically.

Question: does the export overwrite the existing data or append to it?
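On the question above: by default, sqoop export turns each input record into an INSERT statement, so rows are appended rather than overwritten, and re-exporting rows whose primary key already exists in the target table fails instead of replacing them. To update existing rows, Sqoop offers --update-key, optionally combined with --update-mode allowinsert. The following is a sketch under the assumption that id is the right key for the staff table; the SQOOP variable and the export_upsert function are hypothetical wrappers so that the command is only printed by default.

```shell
# Dry run by default: SQOOP expands to "echo bin/sqoop", so the command is printed.
# Set SQOOP=bin/sqoop before running to execute the export for real.
SQOOP=${SQOOP:-echo bin/sqoop}

export_upsert() {
  # --update-key id: match existing rows on the id column and update them
  # --update-mode allowinsert: insert rows whose id is not yet in the table
  $SQOOP export \
    --connect jdbc:mysql://hadoop104:3306/company \
    --username root \
    --password 000000 \
    --export-dir /user/hive/warehouse/staff_hive \
    --table staff \
    --num-mappers 1 \
    --input-fields-terminated-by "\t" \
    --update-key id \
    --update-mode allowinsert
}

export_upsert
```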

4.3 Packaging commands in a script

Put the Sqoop command in an .opt options file, then run it.

1) Create an .opt file

$ mkdir opt

$ touch opt/job_hdfs2rdbms.opt

2) Write the Sqoop script

$ vi opt/job_hdfs2rdbms.opt

export
--connect
jdbc:mysql://hadoop104:3306/company
--username
root
--password
000000
--table
staff
--num-mappers
1
--export-dir
/user/hive/warehouse/staff_hive
--input-fields-terminated-by
"\t"

3) Run the script

$ bin/sqoop --options-file opt/job_hdfs2rdbms.opt
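The same options file can also be written without an editor. This is a minimal sketch that creates opt/job_hdfs2rdbms.opt with a here-document; the two comment lines are an addition for illustration and rely on Sqoop ignoring comment lines that start with # and empty lines in options files.

```shell
# Create the directory and write the options file in one step
mkdir -p opt
cat > opt/job_hdfs2rdbms.opt <<'EOF'
# Sqoop options file: one option or one value per line.
# Comment lines start with '#'; empty lines are ignored.
export
--connect
jdbc:mysql://hadoop104:3306/company
--username
root
--password
000000
--table
staff
--num-mappers
1
--export-dir
/user/hive/warehouse/staff_hive
--input-fields-terminated-by
"\t"
EOF

# Show the option lines without the comments
grep -v '^#' opt/job_hdfs2rdbms.opt
```

The quoted 'EOF' delimiter keeps the shell from interpreting anything inside the file body, so "\t" is written exactly as shown.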
