Hive DML: some study notes

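1. Loading file data into a Hive table with load (first create a Hive table whose columns match the fields of the data to be imported, then load the data into it):

load data [local] inpath 'filepath' [overwrite] into table tablename;

(Whether the file comes from Linux or from HDFS, load does not run a MapReduce job.)

local: with local, the file is loaded from the Linux (local) file system; without local, it is loaded from HDFS.

filepath: with local, give a Linux path; without local, give an HDFS path.

overwrite: with overwrite, the existing table data is replaced; without overwrite, the new data is appended to the existing table data.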

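Examples:

load data local inpath '/home/hadoop/class.txt' overwrite into table class;

(From Linux: after a data file in a Linux directory is loaded into the Hive table, the Linux file still exists; a copy of it is placed under the table's path.)

load data inpath '/ruoze/class.txt' overwrite into table class;

(From HDFS: after the data file is loaded into the Hive table, it is moved out of its original HDFS directory to the table's path.)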
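The note above says the table has to exist before load is run, so here is a minimal sketch of the full sequence. The column list of class (id, name) and the tab delimiter are assumptions made for illustration; the real layout of class.txt is not given in these notes.

```sql
-- Assumed schema: class.txt is taken to be tab-separated with two fields.
-- The column names id and name are hypothetical.
create table if not exists class (id int, name string)
row format delimited fields terminated by '\t';

-- Without overwrite, the file is appended to whatever the table already holds.
load data local inpath '/home/hadoop/class.txt' into table class;

-- Quick check that the rows are visible.
select * from class limit 10;
```

If the delimiter in the table definition does not match the file, the load itself still succeeds (it only copies the file), but the columns will typically come back as NULL when queried.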

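2. CTAS: create table tablename as select ...

A query is used to import both the data and the table structure into a newly created table. The table does not need to exist beforehand; it is created and filled with data in a single statement. (Runs MapReduce.)

Examples:

create table t1 as select * from student;

(All columns of student are copied into t1.)

create table t2 as select name,age from student;

(Only some of the columns of student are copied, into t2.)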

3. Specify a location when creating the table, then upload the data file directly to that HDFS path:

create table [if not exists] [db_name.]table_name

[row format row_format]

[location hdfs_path]

Example: first create a table at the specified directory:

create table if not exists student7(name string,age int,class string) row format delimited fields terminated by '\t' location '/chenping/test';

Then copy the data file straight into the /chenping/test directory on HDFS:

hadoop fs -put class.txt /chenping/test/

Querying student7 now returns the data.

4. Using insert

1) Insert into a single table:

inserting data into hive tables from queries

Data queried from other tables is inserted into another Hive table. The target table, with a structure matching the data, must be created in Hive beforehand. (Runs MapReduce.)

insert overwrite table tablename1 select_statement1 from from_statement; (overwrite replaces the existing data)

insert into table tablename1 select_statement1 from from_statement; (into appends)

Examples:

insert overwrite table student4 select * from student;

insert into table student4 select * from student;

Insert into multiple tables: data from one table is inserted into several other tables. (Runs MapReduce.)

from from_statement
insert overwrite/into table tablename1 select_statement1
insert overwrite/into table tablename2 select_statement2
insert overwrite/into table tablename3 select_statement3 ...;

(Here overwrite/into means: choose one of the two keywords.)

Example: insert all columns

from student
insert into table student5 select *
insert into table student6 select *;

Example: insert some of the columns

from student
insert into table student6 select *
insert into table student_name select name
insert into table student_name_id select name,age;
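A multi-insert can also send different rows to different tables by giving each insert its own where clause. The sketch below assumes that student5 and student6 already exist with the same columns as student and that age is an int column; the threshold 18 is purely illustrative.

```sql
-- One pass over student, two targets with different filters.
-- Assumption: student5 and student6 have the same schema as student.
from student
insert overwrite table student5 select * where age >= 18
insert overwrite table student6 select * where age < 18;
```

Because the from clause is written once, the source table is read once for all targets, instead of once per insert as with separate statements.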

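2) Insert query results into the file system:

writing data into the filesystem from queries: the results of a SQL query are exported to the file system. (Runs MapReduce.)

insert overwrite [local] directory directory1
[row format row_format]
select ... from ...

local: with local, the results are exported to the Linux file system; without local, they are exported to HDFS.

Example:

insert overwrite directory '/chenping/testhive' row format delimited fields terminated by '\t' select name,age from student;

(Exports to HDFS. Take care with the path as well, since overwrite replaces the existing contents of that directory.)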
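The local variant described above has no example in these notes, so here is a minimal sketch. The target directory /home/hadoop/testhive is a hypothetical path chosen for illustration.

```sql
-- Export to the Linux file system instead of HDFS.
-- Assumption: /home/hadoop/testhive is a scratch directory; its existing
-- contents are replaced by the query output.
insert overwrite local directory '/home/hadoop/testhive'
row format delimited fields terminated by '\t'
select name, age from student;
```

The output lands as one or more files inside that directory rather than as a single named file, so point it at a directory that holds nothing else.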

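5. Using export and import (no MapReduce)

export: writes the whole table, both data and metadata, to an HDFS path:

export table tablename to 'export_target_path';

Example: export table student to '/test';

Check the result (the /test directory contains both the metadata and the data):

[hadoop@hadoop000 data]$ hadoop fs -lsr /test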

-rwxr-xr-x 1 hadoop supergroup 1295 2017-09-15 12:43 /test/_metadata

drwxr-xr-x - hadoop supergroup 0 2017-09-15 12:43 /test/data

-rwxr-xr-x 1 hadoop supergroup 71 2017-09-15 12:43 /test/data/student.txt

import: loads the exported files into a Hive table:

import [[external] table new_or_original_tablename] from 'source_path';

Example: import table student1 from '/test'; (student1 does not need to be created beforehand)
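As a rough sketch of a full round trip, the statements below export a table, import it under a new name, and compare the two. The path /tmp/student_export and the table name student_copy are hypothetical, and the export path is assumed not to exist yet.

```sql
-- Export data plus metadata to an HDFS directory (hypothetical path).
export table student to '/tmp/student_export';

-- Import under a new name; the table is created from the exported metadata.
import table student_copy from '/tmp/student_export';

-- Compare the schema and row counts of source and copy.
describe student_copy;
select count(*) from student;
select count(*) from student_copy;
```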
