Starting Hive:
1. Start Hadoop.
2. Start the metastore first, then the hiveserver2 service:
nohup hive --service metastore >> log.out 2>&1 &
nohup hive --service hiveserver2 >> log.out 2>&1 &
Check whether the processes have started:
tandemac:bin tanzhengqiang$ jps -ml | grep hive
Data structure
1. Video table
Field          Note / description
video id       11-character string
uploader
age
category
length
views          view count
rate           maximum score is 5
ratings        traffic
comments
related ids
2. User table
Field          Note                    Type
uploader       uploader's user name    string
videos                                 int
friends        number of friends       int
Creating the tables:
Create the source tables: gulivideo_ori, gulivideo_user_ori
Create the target tables: gulivideo_orc, gulivideo_user_orc
gulivideo_ori:
create table gulivideo_ori(
videoid string,
uploader string,
age int,
category array<string>,
length int,
views int,
rate float,
ratings int,
comments int,
relatedid array<string>)
row format delimited
fields terminated by "\t" -- fields are separated by a tab (\t)
collection items terminated by "&" -- array elements are separated by &
stored as textfile;
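To make the two delimiters concrete, here is a minimal Python sketch of how one raw text line would map onto the gulivideo_ori columns. The sample line and the function name parse_video_line are made up for illustration. Hive does the real parsing itself, so treat this only as a way to picture the row format.

```python
# Column names in the order declared for gulivideo_ori.
COLUMNS = ["videoid", "uploader", "age", "category", "length",
           "views", "rate", "ratings", "comments", "relatedid"]

def parse_video_line(line):
    """Split one raw line according to the table's delimiters (illustrative only)."""
    fields = line.rstrip("\n").split("\t")       # fields are tab-separated
    row = dict(zip(COLUMNS, fields))
    for name in ("age", "length", "views", "ratings", "comments"):
        row[name] = int(row[name])               # int columns
    row["rate"] = float(row["rate"])             # float column
    # Array columns: elements are separated by "&".
    row["category"] = row["category"].split("&")
    row["relatedid"] = row["relatedid"].split("&")
    return row

# Hypothetical sample line, not taken from the real data set.
sample = ("abcdefghijk\tsomeuser\t100\tMusic&Comedy\t215\t1000\t4.5\t20\t7\t"
          "id000000001&id000000002")
row = parse_video_line(sample)
print(row["category"], row["relatedid"])
```

If a line in the real file does not have exactly ten tab-separated fields, this sketch fails with an error. Checking for such lines before loading is a reasonable precaution.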
gulivideo_user_ori:
create table gulivideo_user_ori(
uploader string,
videos int,
friends int)
row format delimited
fields terminated by "\t"
stored as textfile;
gulivideo_orc:
create table gulivideo_orc(
videoid string,
uploader string,
age int,
category array<string>,
length int,
views int,
rate float,
ratings int,
comments int,
relatedid array<string>)
clustered by (uploader) into 8 buckets -- split into 8 buckets by the uploader column
row format delimited
fields terminated by "\t"
collection items terminated by "&"
stored as orc;
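The idea behind clustering by uploader into 8 buckets can be pictured as "hash the column value, then take the remainder modulo 8". The Python sketch below uses a 31-based rolling hash as a stand-in. That hash is an assumption for illustration only: the function Hive actually uses depends on the Hive version, so real bucket numbers may differ. What the sketch does show is that the same uploader always lands in the same bucket.

```python
def rolling_hash_31(s):
    """31-based rolling hash over UTF-8 bytes, wrapped to a signed 32-bit integer.

    Stand-in hash for illustration; not guaranteed to match any Hive version.
    """
    h = 0
    for b in s.encode("utf-8"):
        h = (31 * h + b) & 0xFFFFFFFF            # keep the low 32 bits
    return h - (1 << 32) if h >= (1 << 31) else h

def bucket_for(uploader, num_buckets=8):
    # Clear the sign bit, then take the remainder: result is in 0..num_buckets-1.
    return (rolling_hash_31(uploader) & 0x7FFFFFFF) % num_buckets

# Hypothetical uploader names.
for name in ["alice", "bob", "alice"]:
    print(name, bucket_for(name))
```

Because every row of one uploader goes to one bucket, per-uploader work such as sampling or joining on uploader can be limited to matching buckets.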
gulivideo_user_orc:
create table gulivideo_user_orc(
uploader string,
videos int,
friends int)
row format delimited
fields terminated by "\t"
stored as orc;
Loading the data:
gulivideo_ori:
load data inpath "/gulidata/output/video/2008/0222" into table gulivideo_ori;
gulivideo_user_ori:
load data inpath "/gulidata/input/user/2008/0903" into table gulivideo_user_ori;
gulivideo_orc:
insert into table gulivideo_orc select * from gulivideo_ori;
gulivideo_user_orc:
insert into table gulivideo_user_orc select * from gulivideo_user_ori;
Statistical analysis
Final query (top 10 videos by views):
select videoid,
uploader,
age,
category,
length,
views,
rate,
ratings,
comments
from gulivideo_orc
order by views desc
limit 10;
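Ordering by views in descending order and keeping 10 rows is a top-N selection. As a rough illustration of what the query returns, the Python sketch below does the same on a small in-memory list. The rows are made up, and the function name top_by_views is hypothetical.

```python
import heapq

def top_by_views(rows, n=10):
    """Return the n rows with the largest 'views', highest first."""
    return heapq.nlargest(n, rows, key=lambda r: r["views"])

# Made-up rows standing in for gulivideo_orc records.
rows = [{"videoid": f"video{i:06d}", "views": i} for i in range(25)]
top10 = top_by_views(rows)
for r in top10:
    print(r["videoid"], r["views"])
```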