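unix_timestamp: returns the Unix timestamp of the current time or of a specified time
from_unixtime: converts a Unix timestamp to a formatted date string
current_date: the current date
current_timestamp: the current date and time
to_date: extracts the date part
year: gets the year
month: gets the month
day: gets the day
hour: gets the hour
minute: gets the minute
second: gets the second
weekofyear: the week of the year that the given date falls in
dayofmonth: the day of the month for the given date
months_between: the number of months between two dates
add_months: adds months to a date (a negative value subtracts)
datediff: the number of days between two dates
date_add: adds days to a date
date_sub: subtracts days from a date
last_day: the last day of the month that the date falls in
date_format: returns the date in the specified format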
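A few of the date functions above, as a minimal HiveQL sketch. The literal dates are made up for illustration and are not from the gulivideo data; the expected results in the comments assume a reasonably recent Hive version.

```sql
-- Literal inputs, so the results do not depend on the current clock.
select to_date('2020-03-15 10:20:30');               -- 2020-03-15
select year('2020-03-15 10:20:30');                  -- 2020
select month('2020-03-15 10:20:30');                 -- 3
select day('2020-03-15 10:20:30');                   -- 15
select hour('2020-03-15 10:20:30');                  -- 10
select minute('2020-03-15 10:20:30');                -- 20
select second('2020-03-15 10:20:30');                -- 30
select dayofmonth('2020-03-15');                     -- 15
select datediff('2020-03-15', '2020-03-01');         -- 14
select date_add('2020-03-15', 5);                    -- 2020-03-20
select date_sub('2020-03-15', 5);                    -- 2020-03-10
select add_months('2020-01-15', 1);                  -- 2020-02-15
select last_day('2020-02-10');                       -- 2020-02-29 (leap year)
select months_between('2020-03-15', '2020-01-15');   -- 2.0
select date_format('2020-03-15', 'yyyy/MM/dd');      -- 2020/03/15

-- These depend on the clock or the session time zone,
-- so no expected output is shown.
select current_date;
select current_timestamp;
select unix_timestamp('2020-03-15 10:20:30');
select from_unixtime(1584267630);
```

Note that datediff takes the end date first: datediff(end, start).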
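round: rounds to the nearest value (half rounds up)
ceil: rounds up to the nearest integer
floor: rounds down to the nearest integer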
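The difference between the three rounding functions shows up most clearly on negative numbers. A small sketch with made-up literals (depending on the Hive version, a result may display as 4 or 4.0):

```sql
select round(3.5);          -- 4     (half rounds up)
select round(3.14159, 2);   -- 3.14  (optional second argument: decimal places)
select ceil(3.1);           -- 4
select floor(3.9);          -- 3
select ceil(-3.1);          -- -3    (up means toward positive infinity)
select floor(-3.1);         -- -4    (down means toward negative infinity)
```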
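upper: converts to upper case
lower: converts to lower case
length: length of the string
trim: removes leading and trailing spaces
lpad: pads on the left up to the specified length
rpad: pads on the right up to the specified length
regexp_replace: select regexp_replace('100-200', '(\\d+)', 'num'); matches the target string against a regular expression and replaces every match. Inside a Hive string literal the backslash must be doubled ('\\d'); a single '\d' would match the letter d.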
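The string functions above can be tried directly on literals. A minimal sketch; the input strings are illustrative only:

```sql
select upper('hive');                                 -- HIVE
select lower('HIVE');                                 -- hive
select length('hive');                                -- 4
select trim('  hive  ');                              -- hive
select lpad('7', 3, '0');                             -- 007
select rpad('7', 3, '0');                             -- 700
-- If the input is longer than the target length, it is cut to that length.
select lpad('hadoop', 3, '0');                        -- had
-- Each run of digits is replaced, so both numbers become 'num'.
select regexp_replace('100-200', '(\\d+)', 'num');    -- num-num
```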
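size: the number of elements in a collection
map_keys: returns the keys of a map
map_values: returns the values of a map
array_contains: checks whether an array contains a given element
sort_array: sorts the elements of an array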
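A short sketch of the collection functions, using the array() and map() constructors to build throwaway values. The values are made up for illustration:

```sql
select size(array(1, 2, 3));                   -- 3
select size(map('a', 1, 'b', 2));              -- 2
-- Returns the keys a and b as an array; the order is not guaranteed.
select map_keys(map('a', 1, 'b', 2));
-- Returns the values 1 and 2 as an array; the order is not guaranteed.
select map_values(map('a', 1, 'b', 2));
select array_contains(array(1, 2, 3), 2);      -- true
select sort_array(array(3, 1, 2));             -- [1,2,3]
```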

-- Top 10 videos by view count
select
    videoid,
    uploader,
    views
from gulivideo_orc
order by views desc
limit 10;

-- Top 10 categories by number of videos
select
    category_name,
    count(*) c_n
from gulivideo_orc
lateral view explode(category) tmp as category_name -- the lateral view adds category_name as a new column at this stage
group by category_name
order by c_n desc
limit 10;
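To see what lateral view explode does to a single row, here is a minimal sketch on an inline one-row table. The row ('v1' with two categories) is made up for illustration and is not from gulivideo_orc:

```sql
-- One input row whose category column is an array with two elements.
select
    t.videoid,
    tmp.category_name
from (
    select 'v1' as videoid, array('Music', 'Comedy') as category
) t
lateral view explode(t.category) tmp as category_name;
-- Result: two rows, one per array element
-- v1  Music
-- v1  Comedy
```

Each array element becomes its own row, with the other columns of the source row repeated, which is what makes the group by on category_name possible.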

-- Categories of the 20 most-viewed videos, with the number of those videos in each category
select
    category_name,
    count(*) c_n
from (
    select
        videoid,
        uploader,
        views,
        category
    from gulivideo_orc
    order by views desc
    limit 20
) t1
lateral view explode(t1.category) tmp as category_name
group by category_name;

-- Category ranking of the videos related to the 50 most-viewed videos
select
    category_name,
    count(*) c_n
from (
    select
        t3.videoid,
        t3.category
    from (
        select
            relatedid_name
        from (
            select
                videoid,
                uploader,
                views,
                relatedid
            from gulivideo_orc
            order by views desc
            limit 50
        ) t1
        lateral view explode(t1.relatedid) tmp as relatedid_name
    ) t2
    join gulivideo_orc t3
        on t2.relatedid_name = t3.videoid
) t4
lateral view explode(t4.category) tmp as category_name
group by category_name
order by c_n desc;

-- Top 10 videos by view count within the 'music' category
select
    t1.videoid,
    t1.views,
    t1.category_name
from (
    select
        videoid,
        views,
        category_name
    from gulivideo_orc
    lateral view explode(category) tmp as category_name
) t1
where t1.category_name = 'music'
order by t1.views desc
limit 10;
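
-- Top 10 videos by view count within every category
select
    t2.videoid,
    t2.views,
    t2.category_name,
    t2.rk
from (
    select
        t1.videoid,
        t1.views,
        t1.category_name,
        -- A window function keeps the number of rows unchanged; it adds a column
        -- and orders the rows within each partition.
        rank() over (partition by t1.category_name order by t1.views desc) rk
    from (
        select
            videoid,
            views,
            category_name
        from gulivideo_orc
        lateral view explode(category) tmp as category_name
        -- A group by on category would collapse the individual rows of each category,
        -- so the top 10 per category could not be retrieved.
        -- A window function is used instead because it does not change the number of rows.
    ) t1
) t2
where t2.rk <= 10;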
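The query above uses rank(), which gives tied rows the same number and then skips ahead, so a filter such as rk <= 10 can return more than 10 rows per category when views are tied. The sketch below compares rank() with dense_rank() and row_number() on four made-up rows (not from gulivideo_orc) to show the difference:

```sql
-- Four illustrative rows; b and c are tied on views.
select
    videoid,
    views,
    rank()       over (order by views desc) as rk,        -- 1, 2, 2, 4
    dense_rank() over (order by views desc) as dense_rk,  -- 1, 2, 2, 3
    row_number() over (order by views desc) as row_num    -- 1, 2, 3, 4
from (
    select 'a' as videoid, 30 as views
    union all
    select 'b' as videoid, 20 as views
    union all
    select 'c' as videoid, 20 as views
    union all
    select 'd' as videoid, 10 as views
) t;
```

With row_number() the tied rows b and c still get distinct numbers, but which of the two gets 2 and which gets 3 is not deterministic. If exactly 10 rows per category are required, row_number() is the usual choice; rank() is appropriate when tied videos should all be kept.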

-- The 20 most-viewed videos among those uploaded by the 10 users with the most uploads
select
    t2.uploader,
    t2.views
from (
    select
        uploader,
        videos
    from gulivideo_user_orc
    order by videos desc
    limit 10
) t1
join gulivideo_orc t2
    on t1.uploader = t2.uploader
order by t2.views desc
limit 20;

-- For each of the 10 users with the most uploads, their 20 most-viewed videos
select
    t3.uploader,
    t3.views,
    t3.rk
from (
    select
        t2.uploader,
        t2.views,
        rank() over (partition by t2.uploader order by t2.views desc) rk
    from (
        select
            uploader,
            videos
        from gulivideo_user_orc
        order by videos desc
        limit 10
    ) t1
    join gulivideo_orc t2
        on t1.uploader = t2.uploader
) t3
where t3.rk <= 20;

-- Videos that are among the 20 most viewed and whose uploader is among the 10 users with the most uploads
select
    t2.uploader,
    t1.views
from (
    select
        videoid,
        uploader,
        views
    from gulivideo_orc
    order by views desc
    limit 20
) t1
join (
    select
        uploader,
        videos
    from gulivideo_user_orc
    order by videos desc
    limit 10
) t2
    on t1.uploader = t2.uploader;