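Suppose a page shows a list of items, some of them in ad slots and some in non-ad slots.
Each time a user browses, a ranked set of rows is produced. We abstract that into the following table: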
create table window_test_table (
    id int,          -- user id
    sq string,       -- identifies each item
    cell_type int,   -- item type, e.g. ad or non-ad
    rank int         -- position of the item in this search, e.g. the first ad item is 1, the following ones 2, 3, 4, ...
)
row format delimited fields terminated by ',';
Load the data:
1,flower,10,1
1,tree,26,3
1,hive,10,4
1,hadoop,13,5
1,spark,26,6
1,flink,14,7
1,sqoop,10,8
load data local inpath '/home/hadoop/data/window'
into table window_test_table;
Assume cell_type 26 marks an ad. For each user and each browse, we want the natural ranking of the non-ad items only, like this:
1,flower,10,1
1,tree,26,null
1,hive,10,2
1,hadoop,13,3
1,spark,26,null
1,flink,14,4
1,sqoop,10,5
select id,
       sq,
       cell_type,
       case
           when cell_type = 26 then null
           else row_number() over (partition by id order by rank)
       end rank
from window_test_table;
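The result is not what we want: the non-ad items are not renumbered consecutively, because row_number() still counts the ad rows.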
Let's look at the execution plan for this SQL:
stage dependencies:
  stage-1 is a root stage
  stage-0 depends on stages: stage-1
stage plans:
  stage: stage-1
    tez
      edges:
        reducer 2
      dagname: hadoop_20190331200315_a6425b27-68cd-4f04-b67d-d38ae2fc8207:21
      vertices:
        map 1
            map operator tree:
                tablescan
                  alias: window_test_table
                  statistics: num rows: 1 data size: 104 basic stats: complete column stats: none
                  reduce output operator
                    key expressions: id (type: int), rank (type: int)
                    sort order: ++
                    map-reduce partition columns: id (type: int)
                    statistics: num rows: 1 data size: 104 basic stats: complete column stats: none
                    value expressions: sq (type: string), cell_type (type: int)
        reducer 2
            reduce operator tree:
              select operator
                expressions: key.reducesinkkey0 (type: int), value._col0 (type: string), value._col1 (type: int), key.reducesinkkey1 (type: int)
                outputcolumnnames: _col0, _col1, _col2, _col3
                statistics: num rows: 1 data size: 104 basic stats: complete column stats: none
                ptf operator
                  function definitions:
                      input definition
                        input alias: ptf_0
                        output shape: _col0: int, _col1: string, _col2: int, _col3: int
                        type: windowing
                      windowing table definition
                        input alias: ptf_1
                        name: windowingtablefunction
                        order by: _col3
                        partition by: _col0
                        raw input shape:
                        window functions:
                            window function definition
                              alias: row_number_window_0
                              name: row_number
                              window function: genericudafrownumberevaluator
                              window frame: preceding(max)~following(max)
                              ispivotresult: true
                  statistics: num rows: 1 data size: 104 basic stats: complete column stats: none
                  select operator
                    expressions: _col0 (type: int), _col1 (type: string), _col2 (type: int), case when ((_col2 = 26)) then (null) else (row_number_window_0) end (type: int)
                    outputcolumnnames: _col0, _col1, _col2, _col3
                    statistics: num rows: 1 data size: 104 basic stats: complete column stats: none
                    file output operator
                      compressed: false
                      statistics: num rows: 1 data size: 104 basic stats: complete column stats: none
                      table:
                          input format: org.apache.hadoop.mapred.textinputformat
                          output format: org.apache.hadoop.hive.ql.io.hiveignorekeytextoutputformat
                          serde: org.apache.hadoop.hive.serde2.lazy.lazysimpleserde
  stage: stage-0
    fetch operator
      limit: -1
      processor tree:
        listsink
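As the plan shows, the case when is evaluated after the window function: the ptf operator first computes row_number over every row in the id partition, ads included, and only then does the final select operator replace the ad rows' values with null.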
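To make that evaluation order concrete, the first query behaves roughly like the nested form below. This is a sketch inferred from the plan, not something Hive literally rewrites the query into; the column alias rn and the subquery alias t are names introduced here for illustration.

```sql
-- Step 2: the CASE only masks ad rows (cell_type = 26); it cannot change
-- the numbers that were already assigned in the inner query.
select id,
       sq,
       cell_type,
       case when cell_type = 26 then null else rn end as rank
from (
    -- Step 1: row_number() runs over ALL rows of each user, ads included.
    select id,
           sq,
           cell_type,
           row_number() over (partition by id order by rank) as rn
    from window_test_table
) t;
```

Because the ad rows take part in step 1, they consume positions in the numbering, which leaves gaps among the non-ad rows.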
Rewrite the query as:
select id,
       sq,
       cell_type,
       case
           when cell_type != 26 then
               row_number() over (
                   partition by case when cell_type != 26 then id else rand() end
                   order by rank
               )
           else null
       end nature_rank
from window_test_table;
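That does it. The non-ad rows of a user still share one partition keyed by id, while every ad row is pushed into its own random partition, so the ad rows no longer take up positions in the non-ad numbering, and their own row numbers are discarded by the outer case when.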
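If relying on rand() in the partition key feels fragile, a running count of the non-ad rows gives the same numbering deterministically. The following is a minimal sketch of that alternative, not the approach used above; it assumes the same window_test_table and that 26 is the only ad type.

```sql
select id,
       sq,
       cell_type,
       case
           when cell_type != 26 then
               -- Running count of non-ad rows seen so far for this user,
               -- in rank order; ad rows contribute 0 to the sum.
               sum(case when cell_type != 26 then 1 else 0 end) over (
                   partition by id
                   order by rank
                   rows between unbounded preceding and current row
               )
           else null
       end as nature_rank
from window_test_table;
```

Here every row stays in its user's partition, so there is a single shuffle key and no dependence on random values; the ad rows simply do not increment the counter.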