a,2017-10-11,300
a,2017-10-12,200
a,2017-10-13,100
a,2017-10-15,100
a,2017-10-16,300
a,2017-10-17,150
a,2017-10-18,340
a,2017-10-19,360
b,2017-10-11,400
b,2017-10-12,200
b,2017-10-15,600
c,2017-10-11,350
c,2017-10-13,250
c,2017-10-14,300
c,2017-10-15,400
c,2017-10-16,200
d,2017-10-13,500
e,2017-10-14,600
e,2017-10-15,500
d,2017-10-14,600
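Analysis: number each shop's rows in date order, then subtract that row number from the date. Rows that produce the same resulting date belong to one run of consecutive days.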
a,2017-10-11,300,1,2017-10-10
a,2017-10-12,200,2,2017-10-10
a,2017-10-13,100,3,2017-10-10
a,2017-10-15,100,4,2017-10-11
a,2017-10-16,300,5,2017-10-11
a,2017-10-17,150,6,2017-10-11
a,2017-10-18,340,7,2017-10-11
a,2017-10-19,360,8,2017-10-11
b,2017-10-11,400
b,2017-10-12,200
b,2017-10-15,600
c,2017-10-11,350
c,2017-10-13,250
c,2017-10-14,300
c,2017-10-15,400
c,2017-10-16,200
d,2017-10-13,500
e,2017-10-14,600
e,2017-10-15,500
d,2017-10-14,600
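The date-minus-row-number idea from the analysis can be checked outside Hive. The following is a minimal Python sketch, not part of the original solution: `run_keys` is a hypothetical helper name, and the dates are those of shop `a` copied from the sample data. It prints each date with its row number and the shifted date, which should match the two extra columns shown for shop `a` above.

```python
from datetime import date, timedelta

# Dates on which shop "a" has sales, copied from the sample data.
days = [date(2017, 10, d) for d in (11, 12, 13, 15, 16, 17, 18, 19)]


def run_keys(sorted_days):
    """Pair each date with (row number, date minus row number).

    Hypothetical helper for illustration: dates in the same run of
    consecutive days end up with the same shifted date.
    """
    return [
        (d, rn, d - timedelta(days=rn))
        for rn, d in enumerate(sorted_days, start=1)
    ]


for d, rn, flag in run_keys(sorted(days)):
    print(d, rn, flag)
```

The gap on 2017-10-14 is what moves the shifted date from 2017-10-10 to 2017-10-11, so shop `a` splits into two runs.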
1: Create the table and load the data
create table t_jd(shopid string,dt string,sale int)
row format delimited fields terminated by ',';
load data local inpath '/root/sale.dat' into table t_jd;
2: Assign a row number within each shop, ordered by date
select shopid,dt,sale,
row_number() over(partition by shopid order by dt) as rn
from t_jd;
3: Subtract the row number from the date; consecutive days share the same result
select shopid,dt,sale,rn,
date_sub(to_date(dt),rn)
from
(select shopid,dt,sale,
row_number() over(partition by shopid order by dt) as rn
from t_jd) tmp;
4: Group by shop and flag, then count the rows in each group
select shopid,count(1) as cnt
from
(select shopid,dt,sale,rn,
date_sub(to_date(dt),rn) as flag
from
(select shopid,dt,sale,
row_number() over(partition by shopid order by dt) as rn
from t_jd) tmp) tmp2
group by shopid,flag;
5: Keep the groups with at least 3 consecutive days
select shopid from
(select shopid,count(1) as cnt
from
(select shopid,dt,sale,rn,
date_sub(to_date(dt),rn) as flag
from
(select shopid,dt,sale,
row_number() over(partition by shopid order by dt) as rn
from t_jd) tmp) tmp2
group by shopid,flag) tmp3
where tmp3.cnt>=3;
6: Deduplicate the shop IDs
select distinct shopid from
(select shopid,count(1) as cnt
from
(select shopid,dt,sale,rn,
date_sub(to_date(dt),rn) as flag
from
(select shopid,dt,sale,
row_number() over(partition by shopid order by dt) as rn
from t_jd) tmp) tmp2
group by shopid,flag) tmp3
where tmp3.cnt>=3;
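As a cross-check of the steps above, the whole pipeline can be sketched in plain Python against the same sample rows. This is an illustrative sketch under stated assumptions, not the Hive implementation: `ROWS` and `shops_with_streak` are hypothetical names, and unlike the queries above the sketch collects each shop's dates into a set, so duplicate dates would be collapsed (the sample data has none).

```python
from collections import Counter
from datetime import date, timedelta

# Sample rows (shopid, dt, sale), copied from the data at the top.
ROWS = [
    ("a", "2017-10-11", 300), ("a", "2017-10-12", 200),
    ("a", "2017-10-13", 100), ("a", "2017-10-15", 100),
    ("a", "2017-10-16", 300), ("a", "2017-10-17", 150),
    ("a", "2017-10-18", 340), ("a", "2017-10-19", 360),
    ("b", "2017-10-11", 400), ("b", "2017-10-12", 200),
    ("b", "2017-10-15", 600),
    ("c", "2017-10-11", 350), ("c", "2017-10-13", 250),
    ("c", "2017-10-14", 300), ("c", "2017-10-15", 400),
    ("c", "2017-10-16", 200),
    ("d", "2017-10-13", 500),
    ("e", "2017-10-14", 600), ("e", "2017-10-15", 500),
    ("d", "2017-10-14", 600),
]


def shops_with_streak(rows, min_days=3):
    """Return the shops that have at least `min_days` consecutive sale days."""
    # Collect the distinct sale dates of each shop (assumption: dedupe dates).
    by_shop = {}
    for shop, dt, _sale in rows:
        by_shop.setdefault(shop, set()).add(date.fromisoformat(dt))

    result = []
    for shop, days in by_shop.items():
        # Steps 2-4: number the dates, shift each by its number, count per key.
        runs = Counter(
            d - timedelta(days=rn)
            for rn, d in enumerate(sorted(days), start=1)
        )
        # Steps 5-6: keep the shop once if any run is long enough.
        if max(runs.values()) >= min_days:
            result.append(shop)
    return sorted(result)


print(shops_with_streak(ROWS))  # prints ['a', 'c']
```

Shop `a` qualifies through two runs (3 days and 5 days) and shop `c` through one run of 4 days, which is why step 6 needs the deduplication.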