PySpark date formats

2022-05-22 19:21:13

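1. Getting the current date

# Assumes an active SparkSession named `spark`, as in the pyspark shell.
from pyspark.sql.functions import current_date

spark.range(3).withColumn('date', current_date()).show()

# +---+----------+
# | id|      date|
# +---+----------+
# |  0|2018-03-23|
# |  1|2018-03-23|
# |  2|2018-03-23|
# +---+----------+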

2. Getting the current date and time

from pyspark.sql.functions import current_timestamp

spark.range(3).withColumn('date', current_timestamp()).show()

# +---+--------------------+
# | id|                date|
# +---+--------------------+
# |  0|2018-03-23 17:40:...|
# |  1|2018-03-23 17:40:...|
# |  2|2018-03-23 17:40:...|
# +---+--------------------+

3. Formatting a date

from pyspark.sql.functions import date_format

df = spark.createDataFrame([('2015-04-08',)], ['a'])

# Pattern letters are case-sensitive: MM is the month, mm is the minute.
df.select(date_format('a', 'MM/dd/yyyy').alias('date')).show()
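To check what a pattern should produce without starting Spark, the same formatting can be reproduced with Python's standard library. This is only a comparison sketch: the strftime directives belong to Python, not Spark, and the correspondence assumed here (MM to %m, dd to %d, yyyy to %Y, mm to %M) is meant for this simple case only.

```python
from datetime import date

d = date(2015, 4, 8)

# Spark pattern 'MM/dd/yyyy' lines up with strftime '%m/%d/%Y' for this value.
formatted = d.strftime('%m/%d/%Y')
print(formatted)  # 04/08/2015

# Lowercase 'mm' means minutes; the strftime analogue is '%M'.
# A date-only value has no minutes, so that position comes out as 00.
wrong = d.strftime('%M/%d/%Y')
print(wrong)  # 00/08/2015
```

The second result shows the typical symptom of writing mm where MM was intended: the month silently disappears from the output.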

4. Converting strings to dates

from pyspark.sql.functions import to_date, to_timestamp

# 1. Convert to a date
df = spark.createDataFrame([('1997-02-28 10:30:00',)], ['t'])
df.select(to_date(df.t).alias('date')).collect()
# [Row(date=datetime.date(1997, 2, 28))]

# 2. Convert to a timestamp (date with time)
df = spark.createDataFrame([('1997-02-28 10:30:00',)], ['t'])
df.select(to_timestamp(df.t).alias('dt')).collect()
# [Row(dt=datetime.datetime(1997, 2, 28, 10, 30))]

# An explicit format can also be given (MM = month, HH = hour 0-23, mm = minute)
df = spark.createDataFrame([('1997-02-28 10:30:00',)], ['t'])
df.select(to_timestamp(df.t, 'yyyy-MM-dd HH:mm:ss').alias('dt')).collect()
# [Row(dt=datetime.datetime(1997, 2, 28, 10, 30))]

5. Extracting the year, month, and day

from pyspark.sql.functions import year, month, dayofmonth

df = spark.createDataFrame([('2015-04-08',)], ['a'])

df.select(year('a').alias('year'),
          month('a').alias('month'),
          dayofmonth('a').alias('day')
          ).show()

6. Extracting the hour, minute, and second

from pyspark.sql.functions import hour, minute, second

df = spark.createDataFrame([('2015-04-08 13:08:15',)], ['a'])

df.select(hour('a').alias('hour'),
          minute('a').alias('minute'),
          second('a').alias('second')
          ).show()

7. Getting the quarter of a date

from pyspark.sql.functions import quarter

df = spark.createDataFrame([('2015-04-08',)], ['a'])

df.select(quarter('a').alias('quarter')).show()

8. Adding and subtracting days

from pyspark.sql.functions import date_add, date_sub

df = spark.createDataFrame([('2015-04-08',)], ['d'])

df.select(date_add(df.d, 1).alias('d-add'),
          date_sub(df.d, 1).alias('d-sub')
          ).show()

9. Adding and subtracting months

from pyspark.sql.functions import add_months

df = spark.createDataFrame([('2015-04-08',)], ['d'])

# Pass a negative number to subtract months.
df.select(add_months(df.d, 1).alias('d')).show()
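Month arithmetic gets less obvious when the target month is shorter than the starting one. The plain-Python sketch below assumes the result is clamped to the last day of the target month, which matches my understanding of current Spark behavior; the helper name add_months_sketch is hypothetical and this is not Spark's implementation.

```python
import calendar
from datetime import date

def add_months_sketch(d, n):
    """Shift d by n months, clamping the day to the target month's length (assumed rule)."""
    # Count months from year 0 so that negative offsets roll the year back correctly.
    total = d.year * 12 + (d.month - 1) + n
    year, month0 = divmod(total, 12)
    last = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last))

print(add_months_sketch(date(2015, 4, 8), 1))    # 2015-05-08
print(add_months_sketch(date(2015, 1, 31), 1))   # 2015-02-28
print(add_months_sketch(date(2015, 1, 15), -1))  # 2014-12-15
```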

10. Difference in days and in months

from pyspark.sql.functions import datediff, months_between

# 1. Difference in days (end date first, start date second)
df = spark.createDataFrame([('2015-04-08', '2015-05-10')], ['d1', 'd2'])
df.select(datediff(df.d2, df.d1).alias('diff')).show()

# 2. Difference in months (may be fractional)
df = spark.createDataFrame([('1997-02-28 10:30:00', '1996-10-30')], ['t', 'd'])
df.select(months_between(df.t, df.d).alias('months')).show()
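The fractional result of months_between is easier to interpret with the arithmetic written out. The sketch below follows the rule as I understand it from the Spark documentation: a whole number when both values fall on the same day of the month or both on the last day of their months, otherwise a fraction based on a 31-day month, rounded to 8 digits. The helper names are hypothetical, time zones are ignored, and this is not Spark's code.

```python
import calendar
from datetime import datetime

def day_diff(end, start):
    """Whole days from start to end, the quantity datediff(end, start) reports."""
    return (end.date() - start.date()).days

def months_between_sketch(t1, t2):
    """Months from t2 to t1, using a 31-day month for the fractional part (assumed rule)."""
    month_diff = (t1.year - t2.year) * 12 + (t1.month - t2.month)
    last1 = calendar.monthrange(t1.year, t1.month)[1]
    last2 = calendar.monthrange(t2.year, t2.month)[1]
    # Same day of month, or both on the last day of their month: whole months.
    if t1.day == t2.day or (t1.day == last1 and t2.day == last2):
        return float(month_diff)
    secs1 = t1.hour * 3600 + t1.minute * 60 + t1.second
    secs2 = t2.hour * 3600 + t2.minute * 60 + t2.second
    seconds_diff = (t1.day - t2.day) * 86400 + secs1 - secs2
    return round(month_diff + seconds_diff / (31 * 86400), 8)

print(day_diff(datetime(2015, 5, 10), datetime(2015, 4, 8)))  # 32
print(months_between_sketch(datetime(1997, 2, 28, 10, 30),
                            datetime(1996, 10, 30)))          # 3.94959677
```

For the second example the month difference is 4, and the day and time offset (minus 2 days plus 10.5 hours) pulls the result just below 4.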

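11. Finding the next occurrence of a weekday

next_day returns the first date after the given date that falls on the specified day of the week (Monday through Sunday). It is a handy utility function.

from pyspark.sql.functions import next_day

# Accepted day names: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun".
df = spark.createDataFrame([('2015-07-27',)], ['d'])

df.select(next_day(df.d, 'Sun').alias('date')).show()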
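The "first date after" rule can be verified in plain Python. In this sketch the helper next_day_sketch and the DAY_INDEX table are hypothetical names; it assumes the result is always strictly later than the input, so a date that already falls on the requested weekday moves forward a full week.

```python
from datetime import date, timedelta

# Indices match date.weekday(): Monday is 0, Sunday is 6.
DAY_INDEX = {'mon': 0, 'tue': 1, 'wed': 2, 'thu': 3, 'fri': 4, 'sat': 5, 'sun': 6}

def next_day_sketch(d, day_name):
    """First date strictly after d that falls on the given weekday."""
    target = DAY_INDEX[day_name.lower()[:3]]
    # Offset is always in 1..7, never 0.
    delta = (target - d.weekday() - 1) % 7 + 1
    return d + timedelta(days=delta)

# 2015-07-27 is a Monday.
print(next_day_sketch(date(2015, 7, 27), 'Sun'))  # 2015-08-02
print(next_day_sketch(date(2015, 7, 27), 'Mon'))  # 2015-08-03
```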

12. Last day of the month

from pyspark.sql.functions import last_day

df = spark.createDataFrame([('1997-02-10',)], ['d'])

df.select(last_day(df.d).alias('date')).show()
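As a quick sanity check on the expected value, the same calculation can be done with Python's calendar module. The helper last_day_sketch is a hypothetical name used only for this comparison.

```python
import calendar
from datetime import date

def last_day_sketch(d):
    """Last calendar day of the month that d belongs to."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)

print(last_day_sketch(date(1997, 2, 10)))  # 1997-02-28
print(last_day_sketch(date(2000, 2, 10)))  # 2000-02-29 (leap year)
```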
