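I. Getting the time in Spark with Java classes

In Spark, the current time can be obtained with java.util.{Calendar, Date}, and java.text.SimpleDateFormat controls the output format.

You can try the examples in the Spark shell: spark-shell

First, import the classes: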

import java.text.SimpleDateFormat

import java.util.{Calendar, Date}

Get the current time:

def getNowTime(): String = {
  // Create a Date and read its timestamp (epoch milliseconds)
  val time = new Date().getTime
  // Define the output format
  val format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
  // Apply the format to the timestamp
  format.format(time)
}

Calling the function returns, for example:

2021-07-06 17:44:48
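The string returned by getNowTime depends on the JVM default time zone. The following is a minimal sketch, written in Java (the same java.text and java.util calls work unchanged from Scala), that formats a fixed epoch-millisecond value in an explicitly chosen zone so the result is reproducible. The class name FixedZoneFormat and the method formatMillis are illustrative and not part of the original text.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration only.
class FixedZoneFormat {
    // Formats epoch milliseconds with the given pattern in the given time zone.
    static String formatMillis(long epochMillis, String pattern, String zoneId) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        return format.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // The same instant rendered in two zones.
        System.out.println(formatMillis(0L, "yyyy-MM-dd HH:mm:ss", "UTC"));           // 1970-01-01 00:00:00
        System.out.println(formatMillis(0L, "yyyy-MM-dd HH:mm:ss", "Asia/Shanghai")); // 1970-01-01 08:00:00
    }
}
```

Creating a new SimpleDateFormat per call, as above, also sidesteps the fact that SimpleDateFormat instances are not thread-safe.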

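To get a time other than now, or individual fields such as the year, month, day, or hour, use the Calendar class:

val cal = Calendar.getInstance // create a Calendar set to the current time

Yesterday, method 1:

// Add -1 to the Calendar.DATE field to move back to yesterday
// (adding 1 instead moves forward one day, to tomorrow)
cal.add(Calendar.DATE, -1)

val time1: Date = cal.getTime // read the shifted time

val newtime: String = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(time1) // format it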

Yesterday, method 2:

Take the current time as epoch milliseconds and subtract 24*60*60*1000 (one day in milliseconds):

val yesterday  = new Date(new Date().getTime()-24*60*60*1000)
val matter = new SimpleDateFormat("yyyy-MM-dd")
val time = matter.format(yesterday)
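The two approaches can be compared side by side. The sketch below is in Java and starts from a fixed instant instead of the current time, so the result can be checked by hand; the class and method names are made up for this illustration, and UTC is an assumed zone. In a zone with daylight saving time the two methods can differ by an hour around a transition, because Calendar.add keeps the wall-clock time while the subtraction always moves back exactly 24 hours.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Illustrative only: names are not from the original text.
class YesterdayDemo {
    static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Method 1: let Calendar roll the date field back by one day.
    static Date yesterdayByCalendar(Date now) {
        Calendar cal = new GregorianCalendar(UTC);
        cal.setTime(now);
        cal.add(Calendar.DATE, -1);
        return cal.getTime();
    }

    // Method 2: subtract one day's worth of milliseconds.
    static Date yesterdayByMillis(Date now) {
        return new Date(now.getTime() - 24L * 60 * 60 * 1000);
    }

    // Formats a Date as yyyy-MM-dd in UTC.
    static String day(Date d) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
        f.setTimeZone(UTC);
        return f.format(d);
    }

    public static void main(String[] args) {
        Date start = new Date(1614556800000L); // 2021-03-01 00:00:00 UTC
        System.out.println(day(yesterdayByCalendar(start))); // 2021-02-28
        System.out.println(day(yesterdayByMillis(start)));   // 2021-02-28
    }
}
```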

To read the year, month, day, hour, and other fields (call Calendar.getInstance again first if cal was shifted by the add call above):

val week = cal.get(Calendar.DAY_OF_WEEK) // 1 = Sunday, ..., 7 = Saturday
println("Day of week: " + week)

val year = cal.get(Calendar.YEAR)
println("Year: " + year)

val month = cal.get(Calendar.MONTH)
println("Month: " + (month + 1)) // Calendar.MONTH is zero-based (0-11), so add 1

val day = cal.get(Calendar.DAY_OF_MONTH)
println("Day: " + day)

val hour = cal.get(Calendar.HOUR_OF_DAY)
println("Hour: " + hour)

val minute = cal.get(Calendar.MINUTE)
println("Minute: " + minute)

val second = cal.get(Calendar.SECOND)
println("Second: " + second)

val millisecond = cal.get(Calendar.MILLISECOND)
println("Millisecond: " + millisecond)

Output:

2021-07-06 17:44:48

Day of week: 3
Year: 2021
Month: 7
Day: 6
Hour: 17
Minute: 44
Second: 48
Millisecond: 901
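To check these field values without depending on the current clock, the sketch below (in Java; the calls are identical from Scala) builds a calendar for a fixed moment and reads the same fields. The class name, the helper methods, and the choice of Asia/Shanghai as the zone are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Illustrative class; not from the original text.
class CalendarFieldsDemo {
    // Returns {dayOfWeek, year, month (1-12), day, hour, minute, second, millisecond}.
    static int[] fields(Calendar cal) {
        return new int[] {
            cal.get(Calendar.DAY_OF_WEEK),  // 1 = Sunday ... 7 = Saturday
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1,    // Calendar.MONTH is zero-based
            cal.get(Calendar.DAY_OF_MONTH),
            cal.get(Calendar.HOUR_OF_DAY),
            cal.get(Calendar.MINUTE),
            cal.get(Calendar.SECOND),
            cal.get(Calendar.MILLISECOND)
        };
    }

    // Builds a calendar for 2021-07-06 17:44:48.901 in an explicit zone.
    static Calendar sample() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("Asia/Shanghai"));
        cal.clear();
        cal.set(2021, Calendar.JULY, 6, 17, 44, 48);
        cal.set(Calendar.MILLISECOND, 901);
        return cal;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields(sample()))); // [3, 2021, 7, 6, 17, 44, 48, 901]
    }
}
```

2021-07-06 is a Tuesday, which is why DAY_OF_WEEK is 3 under the Sunday = 1 convention.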

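II. Getting the time in Spark SQL

Spark SQL handles dates and times mainly through its built-in date and time functions.

You can try the examples in the Spark SQL CLI: spark-sql

2.1 Getting the current time

1. current_date returns the current date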

select current_date;

2021-07-06

2. current_timestamp and now() return the current timestamp

select current_timestamp;
select now();

2021-07-06 18:38:11.781

3. unix_timestamp() with no arguments returns the current Unix timestamp (in seconds)

select unix_timestamp();

1625569197

2.2 Extracting fields from a date or time

1. year, month, day/dayofmonth, hour, minute, second (extract the year, month, day of month, hour, minute, and second)

Examples:

> SELECT day('2021-07-06 18:38:11.781'); 

6

2. dayofweek returns the day of the week (1 = Sunday, 2 = Monday, ..., 7 = Saturday); dayofyear returns the day of the year

Examples:

> SELECT dayofweek('2021-07-06');   //3

Since: 2.3.0

3.weekofyear

weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.

Examples:

> SELECT weekofyear('2021-07-06');   //27
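The two numbering conventions above (Sunday = 1 for dayofweek, and Monday-based weeks where week 1 has more than 3 days for weekofyear) can be reproduced in plain Java with java.time. This is a sketch of the same conventions, not Spark's implementation; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Illustrative only: mirrors the conventions described above.
class IsoWeekDemo {
    // ISO-8601 week number: weeks start on Monday, week 1 has at least 4 days.
    static int isoWeekOfYear(LocalDate date) {
        return date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    // Maps java.time's Monday = 1 ... Sunday = 7 to Sunday = 1 ... Saturday = 7.
    static int sundayBasedDayOfWeek(LocalDate date) {
        return date.getDayOfWeek().getValue() % 7 + 1;
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2021, 7, 6);
        System.out.println(sundayBasedDayOfWeek(d)); // 3
        System.out.println(isoWeekOfYear(d));        // 27
    }
}
```

One consequence of the week rule is that the first days of January can belong to the last week of the previous year: 2021-01-01 is a Friday, so it falls in week 53.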

4. trunc truncates a date to the given unit; the lower-order parts are reset to 01

The second argument is one of ["year", "yyyy", "yy", "mon", "month", "mm"]

Examples:

> SELECT trunc('2021-07-06', 'MM');
 2021-07-01
> SELECT trunc('2021-07-06', 'YEAR');
 2021-01-01
> SELECT trunc('2021-07-06 18:38:11.781', 'mm');
2021-07-01

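5. date_trunc(fmt, ts) truncates the timestamp ts to the unit fmt, one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"]. Note that the unit comes first, the reverse of trunc.

Examples:

> SELECT date_trunc('HOUR', '2021-07-06 18:38:11.781');  //2021-07-06 18:00:00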

Since: 2.3.0

6. date_format converts a date or timestamp to a string in the given format

Examples:

> SELECT date_format('2021-07-06', 'y');    2021

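2.3 Converting dates and times

1. Convert a date-time string to a Unix timestamp (the results in this section correspond to a UTC+8 session time zone):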

SELECT unix_timestamp('2021-07-06 18:38:11.781');
SELECT unix_timestamp('2021-07-06 18:38:11.781','yyyy-MM-dd HH:mm:ss');

1625567891

2. from_unixtime converts a Unix timestamp to a date-time string in the session time zone; to_unix_timestamp converts a date-time string to a Unix timestamp

Examples:

> SELECT from_unixtime(0);
> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');

1970-01-01 08:00:00

> SELECT to_unix_timestamp('2021-07-06', 'yyyy-MM-dd');

1625500800
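The values 1970-01-01 08:00:00 and 1625500800 both reflect a UTC+8 time zone: a Unix timestamp identifies an instant, and the zone decides which wall-clock string it maps to. The following Java sketch shows the same arithmetic with java.time; the class and method names are hypothetical, and Asia/Shanghai is assumed as the zone that produced the results above.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Illustrative only: shows how the zone affects epoch conversions.
class EpochZoneDemo {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Renders epoch seconds as a wall-clock string in the given zone.
    static String fromEpochSeconds(long seconds, String zoneId) {
        return FMT.format(Instant.ofEpochSecond(seconds).atZone(ZoneId.of(zoneId)));
    }

    // Converts midnight of an ISO date (yyyy-MM-dd) in the given zone to epoch seconds.
    static long toEpochSeconds(String isoDate, String zoneId) {
        return LocalDate.parse(isoDate).atStartOfDay(ZoneId.of(zoneId)).toEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(fromEpochSeconds(0L, "Asia/Shanghai"));         // 1970-01-01 08:00:00
        System.out.println(toEpochSeconds("2021-07-06", "Asia/Shanghai")); // 1625500800
        System.out.println(toEpochSeconds("2021-07-06", "UTC"));           // 1625529600
    }
}
```

The two results for 2021-07-06 differ by 28800 seconds, which is the 8-hour offset.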

3. to_date/date convert a string to a date; to_timestamp (Since: 2.2.0) converts a string to a timestamp

> SELECT to_date('2021-07-06 18:38:11.781');

2021-07-06

> SELECT to_date('2021-07-06 18:38:11.781', 'yyyy-MM-dd HH:mm:ss');

2021-07-06
> SELECT to_timestamp('2021-07-06 18:38:11.781');

2021-07-06 18:38:11.781

4. quarter returns the quarter of the year for a date (range 1 to 4)

Examples:

> SELECT quarter('2021-07-06');  //3

2.4 Date and time arithmetic

1. months_between returns the number of months between two dates (positive when the first argument is the later one)

months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2.

Examples:

> SELECT months_between('2021-06-07', '2021-10-01');  //-3.80645161
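The fractional part comes from the documented rule for months_between: the whole-month difference is taken first, and unless both dates fall on the same day of the month (or both are the last day of their month), the leftover days are divided by 31 and the result is rounded to 8 digits. The Java sketch below applies that rule to plain dates. It is a simplified illustration under those assumptions (it ignores the time of day), not Spark's code, and its names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;

// Illustrative only: date-only version of the documented months_between rule.
class MonthsBetweenDemo {
    static double monthsBetween(LocalDate d1, LocalDate d2) {
        int monthDiff = (d1.getYear() - d2.getYear()) * 12
                + (d1.getMonthValue() - d2.getMonthValue());
        boolean bothLastDay = d1.getDayOfMonth() == d1.lengthOfMonth()
                && d2.getDayOfMonth() == d2.lengthOfMonth();
        // Same day of month, or both month ends: a whole number of months.
        if (d1.getDayOfMonth() == d2.getDayOfMonth() || bothLastDay) {
            return monthDiff;
        }
        // Otherwise the day difference counts as a fraction of a 31-day month.
        double raw = monthDiff + (d1.getDayOfMonth() - d2.getDayOfMonth()) / 31.0;
        return BigDecimal.valueOf(raw).setScale(8, RoundingMode.HALF_UP).doubleValue();
    }

    public static void main(String[] args) {
        // -4 whole months plus 6/31 of a month.
        System.out.println(monthsBetween(LocalDate.of(2021, 6, 7), LocalDate.of(2021, 10, 1)));
        // Arguments swapped: the sign flips.
        System.out.println(monthsBetween(LocalDate.of(2021, 10, 1), LocalDate.of(2021, 6, 7)));
    }
}
```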

2. add_months returns the date n months after the given date

Examples:

> SELECT add_months('2021-07-06', 3);  //2021-10-06

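3. last_day(date) returns the last day of the month; next_day(start_date, day_of_week) returns the first date after start_date that falls on the given day of the week

Examples:

> SELECT last_day('2021-07-06');  //2021-07-31
> SELECT next_day('2021-07-06', 'TU');  //2021-07-13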
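Both functions have close counterparts in java.time, which makes the semantics easy to check locally. In this Java sketch (names are hypothetical, and it is not Spark's implementation), TemporalAdjusters.next returns the first matching weekday strictly after the start date, so a start date that is itself a Tuesday moves a full week ahead.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

// Illustrative only: java.time counterparts of last_day and next_day.
class MonthEdgeDemo {
    // Last day of the month that contains the given date.
    static LocalDate lastDay(LocalDate date) {
        return date.with(TemporalAdjusters.lastDayOfMonth());
    }

    // First date strictly after 'date' that falls on the given weekday.
    static LocalDate nextDay(LocalDate date, DayOfWeek dayOfWeek) {
        return date.with(TemporalAdjusters.next(dayOfWeek));
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2021, 7, 6); // a Tuesday
        System.out.println(lastDay(d));                    // 2021-07-31
        System.out.println(nextDay(d, DayOfWeek.TUESDAY)); // 2021-07-13
    }
}
```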

4. date_add (add days), date_sub (subtract days)

date_add(start_date, num_days) - Returns the date that is num_days after start_date.

Examples:

> SELECT date_add('2021-07-06', 1);  //2021-07-07
> SELECT date_sub('2021-07-06', 1);  //2021-07-05

5. datediff (number of days between two dates)

datediff(endDate, startDate) - Returns the number of days from startDate to endDate.

Examples:

> SELECT datediff('2021-07-06', '2021-07-10'); -4
> SELECT datediff('2021-07-10', '2021-07-06');  4
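The argument order of datediff is easy to get backwards: the end date comes first, and the result is negative when the first argument is the earlier date. A small Java sketch with java.time reproduces the day arithmetic of date_add, date_sub, and datediff; the class and method names are made up for illustration and only mirror the SQL functions.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Illustrative only: java.time counterparts of the day arithmetic above.
class DayArithmeticDemo {
    // Number of days from startDate to endDate (end first, as in datediff).
    static long dateDiff(LocalDate endDate, LocalDate startDate) {
        return ChronoUnit.DAYS.between(startDate, endDate);
    }

    static LocalDate dateAdd(LocalDate start, int days) {
        return start.plusDays(days);
    }

    static LocalDate dateSub(LocalDate start, int days) {
        return start.minusDays(days);
    }

    public static void main(String[] args) {
        LocalDate a = LocalDate.of(2021, 7, 6);
        LocalDate b = LocalDate.of(2021, 7, 10);
        System.out.println(dateDiff(a, b)); // -4
        System.out.println(dateDiff(b, a)); // 4
        System.out.println(dateAdd(a, 1));  // 2021-07-07
        System.out.println(dateSub(a, 1));  // 2021-07-05
    }
}
```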

6. UTC conversion: to_utc_timestamp

to_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.

The time zone must be a valid zone ID. China Standard Time is 'Asia/Shanghai'; 'Asia/Beijing' is not a valid ID.

Examples:

> SELECT to_utc_timestamp('2021-07-06', 'Asia/Seoul');  //2021-07-05 15:00:00
> SELECT to_utc_timestamp('2021-07-06', 'Asia/Shanghai');  //2021-07-05 16:00:00

7.from_utc_timestamp

from_utc_timestamp(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

Examples:

> SELECT from_utc_timestamp('2021-07-06', 'Asia/Seoul');  //2021-07-06 09:00:00
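Both conversions shift a wall-clock value by the zone's offset, in opposite directions. The Java sketch below shows the same shifts with java.time; it illustrates the semantics only, and the class and method names are hypothetical. It also shows why zone IDs deserve care: java.util.TimeZone silently falls back to GMT for an unknown ID such as "Asia/Beijing", which would leave a timestamp unchanged, whereas java.time.ZoneId rejects it with an exception.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.TimeZone;

// Illustrative only: wall-clock shifts equivalent to the two SQL functions.
class UtcShiftDemo {
    // Reads 'local' as wall-clock time in zoneId and returns the UTC wall-clock time.
    static LocalDateTime toUtc(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId))
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDateTime();
    }

    // Reads 'utc' as wall-clock time in UTC and returns the wall-clock time in zoneId.
    static LocalDateTime fromUtc(LocalDateTime utc, String zoneId) {
        return utc.atZone(ZoneOffset.UTC)
                .withZoneSameInstant(ZoneId.of(zoneId))
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime midnight = LocalDateTime.of(2021, 7, 6, 0, 0);
        System.out.println(toUtc(midnight, "Asia/Seoul"));    // 2021-07-05T15:00
        System.out.println(fromUtc(midnight, "Asia/Seoul"));  // 2021-07-06T09:00
        System.out.println(toUtc(midnight, "Asia/Shanghai")); // 2021-07-05T16:00
        // Unknown IDs fall back to GMT in the legacy API.
        System.out.println(TimeZone.getTimeZone("Asia/Beijing").getID()); // GMT
    }
}
```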

III. Worked examples:

3.1 Scala implementation:

import java.text.SimpleDateFormat
import java.util.{Calendar, TimeZone}

object Utils {
 
  /** Returns yesterday, the day before yesterday, and the same day last week
   * (7 days ago), each formatted as yyyyMMdd.
   */
  def get_related_date(): List[String] = {
    val format = new SimpleDateFormat("yyyyMMdd")
    val calendar = Calendar.getInstance()
    calendar.add(Calendar.DATE, -1) // yesterday
    val yesterday = format.format(calendar.getTime)
    calendar.add(Calendar.DATE, -1) // the day before yesterday
    val day_before_yesterday = format.format(calendar.getTime)
    calendar.add(Calendar.DATE, -5) // 7 days before today
    val last_week_date = format.format(calendar.getTime)
    List(yesterday, day_before_yesterday, last_week_date)
  }

  /** Returns the timestamp range of the input date in epoch milliseconds,
   * from 00:00:00 to 23:59:59 in the JVM default time zone.
   * @param date the date to analyse, formatted as yyyyMMdd
   */
  def get_timestamp_range(date: String):List[Long] = {
    val start = new SimpleDateFormat("yyyyMMdd").parse(date).getTime()
    val end = start + 23*60*60*1000 + 59*60*1000 + 59*1000
    List(start,end)
  }

  /** Returns the timestamp ranges for yesterday, the day before yesterday, and 7 days ago
   */
  def get_related_date_timestamp_range():List[List[Long]] = {
    val date = get_related_date()
    val range_list = List(get_timestamp_range(date(0)),get_timestamp_range(date(1)),get_timestamp_range(date(2)))
    range_list
  }

}
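get_timestamp_range relies on the JVM default time zone and ends the range at 23:59:59.000. As a variation under different assumptions, the Java sketch below takes the zone as an explicit parameter and ends the range at the last millisecond of the day. It is an illustrative alternative and not the original implementation; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Illustrative variation of the day-range helper with an explicit zone.
class DayRangeDemo {
    // Returns {startMillis, endMillis} for a yyyyMMdd date in the given zone,
    // where endMillis is the last millisecond before the next day starts.
    static long[] dayRangeMillis(String yyyymmdd, String zoneId) {
        ZoneId zone = ZoneId.of(zoneId);
        LocalDate day = LocalDate.parse(yyyymmdd, DateTimeFormatter.BASIC_ISO_DATE);
        long start = day.atStartOfDay(zone).toInstant().toEpochMilli();
        long end = day.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli() - 1;
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        long[] range = dayRangeMillis("20210706", "Asia/Shanghai");
        System.out.println(range[0]); // 1625500800000
        System.out.println(range[1]); // 1625587199999
    }
}
```

Computing the end from the start of the next day, instead of adding a fixed number of milliseconds, keeps the range correct on days that are not 24 hours long in zones with daylight saving time.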

3.2 Java implementation:

import java.util.Date;
import java.text.SimpleDateFormat;
import java.util.Calendar;

public class Time {
    public static void main(String[] args) {

        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); // set the date format
        System.out.println(df.format(new Date())); // new Date() holds the current system time

        Date now = new Date();
        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss"); // the format is easy to change
        String time = dateFormat.format(now);
        System.out.println(time);

        long startTs = System.currentTimeMillis(); // current timestamp in milliseconds
        System.out.println(startTs);

        Calendar c = Calendar.getInstance(); // each time field can be read or modified separately
        int year = c.get(Calendar.YEAR);
        int month = c.get(Calendar.MONTH) + 1;  // Calendar.MONTH is zero-based
        int date = c.get(Calendar.DATE);
        int hour = c.get(Calendar.HOUR_OF_DAY);
        int minute = c.get(Calendar.MINUTE);
        int second = c.get(Calendar.SECOND);
        System.out.println("年:" + year + "\n" + "月:" + month + "\n" + "日:" + date + "\n" + "时:" + hour + "\n" + "分:" + minute + "\n" + "秒:" + second);

    }
}