
I need to get the minimum value from a Spark DataFrame and transform it. Currently I just get this value and transform it using DateTime, but I need the result as a unix_timestamp. How can I convert a DateTime to a unix_timestamp, either with Scala functions or with Spark functions?

Here is my current code, which for now returns a DateTime:

val minHour = new DateTime(df.agg(min($"event_ts"))
    .as[Timestamp].collect().head)
  .minusDays(5)
  .withTimeAtStartOfDay()
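
(A minimal sketch of the Scala-side conversion, assuming the minHour Joda-Time DateTime from the snippet above; minHourUnix is just an illustrative name. getMillis returns epoch milliseconds, so dividing by 1000 yields the seconds value that unix_timestamp produces.)

// Illustrative only: convert the Joda DateTime above to unix seconds
val minHourUnix: Long = minHour.getMillis / 1000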

I tried using Spark functions as well, but I was not able to shift the timestamp to the start of the day (which can be achieved with DateTime's withTimeAtStartOfDay function):

val minHour = new DateTime(df.agg(min($"event_ts").alias("min_ts"))
  .select(unix_timestamp(date_sub($"min_ts", 5)))
  .as[Long].collect().head)

1 Answer

date_sub will cast your timestamp to a date, so the time will be automatically shifted to the start of day.

df.show
+-------------------+----------+
|           event_ts|event_hour|
+-------------------+----------+
|2017-05-01 00:22:01|1493598121|
|2017-05-01 00:22:08|1493598128|
|2017-05-01 00:22:01|1493598121|
|2017-05-01 00:22:06|1493598126|
+-------------------+----------+

df.agg(
    min($"event_ts").alias("min_ts")
).select(
    unix_timestamp(date_sub($"min_ts", 5)).alias("min_ts_unix")
).withColumn(
    "min_ts", $"min_ts_unix".cast("timestamp")
).show
+-----------+-------------------+
|min_ts_unix|             min_ts|
+-----------+-------------------+
| 1493164800|2017-04-26 00:00:00|
+-----------+-------------------+
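
(If the value is needed as a plain Scala Long rather than displayed in a DataFrame, the same pipeline can be collected; a minimal sketch, assuming the df from above with spark.implicits._ in scope, and minHourUnix as an illustrative name:)

import org.apache.spark.sql.functions.{min, unix_timestamp, date_sub}

// Collect the single aggregated value as a Scala Long holding unix seconds
val minHourUnix: Long = df
  .agg(min($"event_ts").alias("min_ts"))
  .select(unix_timestamp(date_sub($"min_ts", 5)).alias("min_ts_unix"))
  .as[Long]
  .collect()
  .head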