4

I am having problem on converting unix time to timestamp.

I have a Dataframe, one column is PosTime. I would like to convert it to Timestamp, but it is half working. can you help?

scala> adsb.printSchema()
root
 |-- Icao: string (nullable = true)
 |-- Alt: long (nullable = true)
 |-- Lat: double (nullable = true)
 |-- Long: double (nullable = true)
 |-- PosTime: long (nullable = true)
 |-- Spd: double (nullable = true)
 |-- Trak: double (nullable = true)
 |-- Type: string (nullable = true)
 |-- Op: string (nullable = true)
 |-- Cou: string (nullable = true)

scala> adsb.show(50)
+------+------+---------+----------+-------------+-----+-----+----+--------------------+--------------------+
|  Icao|   Alt|      Lat|      Long|      PosTime|  Spd| Trak|Type|                  Op|                 Cou|
+------+------+---------+----------+-------------+-----+-----+----+--------------------+--------------------+
|ABECE7|  4825|40.814442| -111.9776|1506875131778|197.0|356.0|B739|     Delta Air Lines|       United States|
|4787B0| 38000|     null|      null|         null| null| null|B738|           Norwegian|              Norway|
|D3B18A|  4222|     null|      null|         null| null| null|null|                null|Unknown or unassi...|
|3C3F78|118400|     null|      null|         null| null| null|null|                null|             Germany|
|AA1C45|   -75|40.695969|-74.166321|1506875131747|157.4| 25.6|null|                null|       United States|
 scala> val adsb1 = adsb.withColumn("PosTime", $"PosTime".cast(TimestampType))

scala> adsb_sort.show(100)
+------+-------+---------+---------+--------------------+-------+-------+----+----+--------------------+
|  Icao|    Alt|      Lat|     Long|             PosTime|    Spd|   Trak|Type|  Op|                 Cou|
+------+-------+---------+---------+--------------------+-------+-------+----+----+--------------------+
|FFFFFF|   null|     null|     null|                null|   null|   null|null|null|Unknown or unassi...|
|FFFFFF|1049093|      0.0|      0.0|49800-05-04 14:39...|28672.0| 1768.7|null|null|Unknown or unassi...|
|FFFFFF|  12458|      0.0|      0.0|49800-12-11 06:39...|    0.0| 2334.4|null|null|Unknown or unassi...|

1 Answer 1

13

Spark interprets Long as timestamp in seconds but it looks like data is in milliseconds:

scala> spark.sql("SELECT CAST(1506875131778 / 1000 AS timestamp)").show
+-------------------------------------------------------------------------+
|CAST((CAST(1506875131778 AS DOUBLE) / CAST(1000 AS DOUBLE)) AS TIMESTAMP)|
+-------------------------------------------------------------------------+
|                                                     2017-10-01 18:25:...|
+-------------------------------------------------------------------------+

If I am right, just divide by 1000:

adsb.withColumn("PosTime", ($"PosTime" / 1000).cast(TimestampType))
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.