After the transformation, the processed_time column of your DataFrame is of type TimestampType, so its values are java.sql.Timestamp instances.
The trailing ".0" that you see is the fractional-seconds part (java.sql.Timestamp has nanosecond precision, and here the nanoseconds are zero). It only shows up because, when you call your_df.show(), the toString method of java.sql.Timestamp is called.
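To see this outside of Spark, here is a minimal sketch using only the JDK. The object name and the sample values are made up for illustration; the point is only how java.sql.Timestamp.toString renders the fractional seconds.

```scala
import java.sql.Timestamp

object TrailingZeroDemo {
  // Sample value chosen for illustration; it has no fractional seconds
  val wholeSecond: Timestamp = Timestamp.valueOf("2018-01-15 10:15:30")

  // toString always prints at least one fractional digit
  val asString: String = wholeSecond.toString // "2018-01-15 10:15:30.0"

  // The nanosecond field behind that ".0"
  val nanos: Int = wholeSecond.getNanos // 0

  // With a non-zero fraction, the digits are printed instead of ".0"
  val withMillis: String = Timestamp.valueOf("2018-01-15 10:15:30.123").toString

  def main(args: Array[String]): Unit = {
    println(asString)
    println(nanos)
    println(withMillis) // "2018-01-15 10:15:30.123"
  }
}
```

So the ".0" is purely a display detail of toString, not something stored differently in the column.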
If you only want the result formatted (as a String rather than a timestamp), you can add .cast(StringType) when modifying your processed_time column:
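import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

df.withColumn(
  "processed_time",
  to_utc_timestamp(
    unix_timestamp(col("processed_time")).cast(TimestampType),
    "UTC"
  ).cast(StringType) // the column is now a String, no longer a timestamp
)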
You can also use date_format, as suggested in the comments:
df.withColumn(
  "processed_time",
  date_format(
    to_utc_timestamp(
      unix_timestamp(col("processed_time")).cast(TimestampType),
      "UTC"
    ),
    "yyyy-MM-dd HH:mm:ss"
  )
)
If you really need a TimestampType, keep the column as it is, ignore the trailing zero during your transformations, and use a SimpleDateFormat only when you display the value:
import java.text.SimpleDateFormat

// Take the timestamp of the first row out of the DataFrame
val firstTimestampFromDf: java.sql.Timestamp = df
  .select("processed_time")
  .head
  .getTimestamp(0)

// Format it for display, without the fractional seconds
val simpleDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
val firstTimestampFromDfFormatted = simpleDateFormat.format(firstTimestampFromDf)
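Note that SimpleDateFormat is not thread-safe, so if the formatting happens in shared code you may prefer java.time. Below is a minimal sketch of that alternative, assuming Java 8 or later. The object and the helper formatForDisplay are hypothetical names, and the sample timestamp stands in for a value read from the DataFrame.

```scala
import java.sql.Timestamp
import java.time.format.DateTimeFormatter

object TimestampDisplayDemo {
  // DateTimeFormatter is immutable and thread-safe, so one instance can be shared
  private val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Hypothetical helper: formats a Timestamp without the fractional seconds
  def formatForDisplay(ts: Timestamp): String =
    ts.toLocalDateTime.format(formatter)

  def main(args: Array[String]): Unit = {
    // Stand-in for a value obtained with getTimestamp(0) on a Row
    val sample = Timestamp.valueOf("2018-01-15 10:15:30")
    println(formatForDisplay(sample)) // prints "2018-01-15 10:15:30"
  }
}
```

Like the SimpleDateFormat version, this leaves the DataFrame column as a TimestampType and only changes how a value is rendered.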