I'm working with Spark, but I've found that using ORDER BY in Spark SQL to sort a DataFrame is very slow. How can I sort a DataFrame without Spark SQL?
1 Answer
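I'm not sure I've fully understood what you need.
Anyway, if you want to sort a DF you can use its own sort (or orderBy) method, or drop down to the underlying RDD with .rdd and use sortBy (or sortByKey in the case of (K,V) pairs). Note that sortBy and sortByKey are defined on RDDs, not on DataFrames.
For example, assuming we have a DF (in this case coming from Spark SQL), we can sort it like this:
val sqlResult = sqlContext.sql("select first_column, second_column from logs") // sql() already returns a DataFrame
// Sort by the first column on the underlying RDD[Row]; getString assumes first_column is a string
val result = sqlResult.rdd.sortBy(row => row.getString(0))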
As said before, you can sort any DF this way; I just wanted to show one way to fetch the data with Spark SQL and then sort it with Spark core functionality.
Hope this helps!
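To make the two routes concrete, here is a minimal sketch rather than a definitive implementation. It assumes Spark 2.x or later with a SparkSession (the snippet above uses the older sqlContext), a made-up in-memory table standing in for logs, and a string-typed first_column. The object and method names (SortSketch, sampleLogs, sortWithDataFrameApi, sortWithRdd) are hypothetical. Since sortBy is an RDD method, the second route goes through .rdd first.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.col

object SortSketch {
  // Hypothetical stand-in for the "logs" table; the rows are invented for illustration.
  def sampleLogs(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("b", 2), ("c", 3), ("a", 1)).toDF("first_column", "second_column")
  }

  // Route 1: the DataFrame API, with no SQL string involved.
  def sortWithDataFrameApi(df: DataFrame): DataFrame =
    df.sort(col("first_column").asc)

  // Route 2: Spark core. Convert to RDD[Row] and sort with sortBy.
  // Assumes column 0 is a non-null string.
  def sortWithRdd(df: DataFrame): RDD[Row] =
    df.rdd.sortBy(row => row.getString(0))

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, just to try the sketch out.
    val spark = SparkSession.builder()
      .appName("sort-sketch")
      .master("local[*]")
      .getOrCreate()

    val logs = sampleLogs(spark)
    sortWithDataFrameApi(logs).show()
    sortWithRdd(logs).collect().foreach(println)

    spark.stop()
  }
}
```

Keep in mind that the DataFrame sort and a SQL ORDER BY generally go through the same query optimizer, so route 1 may not be any faster than the SQL version. Route 2 leaves the optimizer behind entirely, so it is worth measuring on your own data before choosing it.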
FF
1 Comment
Fabio Fantoni
If I helped you, could you please rate and accept the answer? Have a good day!