I'm working with Spark now, but I've found that using ORDER BY in Spark SQL to sort a DataFrame is very slow. How can I sort a DataFrame without Spark SQL?

1 Answer

I'm not sure I've fully understood what you need.

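Anyway, if you want to sort a DF without writing SQL, you can drop down to its underlying RDD (via .rdd) and use sortBy (or sortByKey in the case of (K,V) pairs). Note that sortBy is an RDD method, not a DataFrame method; the DataFrame API has its own sort and orderBy methods.

For example, assuming we have a DF (in this case coming from Spark SQL), we can sort it like this:

val sqlResult = sqlContext.sql("select first_column, second_column from logs") // sql() already returns a DataFrame
val result = sqlResult.rdd.sortBy(row => row.getString(0)) // sort by the first column; assumes first_column is a string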

As said before, you can sort any DF this way; I just wanted to show another way to access data with Spark SQL and then sort it with Spark core functionality.
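
To make the two options concrete, here is a minimal sketch that sorts the same DF once through the DataFrame API (sort) and once through the underlying RDD (sortBy). It assumes Spark 2.x or later with a SparkSession rather than the sqlContext used above. The object name SortSketch, the helper names, and the in-memory sample rows standing in for the logs table are all made up for illustration, and first_column is assumed to be a string. It says nothing about which option is faster on your data.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object; the names are illustrative only.
object SortSketch {

  // Sort with the DataFrame API, no SQL string involved.
  def sortWithDataFrameApi(df: DataFrame): DataFrame =
    df.sort(col("first_column").asc)

  // Sort with Spark core: go to the underlying RDD[Row] and use sortBy.
  // Assumes the first column is a string, so getString(0) is safe.
  def sortWithRdd(df: DataFrame): RDD[Row] =
    df.rdd.sortBy(row => row.getString(0))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sort-sketch")
      .master("local[*]") // local mode, for trying it out only
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for the "logs" table.
    val logs = Seq(("b", 2), ("c", 3), ("a", 1))
      .toDF("first_column", "second_column")

    sortWithDataFrameApi(logs).show()
    sortWithRdd(logs).collect().foreach(println)

    spark.stop()
  }
}
```

If you stay in the DataFrame API, the result is still a DataFrame, so you can keep chaining DataFrame operations on it. The RDD route gives you an RDD[Row] back, which you would have to convert again if you need a DF afterwards.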

Hope this helps!

FF

1 Comment

If I helped you, could you please rate and accept the answer? Have a good day!
