I am running a Spark application on the driver.

It is as simple as follows:

var count = 0L
val test_dataframe = // external frame
count = test.count()
println("The count of frame is " + count)

My question is whether the println line is always executed after the count of the frame has been computed. Is it possible for the driver to run the println command first, before evaluating the dataframe and its count?

  • How will you print the count without performing the count operation on the dataframe? Commented Sep 11, 2017 at 13:49
  • The count is initially declared as zero. Commented Sep 12, 2017 at 3:38
  • I saw that. However, printing just zero didn't make any sense to me. What if you put the println statement before the count=test.count() line? Commented Sep 12, 2017 at 3:40
  • My question is: is it possible for println to run without the dataframe being processed, and so print 0? Commented Sep 12, 2017 at 5:18
  • I assume that instead of test.count you meant test_dataframe.count, as your declared dataframe name is test_dataframe and not test. Commented Sep 12, 2017 at 20:08

1 Answer

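No, it is not possible for the driver to execute println before test.count() in the code above. count is an action (a terminal operation), and a call to an action forces Spark to perform the computation and blocks the driver thread until the result is available, so the next statement runs only after the count has been returned.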
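
To make the ordering concrete, here is a minimal self-contained sketch. It assumes a local SparkSession and uses spark.range as a hypothetical stand-in for the external frame from the question; the object and method names are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object BlockingCountSketch {
  // Counts a small frame and prints the result, in the same order as the question's code.
  def countRows(spark: SparkSession): Long = {
    // Hypothetical stand-in for the external frame in the question.
    val testDataframe = spark.range(0, 1000).toDF("id")

    var count = 0L
    // count() is a blocking action: this call does not return until the job has finished.
    count = testDataframe.count()
    // Reached only after count() has returned, so it never prints the initial 0.
    println("The count of frame is " + count)
    count
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("blocking-count-sketch")
      .getOrCreate()
    try countRows(spark) finally spark.stop()
  }
}
```

The statements run one after another on the driver thread, exactly like ordinary Scala code; only the work inside count() is distributed to the executors.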

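If you want an asynchronous count, here is a code snippet that works:

// countAsync submits the job and returns a FutureAction immediately
val future = test.rdd.countAsync
println("The count before future evaluation: " + count)
// get blocks until the job has finished
count = future.get
println("The count after future evaluation: " + count)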

Note that the countAsync action is not available directly on a dataframe. It has to be called on the underlying RDD (test.rdd).
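
For completeness, here is a self-contained variant of the asynchronous count as a hedged sketch. It assumes a local SparkSession and a hypothetical frame built with spark.range, and it waits with scala.concurrent.Await and a timeout instead of an unbounded get; the object and method names are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import scala.concurrent.Await
import scala.concurrent.duration._

object AsyncCountSketch {
  // Returns the value of count before and after the asynchronous job completes.
  def countBeforeAndAfter(spark: SparkSession): (Long, Long) = {
    // Hypothetical stand-in for the external frame.
    val test = spark.range(0, 1000).toDF("id")

    var count = 0L
    // countAsync lives on the RDD, not the dataframe; it returns a FutureAction[Long] immediately.
    val future = test.rdd.countAsync()
    val before = count
    println("The count before future evaluation: " + before)

    // FutureAction is a scala.concurrent.Future, so Await.result can block with a timeout.
    count = Await.result(future, 5.minutes)
    println("The count after future evaluation: " + count)
    (before, count)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("async-count-sketch")
      .getOrCreate()
    try countBeforeAndAfter(spark) finally spark.stop()
  }
}
```

Here the first println can run while the job is still in progress, which is the behaviour the question was worried about; it happens only because the asynchronous action was requested explicitly.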

3 Comments

I don't think so; "test.count" would be an asynchronous operation by nature!
count is a synchronous action by default. Spark supports several async actions, but if you want async behaviour you need to request it explicitly in your code. Check this class: spark.apache.org/docs/2.2.0/api/java/index.html?org/apache/…
@Luckylukee does the above explanation clear your doubt? I have also updated the answer with a code snippet for an async count.
