
I start the Spark master using ./sbin/start-master.sh, as described at http://spark.apache.org/docs/latest/spark-standalone.html

I then submit the Spark job :

sh ./bin/spark-submit \
  --class simplespark.Driver \
  --master spark://localhost:7077 \
  C:\\Users\\Adrian\\workspace\\simplespark\\target\\simplespark-0.0.1-SNAPSHOT.jar

How can I run a simple app that demonstrates a task running in parallel?

When I view http://localhost:4040/executors/ and http://localhost:8080/ there are no tasks running:

(screenshots omitted)

The .jar I'm running (simplespark-0.0.1-SNAPSHOT.jar) contains just a single Scala object:

package simplespark

import org.apache.spark.SparkContext

object Driver {

  def main(args: Array[String]) {

    val conf = new org.apache.spark.SparkConf()
      .setMaster("local")
      .setAppName("knn")
      .setSparkHome("C:\\spark-1.1.0-bin-hadoop2.4\\spark-1.1.0-bin-hadoop2.4")
      .set("spark.executor.memory", "2g");

    val sc = new SparkContext(conf);
    val l = List(1)

    sc.parallelize(l)

    while(true){}

  }
}

Update: When I change --master spark://localhost:7077 to --master spark://Adrian-PC:7077,

I can see an update on the Spark UI:

(screenshot omitted)

I have also updated Driver.scala to use the default context, as I'm not sure I was setting it correctly for submitting Spark jobs:

package simplespark

import org.apache.spark.SparkContext

object Driver {

  def main(args: Array[String]) {

    System.setProperty("spark.executor.memory", "2g")

    val sc = new SparkContext();
    val l = List(1)

    val c = sc.parallelize(List(2, 3, 5, 7)).count()
    println(c)

    sc.stop

  }
}

On the Spark console I repeatedly receive the same message:

14/12/26 20:08:32 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

So it appears that the Spark job is not reaching the master?

Update 2: After I start the worker (thanks to Lomig Mégard's comment below) using:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://Adrian-PC:7077 

I receive this error:

14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/8 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor added: app-20141227212351-0003/9 on worker-20141227211411-Adrian-PC-58199 (Adrian-PC:58199) with 4 cores
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Granted executor ID app-20141227212351-0003/9 on hostPort Adrian-PC:58199 with 4 cores, 2.0 GB RAM
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now RUNNING
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now FAILED (java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified)
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/9 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our application: FAILED
14/12/27 21:23:52 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: Master removed our application: FAILED
14/12/27 21:23:52 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at Driver.scala:14)
14/12/27 21:23:52 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

I'm running the scripts on Windows using Cygwin. To fix this error I copied the Spark installation to the Cygwin C:\ drive, but then I receive a new error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

1 Answer

You have to start the actual computation to see the job.

val c = sc.parallelize(List(2, 3, 5, 7)).count()
println(c)

Here, count is called an action; you need at least one action to trigger a job. You can find the list of available actions in the Spark docs.

The other methods are called transformations. They are executed lazily.
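This lazy/eager split can be sketched with plain Scala collection views rather than Spark itself (an analogy, not the Spark API): a view's map runs nothing until the result is forced, just as an RDD transformation runs nothing until an action is called.

```scala
// Plain-Scala analogy: view.map is lazy (like an RDD transformation);
// forcing the view with sum is eager (like an action such as count()).
object LazyDemo {
  def main(args: Array[String]): Unit = {
    var evaluated = 0
    val lazyMapped = List(2, 3, 5, 7).view.map { x => evaluated += 1; x * 2 }

    // Nothing has been computed yet -- analogous to sc.parallelize(l) alone.
    println(evaluated)         // 0

    // Forcing the view runs the mapping over all four elements.
    val total = lazyMapped.sum
    println(evaluated)         // 4
    println(total)             // 34
  }
}
```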

Don't forget to stop the context at the end, instead of your infinite loop, with sc.stop().

Edit: For the updated question: you allocate more memory to the executor (2g) than is available in the worker. The defaults should be fine for simple tests.

You also need to have a running worker linked to your master. See this doc to start it.

./sbin/start-master.sh
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT
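Putting the pieces together, a minimal standalone run might look like the following sketch. The hostname Adrian-PC and the jar path are taken from the question; substitute your own, and note that --executor-memory is only shown here to keep the request below what the worker offers.

```shell
# Start the master; its URL is shown in the log and on http://localhost:8080/
./sbin/start-master.sh

# In a separate terminal, start a worker and register it with the master
# (use the real host name, not localhost, as noted above)
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://Adrian-PC:7077

# Submit the job against the same master URL; keep --executor-memory at or
# below what the worker reports as available
./bin/spark-submit \
  --class simplespark.Driver \
  --master spark://Adrian-PC:7077 \
  --executor-memory 512m \
  C:\\Users\\Adrian\\workspace\\simplespark\\target\\simplespark-0.0.1-SNAPSHOT.jar
```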

6 Comments

Thanks, but I don't think your answer solves my issue; please see the question update. I'm using your suggestion sc.stop
Are you sure you have a worker with enough memory? Try removing the spark.executor.memory configuration. With this simple example the default is enough. And indeed, localhost cannot be used as a master. You need to specify the real host name.
Removing spark.executor.memory does not have any effect. I think this issue may be due to the fact that I'm running locally on Windows and executing the submit scripts using Cygwin. I will update when I test in a proper UNIX environment.
Are you sure you have a worker running, by the way? See my updated answer.
I did not have a worker running; please see the question update. I think this issue may be related to the Cygwin installation.