I start spark master using : ./sbin/start-master.sh as described at : http://spark.apache.org/docs/latest/spark-standalone.html
I then submit the Spark job :
sh ./bin/spark-submit \
--class simplespark.Driver \
--master spark://`localhost`:7077 \
C:\\Users\\Adrian\\workspace\\simplespark\\target\\simplespark-0.0.1-SNAPSHOT.jar
How can run a simple app which demonstrates a parallel task running ?
When I view http://localhost:4040/executors/ & http://localhost:8080/ there are no
tasks running :


The .jar I'm running (simplespark-0.0.1-SNAPSHOT.jar) just contains a single Scala object :
package simplespark
import org.apache.spark.SparkContext
object Driver {
def main(args: Array[String]) {
val conf = new org.apache.spark.SparkConf()
.setMaster("local")
.setAppName("knn")
.setSparkHome("C:\\spark-1.1.0-bin-hadoop2.4\\spark-1.1.0-bin-hadoop2.4")
.set("spark.executor.memory", "2g");
val sc = new SparkContext(conf);
val l = List(1)
sc.parallelize(l)
while(true){}
}
}
Update : When I change --master spark://localhost:7077 \ to --master spark://Adrian-PC:7077 \
I can see update on the Spark UI :

I have also updated Driver.scala to read default context, as I'm not sure if I set it correctly for submitting Spark jobs :
package simplespark
import org.apache.spark.SparkContext
object Driver {
def main(args: Array[String]) {
System.setProperty("spark.executor.memory", "2g")
val sc = new SparkContext();
val l = List(1)
val c = sc.parallelize(List(2, 3, 5, 7)).count()
println(c)
sc.stop
}
}
On Spark console I receive multiple same all same messages :
14/12/26 20:08:32 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
So it appears that the Spark job is not reaching the master ?
Update2 : After I start (thanks to Lomig Mégard comment below) the worker using :
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://Adrian-PC:7077
I receive error :
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/8 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor added: app-20141227212351-0003/9 on worker-20141227211411-Adrian-PC-58199 (Adrian-PC:58199) with 4 cores
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Granted executor ID app-20141227212351-0003/9 on hostPort Adrian-PC:58199 with 4 cores, 2.0 GB RAM
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now RUNNING
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now FAILED (java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified)
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/9 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our application: FAILED
14/12/27 21:23:52 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: Master removed our application: FAILED
14/12/27 21:23:52 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at Driver.scala:14)
14/12/27 21:23:52 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
I'm running the scripts on Windows using Cygwin. To fix this error I copy the Spark installation to cygwin C:\ drive. But then I receive a new error :
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0