
I'm trying to make an Apache Spark job in Scala. I'm a novice in Scala and previously used PySpark. I get an error when the job starts. Code:

import org.apache.spark.sql.SparkSession

object SparkRMSP_full {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("parse_full_rmsp_job")
      .getOrCreate()

    val raw_data_df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "10.1.24.111:9092")
      .option("subscribe", "dev.etl.fns.rmsp.raw-data")
      .load()

    println(raw_data_df.isStreaming)
    raw_data_df.printSchema
  }
}

spark-submit command:

spark-submit --packages org.apache.spark:spark-streaming-kafka-0-10-assembly_2.11:2.1.0 --master local --num-executors 2 --executor-memory 2g --driver-memory 1g --executor-cores 2 "C:\tools\jar\streaming_spark.jar"

And I get this error:

20/07/15 15:05:32 WARN SparkSubmit$$anon$2: Failed to load SparkRMSP_full.
java.lang.ClassNotFoundException: SparkRMSP_full

How should I declare the class correctly?

UPD:

build.sbt:

name := "streaming_spark"

version := "0.1"

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10-assembly" % "2.3.1"

Project structure is on pastebin.

  • How did you create the jar file, with Maven or sbt? Can you share it? Also add --class SparkRMSP_full. Commented Jul 16, 2020 at 8:52
  • I build the jar file with sbt in IDEA. I added --class to spark-submit, but I get the same error. Commented Jul 16, 2020 at 10:18
  • Can you add the sbt content and the folder structure? Commented Jul 16, 2020 at 10:47
  • Done. The directory structure is on pastebin. Commented Jul 16, 2020 at 11:06
  • Can you paste the path of the SparkRMSP_full object? Commented Jul 16, 2020 at 11:11

1 Answer


Change your spark-submit command as below and try again.

spark-submit \
  --packages org.apache.spark:spark-streaming-kafka-0-10-assembly_2.11:2.1.0 \
  --master local \
  --num-executors 2 \
  --executor-memory 2g \
  --driver-memory 1g \
  --executor-cores 2 \
  --class SparkRMSP_full \
  "C:\tools\jar\streaming_spark.jar"

(If SparkRMSP_full is declared inside a package, you must pass the fully qualified name, e.g. --class mypackage.SparkRMSP_full.)
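If it still fails, you can inspect what actually ended up inside the jar; the path below is the one from the question, and findstr assumes a Windows shell (use grep on Linux/macOS):

```shell
# List the jar's entries and look for the compiled object.
# A Scala object SparkRMSP_full compiles to SparkRMSP_full.class and
# SparkRMSP_full$.class; any directory prefix on those entries is the
# package name you must include in --class.
jar tf "C:\tools\jar\streaming_spark.jar" | findstr SparkRMSP_full
```

If nothing is printed, the class was never packaged into the jar, which also produces a ClassNotFoundException.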

5 Comments

Which build tool are you using, Maven or sbt? Can you post that in the question?
What should I post? I'm a novice in Java-like coding. I run "Build" in IDEA to make the jar.
The pom.xml or build.sbt file? And how are you building the jar file?
I build the jar file with sbt in IDEA. The build.sbt is in the question.
Try building a fat jar file using sbt, and also include the main class.
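A minimal build.sbt sketch for a fat jar with the sbt-assembly plugin (the plugin version, the "provided" scoping, and the spark-sql-kafka-0-10 dependency are suggestions, not taken from the question; this assumes SparkRMSP_full has no package):

```scala
// project/plugins.sbt (illustrative plugin version):
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

name := "streaming_spark"
version := "0.1"
scalaVersion := "2.11.12"

// Record the entry point in the jar manifest so spark-submit
// can locate it even without --class.
mainClass in assembly := Some("SparkRMSP_full")

// Spark itself is supplied by spark-submit at runtime, so mark it
// "provided" to keep the fat jar small.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1" % "provided"

// Note: spark.readStream.format("kafka") (Structured Streaming) uses the
// spark-sql-kafka-0-10 source, not the spark-streaming-kafka-0-10 DStream
// package from the question.
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.1"
```

Then run `sbt assembly` and submit the jar produced under target/scala-2.11/.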
