
I am very new to Scala, Spark, and MongoDB. I am trying to load some data into MongoDB with Spark using the following code.

import com.mongodb.spark.config.WriteConfig
import com.mongodb.spark.toDocumentRDDFunctions
import org.apache.spark.sql.SparkSession
import org.bson.Document

object MongoTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName(this.getClass.getSimpleName)
      .getOrCreate()

    // Reuse the SparkSession's context rather than creating a second
    // SparkContext (which needed the spark.driver.allowMultipleContexts workaround).
    val sc = spark.sparkContext

    // Build ten documents {test: 1} .. {test: 10} and write them to the
    // testMongo collection of sampledb.
    val documents = sc.parallelize((1 to 10).map(i => Document.parse(s"{test: $i}")))
    documents.saveToMongoDB(WriteConfig(Map("spark.mongodb.output.uri" -> "mongodb://127.0.0.1:27017/sampledb.testMongo")))
  }
}

My spark-submit fails with the following error:

 java.lang.NoSuchMethodError: com.mongodb.Mongo.<init>(Lcom/mongodb/MongoClientURI;)V
        at com.mongodb.MongoClient.<init>(MongoClient.java:328)
        at com.mongodb.spark.connection.DefaultMongoClientFactory.create(DefaultMongoClientFactory.scala:43)
        at com.mongodb.spark.connection.MongoClientCache.acquire(MongoClientCache.scala:55)
        at com.mongodb.spark.MongoConnector.acquireClient(MongoConnector.scala:239)
        at com.mongodb.spark.MongoConnector.withMongoClientDo(MongoConnector.scala:152)
        at com.mongodb.spark.MongoConnector.withDatabaseDo(MongoConnector.scala:171)
        at com.mongodb.spark.MongoConnector.withCollectionDo(MongoConnector.scala:184)
        at com.mongodb.spark.MongoSpark$$anonfun$save$1.apply(MongoSpark.scala:116)
        at com.mongodb.spark.MongoSpark$$anonfun$save$1.apply(MongoSpark.scala:115)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:403)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1405)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:409)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

I use Spark version 2.4.0 and Scala version 2.11.12. Any idea where I am going wrong?
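
For reference, a build.sbt consistent with these versions would look like the sketch below (the mongo-spark-connector version is an assumption; any 2.4.x release targets Spark 2.4 / Scala 2.11):

// build.sbt (sketch; versions are illustrative)
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.4.0" % "provided",
  // The 2.4.x connector pulls in a compatible mongo-java-driver transitively.
  // Declaring an older mongo-java-driver alongside it is a common cause of
  // NoSuchMethodError inside com.mongodb classes.
  "org.mongodb.spark" %% "mongo-spark-connector" % "2.4.1"
)

A NoSuchMethodError on com.mongodb.Mongo.<init> typically means a second, older Java driver ended up on the spark-submit classpath and shadowed the one the connector was compiled against.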

  • What MongoDB and Mongo driver versions are you using? Did you pass the Mongo driver to spark-shell or spark-submit? Commented Sep 23, 2020 at 17:01
  • I use MongoDB server version 4.2.5. The Mongo driver is passed to spark-submit. Commented Sep 23, 2020 at 17:06
  • stackoverflow.com/questions/48782145/… Commented Sep 23, 2020 at 17:24
  • @chris you are probably not using the right version of the driver for Spark 2.4.5 and MongoDB 4.2.5. Specify your driver version, since I couldn't find the driver for Spark 2.4.5 here (a quick classpath check is sketched below). Commented Sep 23, 2020 at 17:28
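
Following up on the version-mismatch suggestion above, a quick way to check which driver jar is actually on the classpath is a one-line reflection check, e.g. run from spark-shell:

// Prints the location of the jar that provided com.mongodb.MongoClient,
// making an unexpected older mongo-java-driver on the classpath visible.
println(classOf[com.mongodb.MongoClient].getProtectionDomain.getCodeSource.getLocation)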
