
I am trying to save a PySpark DataFrame to MongoDB from a Google Cloud Dataproc cluster, but the write keeps failing with an error. I'm using Spark 2.4.7, Python 3.7, and MongoDB Spark Connector 2.4.3. Here is my code:

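from pyspark.sql import SparkSession

# spark.jars.packages takes Maven coordinates in groupId:artifactId:version form,
# so the version is separated by a colon, not a hyphen.
# A mongodb+srv:// URI must not include a port, so ":27017" is dropped.
spark = SparkSession.builder\
                    .master("yarn")\
                    .appName("demo")\
                    .config("spark.mongodb.input.uri",
                            "mongodb+srv://my_host/people_db") \
                    .config("spark.mongodb.output.uri",
                            "mongodb+srv://my_host/people_db") \
                    .config('spark.jars.packages',
                            'org.mongodb.spark:mongo-spark-connector_2.12:2.4.3')\
                    .getOrCreate()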
df = spark.read\
          .format('csv')\
          .options(header=True)\
          .load(csv_path)

# ----------Some data processing -----------

# This is the block of code that raises the error
df.write\
  .format("com.mongodb.spark.sql.DefaultSource")\
  .mode("append")\
  .option("collection", "people")\
  .save()

Here is the error message:

[screenshot of the error message]

  • The error says that the class ConnectionString cannot be found on your classpath. I don't believe Dataproc manages MongoDB-related dependencies, so a conflict is unlikely. Does the same Spark application run fine on a non-Dataproc cluster? What if you also add the mongo-java-driver artifact from search.maven.org/remotecontent?filepath=org/mongodb/spark/… to your Spark packages list? Commented Jun 23, 2021 at 1:20
  • Thank you so much @cyxxy for your help. I added the Mongo Java driver jar to the Spark packages list and it works very well. Commented Jun 24, 2021 at 8:34
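
For readers who want to see what "adding the driver to the packages list" might look like, here is a minimal sketch. The connector coordinate follows the versions named in the question. The driver version (3.12.8) is an assumption and should be checked against what the connector release actually depends on. The helper names maven_coordinate, packages_conf, and build_session are hypothetical and exist only for illustration.

```python
def maven_coordinate(group_id, artifact_id, version):
    """Return a coordinate in the groupId:artifactId:version form Spark expects."""
    return f"{group_id}:{artifact_id}:{version}"


def packages_conf(*coordinates):
    """Join coordinates with commas, the separator used by spark.jars.packages."""
    return ",".join(coordinates)


# Connector coordinate taken from the question (Scala 2.12 build, version 2.4.3).
CONNECTOR = maven_coordinate(
    "org.mongodb.spark", "mongo-spark-connector_2.12", "2.4.3"
)

# ASSUMPTION: the driver version is illustrative; pick the one your connector needs.
DRIVER = maven_coordinate("org.mongodb", "mongo-java-driver", "3.12.8")

SPARK_JARS_PACKAGES = packages_conf(CONNECTOR, DRIVER)


def build_session(app_name="demo"):
    """Create a SparkSession that asks Spark to fetch both Mongo artifacts."""
    # Imported lazily so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.jars.packages", SPARK_JARS_PACKAGES)
        .getOrCreate()
    )
```

Printing SPARK_JARS_PACKAGES before creating the session is a quick way to confirm that each entry has exactly two colons. If a session is already running when this code executes, the setting may have no effect, in which case the same value can be supplied at submit time instead.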

1 Answer


The Mongo Java driver jar is not on the classpath. Both Mongo jars (the Spark connector and the Java driver) must be present in the spark/jars path. With the versions below I was able to run the job locally and also as a Dataproc job, following the link below.

Mongo connector: 2.12_3.0.1
Mongo Java driver: 3.12
Spark: 3.0.2

Mongo dependencies required
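
One way to confirm that both jars really are in the jars directory is a small check like the sketch below. It is only an illustration: the function name missing_jars is hypothetical, and the file-name prefixes are assumptions based on the artifact names mentioned in this thread, so adjust them if your jars are named differently.

```python
import os

# ASSUMPTION: jar file names start with these artifact names.
REQUIRED_JAR_PREFIXES = ("mongo-spark-connector", "mongo-java-driver")


def missing_jars(jar_dir, required_prefixes=REQUIRED_JAR_PREFIXES):
    """Return the required prefixes that no .jar file in jar_dir starts with."""
    jar_names = [name for name in os.listdir(jar_dir) if name.endswith(".jar")]
    return [
        prefix
        for prefix in required_prefixes
        if not any(name.startswith(prefix) for name in jar_names)
    ]
```

For example, missing_jars(os.path.join(os.environ["SPARK_HOME"], "jars")) returns an empty list when both jars are present, assuming SPARK_HOME points at the Spark installation on that machine.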


2 Comments

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
