
I am trying to insert a Spark SQL DataFrame into a remote MongoDB collection. I previously wrote a Java program using MongoClient to check whether the remote collection is accessible, and that worked successfully.

My current Spark code is as follows:

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
warning: there was one deprecation warning; re-run with -deprecation for details
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@1a8b22b5
scala> val depts = sqlContext.sql("select * from test.user_details")
depts: org.apache.spark.sql.DataFrame = [user_id: string, profile_name: string ... 7 more fields]
scala> depts.write.options(scala.collection.Map("uri" -> "mongodb://<username>:<pwd>@<hostname>:27017/<dbname>.<collection>")).mode(SaveMode.Overwrite).format("com.mongodb.spark.sql").save()

This gives the following error:

java.lang.AbstractMethodError: com.mongodb.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;
  at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:429)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
  ... 84 elided

I also tried the following, which throws the error below:

scala> depts.write.options(scala.collection.Map("uri" -> "mongodb://<username>:<pwd>@<host>:27017/<database>.<collection>")).mode(SaveMode.Overwrite).save()
java.lang.IllegalArgumentException: 'path' is not specified
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$17.apply(DataSource.scala:438)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$17.apply(DataSource.scala:438)
  at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
  at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.getOrElse(ddl.scala:117)
  at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:437)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
  ... 58 elided

I have imported the following packages:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import com.mongodb.casbah.{WriteConcern => MongodbWriteConcern}
import com.mongodb.spark.config._
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql._

depts.show() works as expected, i.e. the DataFrame is created successfully.

Can someone please provide any advice or suggestions on this? Thanks.

2 Answers


Assuming that you are using MongoDB Spark Connector v1.0, you can save a DataFrame produced with Spark SQL as follows:

// DataFrames SQL example
import com.mongodb.spark.MongoSpark
df.registerTempTable("temporary")
val depts = sqlContext.sql("select * from temporary")
depts.show()
// Save out the filtered DataFrame result
MongoSpark.save(depts.write.option("uri", "mongodb://hostname:27017/database.collection").mode("overwrite"))
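
If you prefer to build the output configuration explicitly, MongoSpark.save can also take the DataFrame together with a WriteConfig. This is only a minimal sketch, not part of the original answer: the host, database, and collection in the uri are placeholders, and it assumes a connector version whose WriteConfig can read the target database and collection from the uri option:

import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.WriteConfig

// Build the output configuration from explicit options.
// The uri is a placeholder and names the target database and collection.
val writeConfig = WriteConfig(Map(
  "uri" -> "mongodb://hostname:27017/database.collection",
  "writeConcern.w" -> "majority"))

// Save the DataFrame through the connector with that configuration.
MongoSpark.save(depts, writeConfig)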

For more information see MongoDB Spark Connector: Spark SQL

For a simple demo of MongoDB and Spark using Docker, see MongoDB Spark Docker: examples.scala - dataframes




Have a look at this error and consider what could cause it. It is due to a version mismatch between the MongoDB Spark Connector and the Spark version you use.

java.lang.AbstractMethodError: com.mongodb.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/Dataset;)Lorg/apache/spark/sql/sources/BaseRelation;

Quoting the javadoc of java.lang.AbstractMethodError:

Thrown when an application tries to call an abstract method. Normally, this error is caught by the compiler; this error can only occur at run time if the definition of some class has incompatibly changed since the currently executing method was last compiled.

That pretty much explains what you experience (note the part that starts with "this error can only occur at run time").

My guess is that the part Lorg/apache/spark/sql/Dataset in the DefaultSource.createRelation method in the stack trace is exactly the culprit.

In other words, that signature declares data: DataFrame, not Dataset. The two are incompatible in this direction: DataFrame is simply a Scala type alias for Dataset[Row], but an arbitrary Dataset is not a DataFrame, hence the runtime error.

override def createRelation(sqlContext: SQLContext, mode: SaveMode, parameters: Map[String, String], data: DataFrame): BaseRelation
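
The practical fix is to use a connector artifact built for the same Spark and Scala versions as your cluster. The exact coordinates below are an assumption (a connector release targeting Spark 2.0.x with Scala 2.11); substitute whatever matches your installation:

// build.sbt: keep the connector in step with your Spark and Scala versions
libraryDependencies += "org.mongodb.spark" %% "mongo-spark-connector" % "2.0.0"

// or pull the matching artifact directly when starting the shell:
// spark-shell --packages org.mongodb.spark:mongo-spark-connector_2.11:2.0.0

With matching versions, the depts.write...format("com.mongodb.spark.sql").save() call from the question should resolve the expected createRelation overload instead of hitting the AbstractMethodError.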

