
I'm a Scala newbie trying to use Spark to read from a MySQL database. I'm facing a class-not-found exception whatever I do. I tried to connect without Spark, using Squeryl, Scalike, etc., and always hit the same problem. Here's one example I tried:

logger.info("Write part")

val dataframe_mysql = spark.sqlContext
  .read.format("jdbc")
  .option("url", s"jdbc:mysql://${datamart_server}:3306/vol")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "company")
  .option("user", datamart_user).option("password", datamart_pwd)
  .load()

dataframe_mysql.show()

I tried to put the driver class name in src/main/resources/application.conf:

db.default.driver="com.mysql.jdbc.Driver"

But it didn't help. I got:

java.sql.SQLException: No suitable driver

I'm also sharing the sbt file to show how I add the dependencies:

name := "commercial-api-datamart-feed"
version := "0.1"
scalaVersion := "2.11.6"
libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "1.1.0"
libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.1.3" % Runtime
libraryDependencies += "com.typesafe.scala-logging" %% "scala-logging" % "3.9.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"
libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.24" % Runtime

Spark is not mandatory, but I think it's better for performance.

2 Answers


How are you running this code? You'll need to pass the MySQL JAR with --jars, something like --jars /path/to/mysql.jar, when starting spark-shell or spark-submit.
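For illustration, a spark-submit invocation for this project might look roughly like the following (the main class, master, and file paths are placeholders for your own setup):

spark-submit \
  --class com.example.DatamartFeed \
  --master local[*] \
  --jars /path/to/mysql-connector-java-5.1.24.jar \
  target/scala-2.11/commercial-api-datamart-feed_2.11-0.1.jar

The JAR listed in --jars is added to the driver and shipped to the executors, which is exactly where the JDBC driver classes need to be.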

If you prefer running a single JAR, you'll need to ensure that the MySQL JAR is embedded in your uber JAR. I've never used sbt, but you'll need to check whether the final JAR has the MySQL classes inside it; if not, use the relevant sbt settings to make that happen.
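With sbt, one common way to build such an uber JAR is the sbt-assembly plugin; a rough sketch, with the plugin and dependency versions as assumptions for an sbt 0.13.x / Scala 2.11 build:

// project/plugins.sbt -- sbt-assembly builds a single JAR containing your dependencies
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")

// build.sbt -- mark Spark as "provided" so it is not bundled (the cluster already has it),
// but leave the MySQL driver in the default scope so it ends up inside the assembly
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0" % "provided"
libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.24"

Running sbt assembly then produces a fat JAR under target/scala-2.11/ that you can hand to spark-submit without any extra --jars.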


2 Comments

You're right. I'm using spark-submit, and the jar doesn't include the mysql driver. (However, it doesn't include the other dependencies either, so why just the mysql driver?) Now that I add the parameter you suggested, it works. I still have a problem using addSbtPlugin (finding the correct version for Scala 2.11), but I think I should post that as another question. Thank you!
@IoriYagami AFAIK, the reason it works for the other JARs (spark-sql, spark-core) is that they are already part of the classpath when spinning up the executors/driver, since they are part of the Spark installation. You should see the same behaviour for any "non-Spark" JARs your application uses; if not, it would be very surprising.

You have to make sure the mysql dependencies exist on all of the executors. In my environment, I use Maven and specify the dependency like this in my pom.xml:

<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-java</artifactId>
  <version>5.1.42</version>
</dependency>

I then ensure that this dependency is bundled in my application jar (using the maven shade plugin), so that I don't have to place the jar on all of the executors.
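A minimal shade-plugin configuration, as a sketch (the version shown is illustrative, not prescriptive), would go in the <build><plugins> section of the pom:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>

With that in place, mvn package emits a shaded JAR that already contains the MySQL driver classes.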

Whether or not you use Spark to access MySQL over JDBC, you need to make sure the mysql-connector JAR is available on the classpath, wherever you are executing MySQL queries from.
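To check the classpath independently of Spark, a bare JDBC round trip is enough; here is a minimal Scala sketch (host, database, table, and credentials are placeholders):

import java.sql.DriverManager

object MySqlSmokeTest {
  def main(args: Array[String]): Unit = {
    // Placeholder connection details -- substitute your own server, schema, user and password
    val url = "jdbc:mysql://datamart-server:3306/vol"
    // This is the call that throws "No suitable driver" when mysql-connector-java
    // is missing from the classpath of the JVM running the query
    val conn = DriverManager.getConnection(url, "datamart_user", "datamart_pwd")
    try {
      val rs = conn.createStatement().executeQuery("SELECT COUNT(*) FROM company")
      while (rs.next()) println(s"company rows: ${rs.getLong(1)}")
    } finally {
      conn.close()
    }
  }
}

If this small program fails with the same exception, the problem is the classpath and not Spark.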

1 Comment

Thank you @Travis for the comment. I'm sorry that I forgot to include my sbt file. I'm updating the post.
