
SOLVED: the line prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") must be added to the connection properties.
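
In context, a minimal sketch of the fixed setup (the URL, credentials, and table name are the placeholders used in the question below):

    val url = "jdbc:oracle:thin:@xxx.xxx.xx:1526:SIDNAME"

    val prop = new java.util.Properties
    prop.setProperty("user", "usertst")
    prop.setProperty("password", "usertst")
    // Naming the driver class explicitly lets Spark load it directly,
    // instead of relying on java.sql.DriverManager to discover it.
    prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver")

    val people = sqlContext.read.jdbc(url, "table_name", prop)
    people.show()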

I'm trying to launch a Spark job locally. I created a jar with dependencies using Maven.

This is my pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.agildata</groupId>
    <artifactId>spark-rdd-dataframe-dataset</artifactId>
    <packaging>jar</packaging>
    <version>1.0</version>

    <properties>    
        <exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
        <spark.version>1.6.0</spark.version>
    </properties>

    <dependencies>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>com.oracle</groupId>
            <artifactId>ojdbc7</artifactId>
            <version>12.1.0.2</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>add-source</goal>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <phase>process-test-resources</phase>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <!-- get all project dependencies -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <!-- MainClass in manifest makes an executable jar -->
                    <archive>
                        <manifest>
                            <mainClass>example.dataframe.ScalaDataFrameExample</mainClass>
                        </manifest>
                    </archive>

                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


        </plugins>
    </build>

</project>

I run the mvn package command and the build is successful. After that I try to run the job this way:

    GMAC:bin gabor_dev$ sh spark-submit --class example.dataframe.ScalaDataFrameExample --master spark://QGMAC.local:7077 /Users/gabor_dev/IdeaProjects/dataframe/target/spark-rdd-dataframe-dataset-1.0-jar-with-dependencies.jar

but it throws this: Exception in thread "main" java.sql.SQLException: No suitable driver

Full error message:

16/07/08 13:09:22 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:222)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
    at example.dataframe.ScalaDataFrameExample$.main(ScalaDataFrameExample.scala:30)
    at example.dataframe.ScalaDataFrameExample.main(ScalaDataFrameExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/07/08 13:09:22 INFO SparkContext: Invoking stop() from shutdown hook

Interestingly, if I build and run it this way inside the IntelliJ IDEA nested console: mvn package exec:java -Dexec.mainClass=example.dataframe.ScalaDataFrameExample, it runs and there is no error.

This is the relevant scala code part:

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

val url = "jdbc:oracle:thin:@xxx.xxx.xx:1526:SIDNAME"

val prop = new java.util.Properties
prop.setProperty("user", "usertst")
prop.setProperty("password", "usertst")

val people = sqlContext.read.jdbc(url, "table_name", prop)
people.show()

I checked my jar file, and it contains all the dependencies. Can anybody help me solve this? Thank you!

  • Could you copy-paste the relevant section of the error? Commented Jul 8, 2016 at 10:15
  • Copied. Please check the question. Commented Jul 8, 2016 at 11:11
  • And it works in IntelliJ IDEA, correct? Are you doing this on a cluster or locally? Commented Jul 8, 2016 at 11:58
  • Locally. And yes, IntelliJ IDEA works correctly. The DB connection works and the people.show method prints the table to the nested console. Commented Jul 8, 2016 at 11:59
  • Would you try adding .setMaster("local[2]") to the SparkConf? Commented Jul 8, 2016 at 12:01

2 Answers


So the missing driver is the JDBC one, and you have to add it to the Spark SQL configuration. Under spark-submit the application runs in Spark's own classloader, where java.sql.DriverManager may not discover the Oracle driver automatically even though the class is inside your fat jar; that is also why the exec:java run works. You can either register the driver at application submit time, as specified by this answer, or through your Properties object, as you did for the user and password, with this line:

prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") 
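
For the submit-time alternative, a sketch of the command (the path to the Oracle driver jar is a placeholder, not from the original post):

    spark-submit --class example.dataframe.ScalaDataFrameExample \
      --master spark://QGMAC.local:7077 \
      --driver-class-path /path/to/ojdbc7.jar \
      --jars /path/to/ojdbc7.jar \
      /Users/gabor_dev/IdeaProjects/dataframe/target/spark-rdd-dataframe-dataset-1.0-jar-with-dependencies.jar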



Here is how you can connect to PostgreSQL using Spark:

import java.util.Properties;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession sparkSession = SparkSession.builder()
        .appName("dky")
        .master("local[*]")
        .getOrCreate();

// Reduce Spark's log noise.
Logger.getLogger("org.apache").setLevel(Level.WARN);

Properties properties = new Properties();
properties.put("user", "your user name");
properties.put("password", "your password");

// Naming the driver class explicitly avoids the "No suitable driver" error.
Dataset<Row> jdbcDF = sparkSession.read().option("driver", "org.postgresql.Driver")
        .jdbc("jdbc:postgresql://localhost:5432/postgres", "your table name along with schema name", properties);
jdbcDF.show();
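
Note that the PostgreSQL JDBC jar still has to be on the classpath (for example as a dependency in your fat jar, or via spark-submit's --jars flag); the "driver" option only tells Spark which class to load.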

