
I have tried the code below with Spark and Scala; I am attaching the code and my pom.xml.

package com.Spark.ConnectToHadoop

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

object CountWords {

  def main(args: Array[String]): Unit = {

    // Connect to the standalone Spark master
    val objConf = new SparkConf().setAppName("Spark Connection").setMaster("spark://IP:7077")
    val sc = new SparkContext(objConf)

    // HiveContext gives access to the Hive metastore
    val objHiveContext = new HiveContext(sc)
    objHiveContext.sql("USE test")

    // List the tables in the "test" database and print them
    val test = objHiveContext.sql("show tables")
    val testing = test.collect()
    for (row <- testing) {
      println(row)
    }
  }
}

I have added the spark-core_2.10, spark-catalyst_2.10, spark-sql_2.10, and spark-hive_2.10 dependencies. Do I need to add any more dependencies?

Edit:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.Sudhir.Maven1</groupId>
    <artifactId>SparkDemo</artifactId>
    <version>IntervalMeterData1</version>
    <packaging>jar</packaging>

    <name>SparkDemo</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spark.version>1.5.2</spark.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.5.2</version>
        </dependency> 
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.5.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.10</artifactId>
            <version>1.5.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>1.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>1.2.1</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>     
    </dependencies>
</project>
  • 1. How are you running your Spark: local or remote? 2. If it is not local, do you have a Spark build with Hive support? Commented Jan 19, 2016 at 9:22
  • Remote. I didn't get you, what does "built with Hive support" mean? I previously connected when it was 1.4.1. Commented Jan 19, 2016 at 9:26
  • Are you sure you are running your code on an installation of Spark 1.5 with Hive support? The OverrideFunctionRegistry was replaced with HiveFunctionRegistry for Spark 1.5 in issues.apache.org/jira/browse/SPARK-8883 Commented Jan 19, 2016 at 9:30
  • Yes, Spark installation 1.5.2 with Hive support too (YARN cluster mode). Hive version 1.2.1. Commented Jan 19, 2016 at 9:36

1 Answer


Looks like you forgot to bump the spark-hive version:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.10</artifactId>
        <version>1.5.2</version>
    </dependency>

Consider introducing a Maven property, such as spark.version:

   <properties>
        <spark.version>1.5.2</spark.version>
    </properties>

And modifying all your spark dependencies in this manner:

   <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.10</artifactId>
        <version>${spark.version}</version>
    </dependency>

Bumping the Spark version later won't be as painful.

Just adding the spark.version property in your <properties> is not enough; you have to reference it with ${spark.version} in each dependency.
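
For reference, here is a sketch of how the Spark entries in the posted pom.xml could look once they all use the property (assuming you want 1.5.2 everywhere, including spark-hive):

    <properties>
        <spark.version>1.5.2</spark.version>
    </properties>

    <dependencies>
        <!-- all Spark artifacts pick up the same version from the property -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
    </dependencies>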


1 Comment

Thanks a lot, that issue is resolved, but now I am getting the exception below: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-, even though I have given 777 permissions.
