
We are creating a Spark-based application using Spark 2.3.0. Our Spark jobs interact with HBase. While building the JAR, we get the following compile-time error: [ERROR] class file for org.apache.spark.Logging not found. The error occurs in the code that reads data from the HBase tables.

We are able to write data into the HBase tables successfully using the dependency versions below.

We are using the following configuration in pom.xml:

        <properties>
            <org.apache.spark.version>2.3.0</org.apache.spark.version>
            <scala.version>2.11</scala.version>
            <hbase.version>1.0.0-cdh5.4.0</hbase.version>
        </properties>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${org.apache.spark.version}</version>
        </dependency>


        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version}</artifactId>
            <version>${org.apache.spark.version}</version>
        </dependency>


        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-spark</artifactId>
            <version>1.2.0-cdh5.10.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-protocol</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.htrace</groupId>
            <artifactId>htrace-core</artifactId>
            <version>3.1.0-incubating</version>
        </dependency>

We found multiple solutions on Stack Overflow, all suggesting we use Spark 1.6 instead: java.lang.NoClassDefFoundError: org/apache/spark/Logging

This is not possible for us.

Is there any other workaround to solve this issue?

Thanks

  • In the link you mention there is an answer: "The error is because you are using Spark 2.0 libraries with the connector from Spark 1.6 (which looks for the Spark 1.6 logging class). Use the 2.0.5 version of the connector." Does that help you investigate further? Commented Sep 21, 2018 at 15:26
  • I am not able to find any such jar in the Maven Cloudera repo: mvnrepository.com/artifact/org.apache.hbase/… Commented Sep 24, 2018 at 7:50

1 Answer


Replying to an older question I posted here. Since we couldn't roll back to Spark 1.6 (we are on Spark 2.3), we found a workaround for HBaseContext.bulkGet: fetching the records ourselves with the plain HBase client inside mapPartitions.

We are doing something like the following:

val respDataFrame = keyDf.mapPartitions { keys =>
  // create a Table instance for the target HBase table
  // build a list of Gets, one per key in this partition
  // fetch the whole list of Gets in a single batch call
}
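
For reference, here is a minimal self-contained sketch of that pattern against the plain HBase 1.x client API. The table name "my_table", the row key living in the first column of keyDf, and the conversion of each Result to a String are assumptions for illustration, not part of the original answer:

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
    import org.apache.hadoop.hbase.util.Bytes
    import scala.collection.JavaConverters._

    // One HBase connection per partition, one batched RPC for all its Gets.
    val respRdd = keyDf.rdd.mapPartitions { keys =>
      val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("my_table"))   // assumed table name
      // one Get per row key; the key is assumed to sit in column 0 of keyDf
      val gets = keys.map(row => new Get(Bytes.toBytes(row.getString(0)))).toList
      val results = table.get(gets.asJava)                       // single batch fetch
      table.close(); conn.close()
      // extract the first cell's value from each Result so nothing unserializable escapes
      results.iterator.map(r => Bytes.toString(r.value()))
    }

Opening the connection inside mapPartitions gives one connection per partition rather than one per row, and table.get(gets) sends all the Gets in a single batched round trip, which is essentially what HBaseContext.bulkGet does, but without dragging in the Spark-1.6-era connector that references the removed org.apache.spark.Logging class.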