
I am trying to use a Jersey REST API to fetch records from an HBase table through a Java Spark program, and I get the error below. However, when I access the HBase table by submitting the Spark jar directly, the code executes without errors.

I have 2 worker nodes for HBase and 2 worker nodes for Spark, managed by the same master.

WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.31.16.140): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
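For reference, this failure typically surfaces from a Spark-over-HBase read like the following. This is only a minimal sketch, not the original poster's code: the table name is a placeholder, and TableInputFormat comes from the hbase-server artifact.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class HBaseRead {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("HBaseRead"));

        Configuration hbaseConf = HBaseConfiguration.create();
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table"); // placeholder table name

        // The executors must deserialize the task, which references HBase
        // classes such as TableInputFormat and Result; if the HBase jars are
        // absent from the executor classpath, deserialization fails with the
        // "unread block data" IllegalStateException shown above.
        JavaPairRDD<ImmutableBytesWritable, Result> rows = sc.newAPIHadoopRDD(
                hbaseConf, TableInputFormat.class,
                ImmutableBytesWritable.class, Result.class);

        System.out.println("rows: " + rows.count());
        sc.stop();
    }
}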

  • Can you provide the code you have written as well? The question is not informative enough. – Commented Jan 20, 2016 at 16:16

3 Answers


OK, I think I know your problem, because I have just experienced it myself.

The cause is very likely some missing HBase jars. While the Spark job is running, the executors need the HBase jars to read the data; if they are not on the classpath, exceptions like this are thrown. The fix is easy.

Before submitting the job, add the --jars parameter and include the following jars:

--jars /ROOT/server/hive/lib/hive-hbase-handler-1.2.1.jar,
/ROOT/server/hbase/lib/hbase-client-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/hbase-common-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/hbase-server-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/guava-12.0.1.jar,
/ROOT/server/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar,
/ROOT/server/hbase/lib/htrace-core-2.04.jar
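For example, a complete spark-submit invocation might look like the following. The application class, master URL, and application jar name are placeholders; note that the --jars value must be a single comma-separated list with no spaces:

spark-submit --class com.example.HBaseRead \
  --master spark://master:7077 \
  --jars /ROOT/server/hive/lib/hive-hbase-handler-1.2.1.jar,/ROOT/server/hbase/lib/hbase-client-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/hbase-common-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/hbase-server-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/hbase-hadoop2-compat-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/guava-12.0.1.jar,/ROOT/server/hbase/lib/hbase-protocol-0.98.12-hadoop2.jar,/ROOT/server/hbase/lib/htrace-core-2.04.jar \
  my-app.jar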

Hope it helps. Enjoy!


1 Comment

I am using a REST API which calls the above function to fetch the data from HBase through Spark, so please let me know how to pass these jars. I tried to set the jars in spark-env.sh, but it is not working:

SPARK_CLASSPATH=/hbase-1.1.2/lib/hbase-protocol-1.1.2.jar:/hbase-1.1.2/lib/hbase-common-1.1.2.jar:/hbase-1.1.2/lib/htrace-core-3.1.0-incubating.jar:/hbase-1.1.2/lib/hbase-server-1.1.2.jar:/hbase-1.1.2/lib/hbase-client-1.1.2.jar:/hbase/hive-1.2.1/lib/hive-hbase-handler-1.2.1.jar:/hive-1.2.1/lib/hive-common-1.2.1.jar:/hive-1.2.1/lib/hive-exec-1.2.1.jar
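A note on the comment above: SPARK_CLASSPATH is deprecated in Spark 1.x. When the SparkContext is created inside a long-running service rather than via spark-submit, one possible programmatic equivalent (a sketch, not tested against the poster's setup; the app name is hypothetical, and the paths are the ones from the comment, assumed to exist on the driver machine) is to list the jars on the SparkConf before the context is built:

// Distribute the HBase jars to the executors when no spark-submit is involved.
SparkConf conf = new SparkConf()
        .setAppName("RestHBaseRead") // hypothetical app name
        .setJars(new String[]{
                "/hbase-1.1.2/lib/hbase-protocol-1.1.2.jar",
                "/hbase-1.1.2/lib/hbase-common-1.1.2.jar",
                "/hbase-1.1.2/lib/htrace-core-3.1.0-incubating.jar",
                "/hbase-1.1.2/lib/hbase-server-1.1.2.jar",
                "/hbase-1.1.2/lib/hbase-client-1.1.2.jar"
        });
JavaSparkContext sc = new JavaSparkContext(conf);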

I've met the same problem on CDH 5.4.0 when submitting a Spark job implemented with the Java API. Here are my solutions:

Solution 1: using spark-submit:

--jars zookeeper-3.4.5-cdh5.4.0.jar,
hbase-client-1.0.0-cdh5.4.0.jar,
hbase-common-1.0.0-cdh5.4.0.jar,
hbase-server-1.0.0-cdh5.4.0.jar,
hbase-protocol-1.0.0-cdh5.4.0.jar,
htrace-core-3.1.0-incubating.jar,
// plus any custom jars that are needed on the Spark executors

Solution 2: set the jars with SparkConf in code:

SparkConf conf = new SparkConf().setAppName("HBaseRead");
conf.setJars(new String[]{"zookeeper-3.4.5-cdh5.4.0.jar",
        "hbase-client-1.0.0-cdh5.4.0.jar",
        "hbase-common-1.0.0-cdh5.4.0.jar",
        "hbase-server-1.0.0-cdh5.4.0.jar",
        "hbase-protocol-1.0.0-cdh5.4.0.jar",
        "htrace-core-3.1.0-incubating.jar",
        // plus any custom jars that are needed on the Spark executors
});
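One caveat worth noting: setJars ships the listed jars to the executors, but it does not alter the classpath of the already-running driver JVM, so the driver itself still needs these classes available (for example, as dependencies bundled with your application).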

To summarize: the problem is caused by jars missing from the Spark project's runtime classpath. You need to add these jars to your project classpath, and additionally use one of the two solutions above to distribute them to your Spark cluster.



CDP/CDH:

Step 1: Copy the hbase-site.xml file into the /etc/spark/conf/ directory:

cp /opt/cloudera/parcels/CDH/lib/hbase/conf/hbase-site.xml /etc/spark/conf/

Step 2: Add the following libraries to spark-submit/spark-shell:

/opt/cloudera/parcels/CDH/jars/hive-hbase-handler-*.jar
/opt/cloudera/parcels/CDH/lib/hbase/hbase-client-*.jar
/opt/cloudera/parcels/CDH/lib/hbase/hbase-common-*.jar
/opt/cloudera/parcels/CDH/lib/hbase/hbase-server-*.jar
/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat-*.jar
/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol-*.jar
/opt/cloudera/parcels/CDH/jars/guava-28.1-jre.jar
/opt/cloudera/parcels/CDH/jars/htrace-core-3.2.0-incubating.jar

Spark-shell:

sudo -u hive spark-shell --master yarn \
  --jars /opt/cloudera/parcels/CDH/jars/hive-hbase-handler-*.jar,/opt/cloudera/parcels/CDH/lib/hbase/hbase-client-*.jar,/opt/cloudera/parcels/CDH/lib/hbase/hbase-common-*.jar,/opt/cloudera/parcels/CDH/lib/hbase/hbase-server-*.jar,/opt/cloudera/parcels/CDH/lib/hbase/hbase-hadoop2-compat-*.jar,/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol-*.jar,/opt/cloudera/parcels/CDH/jars/guava-28.1-jre.jar,/opt/cloudera/parcels/CDH/jars/htrace-core-3.2.0-incubating.jar \
  --files /etc/spark/conf/hbase-site.xml
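The same flags should carry over if you submit an application jar instead of starting an interactive shell. A sketch, where the class name and application jar are placeholders:

sudo -u hive spark-submit --master yarn \
  --class com.example.HBaseRead \
  --jars <the same comma-separated jar list as above> \
  --files /etc/spark/conf/hbase-site.xml \
  my-app.jar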

