
Using Hive with Spark:

I am executing two queries, one after the other, through a HiveContext.
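For reference, a minimal sketch of how such a context is typically created in Spark 1.6 (the app name and variable names here are my assumptions, not from the original setup):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    // hypothetical setup; requires Hive support (spark-hive) on the classpath
    val conf = new SparkConf().setAppName("HiveWithSpark")
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

The first query: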

hiveContext.sql("use userdb");

which produces the log below:

2016-09-08 15:46:13 main [INFO ] ParseDriver - Parsing command: use userdb
2016-09-08 15:46:14 main [INFO ] ParseDriver - Parse Completed
2016-09-08 15:46:21 main [INFO ] PerfLogger - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:21 main [INFO ] PerfLogger - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:21 main [INFO ] PerfLogger - <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:22 main [INFO ] PerfLogger - <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:22 main [INFO ] ParseDriver - Parsing command: use userdb
2016-09-08 15:46:23 main [INFO ] ParseDriver - Parse Completed
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=parse start=1473329782037 end=1473329783188 duration=1151 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Semantic Analysis Completed
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=semanticAnalyze start=1473329783202 end=1473329783396 duration=194 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=compile start=1473329781862 end=1473329783434 duration=1572 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Concurrency mode is disabled, not creating a lock manager
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Starting command(queryId=abc_20160908154622_aac49c43-565e-4fde-be6d-2d5c22c1a699): use userdb
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=TimeToSubmit start=1473329781861 end=1473329783682 duration=1821 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - Starting task [Stage-0:DDL] in serial mode
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=runTasks start=1473329783682 end=1473329783729 duration=47 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=Driver.execute start=1473329783435 end=1473329783730 duration=295 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] Driver - OK
2016-09-08 15:46:23 main [INFO ] PerfLogger - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=releaseLocks start=1473329783734 end=1473329783734 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2016-09-08 15:46:23 main [INFO ] PerfLogger - </PERFLOG method=Driver.run start=1473329781861 end=1473329783735 duration=1874 from=org.apache.hadoop.hive.ql.Driver>

**But when trying to execute the query below, I get the error shown after it:**

    hiveContext.sql("select * from user_detail")

**Error:**

2016-09-08 15:47:50 main [INFO ] ParseDriver - Parsing command: select * from userdb.user_detail
2016-09-08 15:47:50 main [INFO ] ParseDriver - Parse Completed
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.parse.ASTNode cannot be cast to org.antlr.runtime.tree.CommonTree;
    at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:324)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:295)
    at org.apache.spark.sql.hive.HiveQLDialect$$anonfun$parse$1.apply(HiveContext.scala:66)
    at org.apache.spark.sql.hive.HiveQLDialect$$anonfun$parse$1.apply(HiveContext.scala:66)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:290)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:237)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:236)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:279)
    at org.apache.spark.sql.hive.HiveQLDialect.parse(HiveContext.scala:65)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
    at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
    at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
    at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:113)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:202)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.parsing.combinator.Parsers$$anon$2$$anonfun$apply$14.apply(Parsers.scala:891)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at scala.util.parsing.combinator.Parsers$$anon$2.apply(Parsers.scala:890)
    at scala.util.parsing.combinator.PackratParsers$$anon$1.apply(PackratParsers.scala:110)
    at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:34)
    at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:208)
    at org.apache.spark.sql.SQLContext$$anonfun$1.apply(SQLContext.scala:208)
    at org.apache.spark.sql.execution.datasources.DDLParser.parse(DDLParser.scala:43)
    at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:231)
    at org.apache.spark.sql.hive.HiveContext.parseSql(HiveContext.scala:331)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
    at 
1 Answer

I am using spark-hive_2.10 version 1.6.1, which internally resolves some dependencies, such as:

  1. hive-exec: 1.2.1
  2. hive-metastore: 1.2.1
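One way to see how these versions are actually resolved is to print the dependency tree: with Maven that is `mvn dependency:tree`, and with sbt the sbt-dependency-graph plugin provides an equivalent `dependencyTree` task. For example (the filter pattern here is just an illustration):

    # show only the Hive artifacts in the resolved dependency tree
    mvn dependency:tree -Dincludes=org.apache.hive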

With the duplicate APIs on the classpath, I was initially able to execute all kinds of queries (USE, INSERT, DESCRIBE, etc.) except SELECT; SELECT queries threw the exception above. After resolving this, I am now able to execute all kinds of queries without any problem.

When I walked through the dependency hierarchy, I found that two different versions of hive-exec were somehow being included in the project. I removed the external one, which solved the problem; a hypothetical exclusion is sketched below. Hope this helps someone else.
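As an illustration of that fix in sbt, here is a sketch; the offending artifact (`some-library`) is made up, so substitute whatever your dependency tree shows is dragging in the second hive-exec:

    // build.sbt -- sketch, not the answerer's actual build file
    libraryDependencies += "org.apache.spark" % "spark-hive_2.10" % "1.6.1"

    // exclude the duplicate hive-exec pulled in transitively by another dependency
    libraryDependencies +=
      ("com.example" % "some-library" % "1.0").exclude("org.apache.hive", "hive-exec")

The same effect in Maven is an `<exclusions>` block on the offending `<dependency>`.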

Thanks.
