org.apache.spark.sql.AnalysisException: Table not found while inserting data into Hive table

Question

I am trying to insert a dataframe into a Hive table using the following code:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql._
val hiveCont = new org.apache.spark.sql.hive.HiveContext(sc)
val empfile = sc.textFile("empfile")
val empdata = empfile.map(p => p.split(","))
case class empc(id:Int, name:String, salary:Int, dept:String, location:String)
val empRDD  = empdata.map(p => empc(p(0).toInt, p(1), p(2).toInt, p(3), p(4)))
val empDF   = empRDD.toDF()
empDF.registerTempTable("emptab")

I have a table in Hive, parttab, with the following schema:

# col_name              data_type               comment             

id                      int                                         
name                    string                                      
salary                  int                                         
dept                    string                                      

# Partition Information      
# col_name              data_type               comment             

location                string

I'm trying to insert the temporary table into the hive table as follows:

hiveCont.sql("insert into parttab select id, name, salary, dept from emptab")

This throws an exception:

org.apache.spark.sql.AnalysisException: Table not found: emptab.

'emptab' is the temp table created from the DataFrame.

My understanding is that the HiveContext runs the query against Hive from Spark and does not find the table there, hence the exception. But I don't understand how to fix this. Could anyone tell me how?

Is parttab a Hive table or a temp table created from a DataFrame? I see that you have created a temporary table named emptab from the DataFrame.
– Sandeep Singh, Commented Jul 3, 2017 at 9:53
There are methods to save to a Hive table directly. I think saveAsTable and insertInto work in Spark 1.6. Did you try using them instead?
– philantrovert, Commented Jul 3, 2017 at 10:13

philantrovert · Accepted Answer · 2017-07-03 16:13:58Z

1

registerTempTable("emptab") creates a temporary table in Spark, not in Hive. To store data in Hive, you first have to create a table in Hive explicitly. To save the DataFrame's data to a Hive table, use the code below:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql._

val hiveCont = new org.apache.spark.sql.hive.HiveContext(sc)
val empfile = sc.textFile("empfile")
val empdata = empfile.map(p => p.split(","))
case class empc(id:Int, name:String, salary:Int, dept:String, location:String)
val empRDD  = empdata.map(p => empc(p(0).toInt, p(1), p(2).toInt, p(3), p(4)))
val empDF   = empRDD.toDF()
// write is a parameterless method in Scala, so it is called without parentheses
empDF.write.saveAsTable("emptab")
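
If the goal is to append to the existing partitioned table from the question rather than create a new one, insertInto (mentioned in the comments) is one option. The following is a minimal sketch, assuming Spark 1.6, that parttab already exists in Hive with location as its only partition column, and that empDF was created through hiveCont. The helper name appendToParttab is hypothetical, and the two setConf calls are an assumption about what the cluster needs for Hive dynamic partitioning:

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper: appends empDF to the existing Hive table parttab.
def appendToParttab(hiveCont: HiveContext, empDF: DataFrame): Unit = {
  // Assumed settings: let Hive derive the partition value from the data.
  hiveCont.setConf("hive.exec.dynamic.partition", "true")
  hiveCont.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

  // insertInto matches columns by position, not by name, so the partition
  // column (location) is selected last, as in the table definition.
  empDF
    .select("id", "name", "salary", "dept", "location")
    .write
    .mode(SaveMode.Append)
    .insertInto("parttab")
}
```

Unlike saveAsTable("emptab"), this does not create a new table; it writes into the table that is already defined.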

edited Jul 3, 2017 at 16:13

philantrovert

answered Jul 3, 2017 at 11:45

Chetan Tayade

Sandeep Singh · Accepted Answer · 2017-07-03 16:22:48Z

0

You are implicitly converting an RDD into a DataFrame, but you are not importing the implicit conversions, so the RDD is not converted. Add the line below to your imports.

// this is used to implicitly convert an RDD to a DataFrame.
import sqlContext.implicits._
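
Which context the implicits come from matters here: toDF() ties the DataFrame to the context whose implicits were imported, and registerTempTable registers the name only in that context. If sqlContext and hiveCont are different objects, hiveCont.sql cannot see emptab. The following is a minimal sketch that sidesteps the implicit conversion by calling hiveCont.createDataFrame directly, assuming Spark 1.6 and the parttab table from the question. The helper name insertViaTempTable and the dynamic-partition settings are assumptions, not part of the original code:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.hive.HiveContext

// Same fields as the case class in the question.
case class Empc(id: Int, name: String, salary: Int, dept: String, location: String)

// Hypothetical helper: registers the temp table in hiveCont and inserts from it.
def insertViaTempTable(hiveCont: HiveContext, empRDD: RDD[Empc]): Unit = {
  // Build the DataFrame from hiveCont itself, so the temp table below is
  // registered in the same context that later runs the INSERT.
  val empDF = hiveCont.createDataFrame(empRDD)
  empDF.registerTempTable("emptab")

  // Assumed settings: allow Hive to pick the partition from the location column.
  hiveCont.setConf("hive.exec.dynamic.partition", "true")
  hiveCont.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

  hiveCont.sql(
    """INSERT INTO TABLE parttab PARTITION (location)
      |SELECT id, name, salary, dept, location FROM emptab""".stripMargin)
}
```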

Also, case classes must be defined at the top level; they cannot be nested. So your final code should look like this:

import org.apache.spark._
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import sqlContext.implicits._

val hiveCont = new org.apache.spark.sql.hive.HiveContext(sc)
case class Empc(id:Int, name:String, salary:Int, dept:String, location:String)
val empFile = sc.textFile("/hdfs/location/of/data/")
val empData = empFile.map(p => p.split(","))
val empRDD = empData.map(p => Empc(p(0).trim.toInt, p(1), p(2).trim.toInt, p(3), p(4)))
val empDF = empRDD.toDF()
empDF.registerTempTable("emptab")

Also trim all whitespace when converting a String to an Integer. I have included that in the code above as well.
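
The trimming advice can be pulled into a small parsing function so that malformed lines are skipped instead of failing the job. The following is a minimal sketch in plain Scala; parseEmp is a hypothetical helper, it trims every field rather than only the numeric ones, and silently dropping bad lines is an assumption about what is acceptable for this data:

```scala
import scala.util.Try

case class Empc(id: Int, name: String, salary: Int, dept: String, location: String)

// Hypothetical helper: parses one comma-separated line into an Empc.
// Returns None when the line does not have exactly five fields or when
// a numeric field does not parse as an Int.
def parseEmp(line: String): Option[Empc] = {
  // A limit of -1 keeps trailing empty fields, so the length check is meaningful.
  val f = line.split(",", -1).map(_.trim)
  if (f.length != 5) None
  else Try(Empc(f(0).toInt, f(1), f(2).toInt, f(3), f(4))).toOption
}
```

With the RDD of lines from the code above, something like empFile.flatMap(line => parseEmp(line).toList) would keep only the rows that parse.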

edited Jul 3, 2017 at 16:22

answered Jul 3, 2017 at 11:14

Sandeep Singh

2 Comments

Metadata Over a year ago

After struggling for a week, your answer finally helped. You may need to correct this line though "val hiveCont = val hiveCont = new org.apache.spark.sql.hive.HiveContext(sc)"

Metadata Over a year ago

If you'd care to look at another issue of mine in Spark version 2: stackoverflow.com/questions/44888348/…
