Databricks Error in SQL statement: AnalysisException: cannot resolve '``' given input columns:

Question

I'm not sure if I'm in the correct group for this question. Anyway, I have created the following SQL code in Databricks; however, I'm getting this error message:

Error in SQL statement: AnalysisException: cannot resolve 'a.COUNTRY_ID' given input columns: [a."PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE", b."PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE"]; line 7 pos 7;

I know the code works, as I have successfully run it on my SQL Server. The code is as follows:

tabled = spark.read.csv("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tabled.csv",inferSchema=True,header=True)
tablee = spark.read.csv("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tablee.csv",inferSchema=True,header=True)
tabled.createOrReplaceTempView('tabled') 
tablee.createOrReplaceTempView('tablee')
%sql
; with cmn as 
  ( SELECT a.CDC_TYPE,
           a.PK_LOYALTYACCOUNT, --Add these also in CTE result set 
           a.COUNTRY_ID --Add these also in CTE result set 
    FROM  tabled  a 
    INNER JOIN tablee b 
    ON a.COUNTRY_ID = b.COUNTRY_ID 
    AND a.PK_LOYALTYACCOUNT = b.PK_LOYALTYACCOUNT 
    AND a.CDC_TYPE = 'U'
    )
 SELECT 1 AS is_deleted, 
        a.* 
 FROM  tabled  a 
 INNER JOIN cmn 
 ON a.CDC_TYPE = cmn.CDC_TYPE 
 and  a.COUNTRY_ID = cmn.COUNTRY_ID 
 AND a.PK_LOYALTYACCOUNT = cmn.PK_LOYALTYACCOUNT
 UNION ALL 
 SELECT 0 AS is_deleted, 
        b.* 
 FROM tablee  b 
 INNER JOIN cmn 
 ON b.CDC_TYPE = cmn.CDC_TYPE 
 and b.COUNTRY_ID = cmn.COUNTRY_ID 
 AND b.PK_LOYALTYACCOUNT = cmn.PK_LOYALTYACCOUNT
UNION ALL 
SELECT NULL, 
       a.* 
FROM   tabled a 
WHERE  a.CDC_TYPE = 'N' 
UNION ALL 
SELECT NULL, 
       b.* 
FROM   tablee b 
WHERE  b.CDC_TYPE = 'N'

When I run a simple query, for example

example1 = spark.sql("""select * from tablee""")

or

example2 = spark.sql("""select * from tabled""")

I get results back, so I know the tables are there.

Any suggestions will be well received.

What error (if any) do you see for this query? SELECT a.COUNTRY_ID FROM tabled a; Is there actually a COUNTRY_ID field in that table?
– Jon Jaussi, Commented Dec 23, 2018 at 16:36
@JonJaussi, that was a good observation. When I run the command SELECT a.COUNTRY_ID FROM tabled a; I don't see the COUNTRY_ID field, as you suggested. However, this is standard SQL (even though it's running on Databricks), and it works fine when I run the same command on my SQL Server.
– Carltonp, Commented Dec 23, 2018 at 16:47

Krishna Sistla · Accepted Answer · 2018-12-24 06:36:03Z

1

Use a semicolon delimiter when reading the CSV file:

tabled = spark.read.option("delimiter", ";").csv("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tabled.csv",inferSchema=True,header=True)

or

tabled = spark.read.load("adl://carlslake.azuredatalakestore.net/testfolder/dbo_tabled.csv",
                 format="csv", sep=";", inferSchema="true", header="true")

ref: https://spark.apache.org/docs/2.3.0/sql-programming-guide.html#manually-specifying-options
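If you don't know the delimiter up front, one option is to guess it from a small sample of the file before reading it into Spark. The following is only a rough sketch using Python's standard csv.Sniffer; the sample text and the helper name detect_delimiter are made up for illustration, and the result would then be passed as the sep option shown above.

```python
import csv

def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the field delimiter from a text sample (hypothetical helper)."""
    # Sniffer inspects the sample and only considers the candidate characters.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter

# Assumed sample: a few lines shaped like the header in the error message.
sample = (
    '"PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE"\n'
    '1;10;"U"\n'
    '2;20;"N"\n'
)

sep = detect_delimiter(sample)
print(repr(sep))
```

Sniffing is a heuristic, so it is worth checking the detected value against the file before relying on it.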


Carltonp · Accepted Answer · 2018-12-23 22:41:07Z

0

The columns were not being identified properly because the file uses a semicolon (;) as its delimiter while the job was looking for commas. Problem solved.
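To see why this makes a.COUNTRY_ID unresolvable, here is a small illustration using Python's standard csv module rather than Spark's own parser (so the exact column name Spark produces may differ). The sample text is an assumption based on the header shown in the error message, and header_columns is a hypothetical helper.

```python
import csv
import io

# Assumed sample: a semicolon-delimited file with a quoted header row.
raw = '"PK_LOYALTYACCOUNT";"COUNTRY_ID";"CDC_TYPE"\n"1";"10";"U"\n'

def header_columns(text: str, delimiter: str) -> list:
    """Return the header row as parsed with the given delimiter (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return next(reader)

# Parsed with commas, the whole header line ends up as a single column.
comma_cols = header_columns(raw, ",")
# Parsed with semicolons, the three expected columns appear.
semicolon_cols = header_columns(raw, ";")

print(len(comma_cols), comma_cols)
print(len(semicolon_cols), semicolon_cols)
```

With the comma delimiter there is only one column, and its name is the entire header line, so no column called COUNTRY_ID exists for the query to resolve.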
