1

I am looking to pass a list as a parameter to a Spark SQL statement.

process_date = '2020-01-01'

df1 = spark.sql("""select '{0}', * from table1""".format(process_date))

This works for a string, so df1 is created successfully.
But if I have a list like this:

list1 = ['a','b','c']

df2 = spark.sql("""select '{0}','{1}',* from table1""".format(process_date,list1))

This is not working for me.
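
One way to see what goes wrong is to print the formatted string instead of passing it to spark.sql. The sketch below is plain Python and needs no Spark session. It shows that str.format inserts the list's string representation into the `{1}` slot, so the quotes nest and the result is no longer the intended SQL.

```python
process_date = '2020-01-01'
list1 = ['a', 'b', 'c']

# str.format calls str() on the list, so the whole list lands in one slot.
query = """select '{0}','{1}',* from table1""".format(process_date, list1)
print(query)
# select '2020-01-01','['a', 'b', 'c']',* from table1
```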


1
  • You can convert the list to a string by calling join on it. (Commented Feb 12, 2021 at 21:12)

3 Answers

1

You can use join with a generator expression to build the following SQL statement:

"select '2020-01-01','a','b','c',* from table1"
print("""select '{0}',{1},* from table1""".format(process_date, ",".join(f"'{i}'" for i in list1)))

Note that the f-string requires Python 3.6 or later.

2 Comments

I tried this:
list1 = ['a','b','c']
df1 = spark.sql("""select * from table1""".format(",".join((f'"{i}"' for i in list1))))
and got SyntaxError: invalid syntax.
I am using Python 2.7.16.
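
The SyntaxError reported above is consistent with f-strings, which are only available from Python 3.6 onward. The same statement can be built with str.format alone, which also runs on Python 2.7. This is a minimal sketch: build_select is a hypothetical helper name, and it assumes the values contain no quote characters. Also note that the query in the comment has no `{}` placeholder, so format would silently drop the joined list even on Python 3.

```python
def build_select(process_date, values):
    """Return the SELECT statement with each value as a quoted SQL literal."""
    # Assumption: values contain no quote characters, so no escaping is done.
    literals = ",".join("'{0}'".format(v) for v in values)
    return "select '{0}',{1},* from table1".format(process_date, literals)


query = build_select('2020-01-01', ['a', 'b', 'c'])
print(query)
# select '2020-01-01','a','b','c',* from table1
```

The resulting string can then be passed to spark.sql(query) as in the question.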
0
col_list_to_select = ["col1", "col2", "col3", "col4"]
sqlQuery2 = "select {} from sample_vw w".format(", ".join(f"w.{x}" for x in col_list_to_select))
spark.sql(sqlQuery2)

This worked for me!


0

This can also be done more idiomatically with the PySpark DataFrame selectExpr API instead of string interpolation:


df = spark.table("database_name.table1")

expressions = ['a','b','c', '*']

result_df = df.selectExpr(*expressions)
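
One caveat: selectExpr parses each string as a SQL expression, so 'a' in the list above refers to a column named a, not to the string literal 'a' used in the question. If literals are wanted, one option is to wrap each value in SQL quotes first. A minimal sketch follows. literal_expressions is a hypothetical helper, the values are assumed to contain no quote characters, and the DataFrame call is left as a comment because it needs a live Spark session.

```python
def literal_expressions(process_date, values):
    """Build selectExpr arguments: quoted literals followed by '*'."""
    # Assumption: no value contains a quote character.
    literals = ["'{0}'".format(v) for v in [process_date] + list(values)]
    return literals + ["*"]


expressions = literal_expressions('2020-01-01', ['a', 'b', 'c'])
print(expressions)

# With a DataFrame `df` loaded as above:
# result_df = df.selectExpr(*expressions)
```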

