I'm trying to query a table and use the result (a dataframe) of that query in the IN clause of another query.
The first query returns the dataframe below:
+-----------------+
|              key|
+-----------------+
| 10000000000004|
| 10000000000003|
| 10000000000008|
| 10000000000009|
| 10000000000007|
| 10000000000006|
| 10000000000010|
| 10000000000002|
+-----------------+
Now I want to run a query like the one below, but with the values taken dynamically from that dataframe instead of hard-coded:
spark.sql("""select country from table1 where key in (10000000000004, 10000000000003, 10000000000008, 10000000000009, 10000000000007, 10000000000006, 10000000000010, 10000000000002)""").show()
I tried the following, however it didn't work:
df = spark.sql("""select key from table0 """)
a = df.select("key").collect()
spark.sql("""select country from table1 where key in ({0})""".format(a)).show()
Can somebody help me?
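One likely reason the attempt fails: `collect()` returns a list of `Row` objects, so `format(a)` puts something like `[Row(key=10000000000004), ...]` inside the parentheses rather than a comma-separated list of values. Below is a minimal sketch of one fix, assuming the keys are integers and there are few enough of them to collect to the driver. `build_in_query` is a hypothetical helper name, not part of Spark.

```python
def build_in_query(keys):
    """Render integer keys as the IN list of the query from the question."""
    if not keys:
        # "in ()" is not valid SQL, so refuse an empty key list.
        raise ValueError("keys must not be empty")
    # int() raises on non-numeric values, so stray text cannot end up
    # inside the formatted SQL string.
    in_list = ", ".join(str(int(k)) for k in keys)
    return "select country from table1 where key in ({0})".format(in_list)


# Usage with the question's dataframe (needs a running SparkSession):
#   keys = [row["key"] for row in df.select("key").collect()]
#   spark.sql(build_in_query(keys)).show()
```

This keeps the original two-query shape, but every key still passes through the driver, so it only suits small key sets.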
Use a join instead.
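A sketch of the join approach, assuming `table0` can be queried by name in the same session as `table1` (as the question's first query suggests). A left semi join keeps the rows of `table1` whose `key` appears in `table0`, which matches the meaning of IN and avoids collecting anything to the driver. `SEMI_JOIN_QUERY` and `countries_for_keys` are hypothetical names used only for illustration.

```python
# Left semi join: return table1 rows that have at least one matching key
# in table0, without duplicating rows when table0 repeats a key.
SEMI_JOIN_QUERY = """
select t1.country
from table1 t1
left semi join table0 t0
  on t1.key = t0.key
"""


def countries_for_keys(spark):
    """Run the semi join on an existing SparkSession and return a dataframe."""
    return spark.sql(SEMI_JOIN_QUERY)


# Usage (needs a running SparkSession named spark):
#   countries_for_keys(spark).show()
```

If the keys already live in a dataframe `df`, the DataFrame API equivalent would be along the lines of `spark.table("table1").join(df, on="key", how="left_semi").select("country")`. A subquery such as `where key in (select key from table0)` should express the same thing in plain SQL.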