Using Spark 2.3.2 with Python, I am trying to use "alias" to join two dataframes after applying a filter, all in a single chained expression, as in the code below. But it throws the error shown below.
Code:
orders.filter(orders.order_status.isin("CLOSED","COMPLETE")).select("order_id","order_date").alias("a").\
join(orderitems.select("order_item_order_id","order_item_subtotal").alias("b"),a.order_id==b.order_item_order_id).\
drop(b.order_item_order_id)
Error:
NameError: name 'a' is not defined
I need to get the CLOSED and COMPLETE orders from the orders dataframe and then, in the same step, join the resulting dataframe with the orderitems dataframe and drop the duplicate join column. In other words, I am looking to apply an "alias" to a dataframe the same way an alias is applied to a table in SQL. Could anyone please help me understand where I am going wrong?
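For reference, a multi-step sketch of the same logic using explicit Python variables (the names a and b below are just placeholders, not the .alias() names) would look roughly like this:

# filter CLOSED/COMPLETE orders and keep only the columns needed for the join
a = orders.filter(orders.order_status.isin("CLOSED", "COMPLETE")).select("order_id", "order_date")

# keep only the columns needed from the order items dataframe
b = orderitems.select("order_item_order_id", "order_item_subtotal")

# join on the order id and drop the duplicated join key coming from the right side
result = a.join(b, a.order_id == b.order_item_order_id).drop(b.order_item_order_id)

This is what I want to write as a single chained expression using .alias("a") and .alias("b") instead of the intermediate variables.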