Coming from a SQL background here. I'm using `df1 = spark.read.jdbc` to load data from Azure SQL into a DataFrame. I am trying to filter out rows meeting the following criteria:
df2 = df1.filter("ItemID <> '75' AND Code1 <> 'SL'")
The resulting DataFrame ends up empty, but when I run the equivalent SQL it returns the correct rows. When I change it to
df2 = df1.filter("ItemID **=** '75' AND Code1 **=** 'SL'")
it returns exactly the rows I want to filter out.
What is the best way to remove the rows meeting these criteria, so the remaining data can be pushed to SQL Server? Thank you.
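The likely culprit is the boolean logic rather than the `<>` operator itself: `ItemID <> '75' AND Code1 <> 'SL'` keeps only rows where *neither* column matches, and under SQL's three-valued logic any row with a NULL in either column is dropped as well. To exclude just the rows where *both* conditions hold, negate the conjunction (De Morgan's law). A minimal PySpark sketch, assuming `df1` is the DataFrame loaded above:

```python
from pyspark.sql import functions as F

# Keep every row EXCEPT those where both conditions hold:
# NOT (A AND B)  is equivalent to  (NOT A) OR (NOT B)
df2 = df1.filter(~((F.col("ItemID") == "75") & (F.col("Code1") == "SL")))

# The same filter written as a SQL expression string:
df2 = df1.filter("NOT (ItemID = '75' AND Code1 = 'SL')")
```

If either column can contain NULLs and those rows should be kept, use the null-safe equality test instead: `~(F.col("ItemID").eqNullSafe("75") & F.col("Code1").eqNullSafe("SL"))`.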
filter($"ItemIID" =!= "75" && $"Code1" =!= "SL" )