
I have two DataFrames. I want to check whether df1 contains any row of df2, matching on the key columns a and b. If a match is found, set exists to True for that row in df2. Then add the rows from df1 as new rows with exists set to False.

df1

a | b | c | d
1 | 1 | 3 | 4
2 | 2 | 4 | 1
3 | 3 | 5 | 3

df2

a | b | c | d
1 | 1 | 4 | 5
4 | 4 | 3 | 2

The result should look like this:

df3

a | b | c | d | exists
1 | 1 | 4 | 5 | True
4 | 4 | 3 | 2 | False
1 | 1 | 3 | 4 | False
2 | 2 | 4 | 1 | False
3 | 3 | 5 | 3 | False

So far I have this:

val newdf = df1.join(df2, df1("a")===df2("a") && df1("b") === df2("b"), "left")
   .select(df2("a"), df2("b"),df2("c"),df2("d"),when(df2("a").isNull, false).otherwise(true).alias("exists"))

which returns

a | b | c | d | exists
1 | 1 | 4 | 5 | True
The rest of the rows are null.
  • Is the row 1 | 1 | 3 | 4 | False going to be in df3 too, given that there is a matching row in df1? Commented Aug 3, 2020 at 20:06
  • Yes, all rows from both DataFrames will be in df3. The rows from df2 that have a match will have exists set to True in df3. Commented Aug 3, 2020 at 20:10

1 Answer


Try left_semi and left_anti joins, then unionAll the resulting datasets.

Example:

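import org.apache.spark.sql.functions.lit

// 1) df2 rows whose (a, b) key also appears in df1 -> exists = "True"
// 2) df2 rows with no matching key in df1         -> exists = "False"
// 3) every df1 row, appended as a new row          -> exists = "False"
df2.join(df1,Seq("a","b"),"left_semi").withColumn("exists",lit("True")).
unionAll(df2.join(df1,Seq("a","b"),"left_anti").withColumn("exists",lit("False"))).
unionAll(df1.withColumn("exists",lit("False"))).show()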
//+---+---+---+---+------+
//|  a|  b|  c|  d|exists|
//+---+---+---+---+------+
//|  1|  1|  4|  5|  True|
//|  4|  4|  3|  2| False|
//|  1|  1|  3|  4| False|
//|  2|  2|  4|  1| False|
//|  3|  3|  5|  3| False|
//+---+---+---+---+------+
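
For reference, here is a self-contained sketch of the same approach that builds the sample data from the question. The local SparkSession, the object name ExistsFlagSketch, the build helper, and the Boolean literals (lit(true) / lit(false) in place of the strings used above) are assumptions for illustration, not part of the original answer. With Boolean literals, show() prints the flag in lowercase.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.lit

// Hypothetical wrapper object, for illustration only.
object ExistsFlagSketch {

  def build(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Sample data copied from the question.
    val df1 = Seq((1, 1, 3, 4), (2, 2, 4, 1), (3, 3, 5, 3)).toDF("a", "b", "c", "d")
    val df2 = Seq((1, 1, 4, 5), (4, 4, 3, 2)).toDF("a", "b", "c", "d")

    val keys = Seq("a", "b")

    // df2 rows whose key also appears in df1.
    val matched = df2.join(df1, keys, "left_semi").withColumn("exists", lit(true))
    // df2 rows whose key does not appear in df1.
    val unmatched = df2.join(df1, keys, "left_anti").withColumn("exists", lit(false))
    // All df1 rows, appended as new rows.
    val added = df1.withColumn("exists", lit(false))

    matched.union(unmatched).union(added)
  }

  def main(args: Array[String]): Unit = {
    // Local session is an assumption; use the existing session in a real job.
    val spark = SparkSession.builder().master("local[*]").appName("exists-flag").getOrCreate()
    build(spark).show()
    spark.stop()
  }
}
```

Both semi and anti joins return only the columns of the left side, so each of the three branches ends up with a, b, c, d plus the new exists column, which is what makes the positional union line up.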

1 Comment

That works, thanks. One other thing: df2 serves as a truth table, so the next time it is read it will already have the exists column. If I add that, I get this error: Union can only be performed on tables with the same number of columns, but the first table has 8 columns and the 2th table has 7 columns;;
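
The error quoted in this comment comes from the fact that union and unionAll match columns by position and require every input to have the same number of columns. The comment does not show the code that triggered it, so the following is only a sketch of one way to guard against the mismatch, assuming the stored df2 may already carry an exists column from a previous run: drop the old flag, recompute it, and select the same explicit column list on every branch. The object name ExistsFlag, the helper withExistsFlag, and its dataCols parameter are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}

// Hypothetical helper object, for illustration only.
object ExistsFlag {

  // keys:     join columns, e.g. Seq("a", "b")
  // dataCols: all non-flag columns shared by both frames, e.g. Seq("a", "b", "c", "d")
  def withExistsFlag(df1: DataFrame, df2: DataFrame,
                     keys: Seq[String], dataCols: Seq[String]): DataFrame = {
    // Same columns, same order, on every branch of the union.
    val outCols = (dataCols :+ "exists").map(col)

    // drop is a no-op when the column is absent, so this is safe on the first run too.
    val truth = df2.drop("exists")
    val incoming = df1.drop("exists")

    val matched = truth.join(incoming, keys, "left_semi")
      .withColumn("exists", lit(true)).select(outCols: _*)
    val unmatched = truth.join(incoming, keys, "left_anti")
      .withColumn("exists", lit(false)).select(outCols: _*)
    val added = incoming
      .withColumn("exists", lit(false)).select(outCols: _*)

    matched.union(unmatched).union(added)
  }
}
```

A call would look like ExistsFlag.withExistsFlag(df1, df2, Seq("a", "b"), Seq("a", "b", "c", "d")). This sketch recomputes the flag on every run; whether previously stored flags should instead be preserved depends on the use case, which the comment does not spell out.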
