0

Here's a dataframe, df1, I have

+---------+-------+---------+
|   C1    |   C2  |   C3    |
+---------+-------+---------+
|    xr   |   1   |   ixfg  |
|    we   |   5   |   jsfd  |
|    r5   |   7   |   hfga  |
|    by   |   8   |   srjs  |
|    v4   |   4   |   qwks  |
|    c0   |   0   |   khfd  |
|    ba   |   2   |   gdbu  |
|    mi   |   1   |   pdlo  |
|    lp   |   7   |   ztpq  |
+---------+-------+---------+

Here's another, df2, that I have

+----------+-------+---------+
|    V1    |   V2  |   V3    |
+----------+-------+---------+
|    Null  |   6   |   ixfg  |
|    Null  |   2   |   jsfd  |
|    Null  |   2   |   hfga  |
|    Null  |   7   |   qwks  |
|    Null  |   1   |   khfd  |
|    Null  |   9   |   gdbu  |
+----------+-------+---------+

What I would like to have is another dataframe that

  1. Ignores values in V2 and takes values in C2 whereever V3 and C3 match, and
  2. Replaces V1 with values in C1 wherever V3 and C3 match.

The result should look like the following:

+----------+-------+---------+
|    M1    |   M2  |   M3    |
+----------+-------+---------+
|    xr    |   1   |   ixfg  |
|    we    |   5   |   jsfd  |
|    r5    |   7   |   hfga  |
|    v4    |   4   |   qwks  |
|    c0    |   0   |   khfd  |
|    ba    |   2   |   gdbu  |
+----------+-------+---------+

1 Answer 1

1

You can join and use coalesce to take a value which has a higher priority.

** coalesce will take any number of columns (the highest priority to least in the order of arguments) and return first non-null value, so if you do want to replace with null when there is a null in the lower priority column, you cannot use this function.

df = (df1.join(df2, on=(df1.C3 == df2.V3))
      .select(F.coalesce(df1.C1, df2.V1).alias('M1'), 
              F.coalesce(df2.V2, df1.C2).alias('M2'), 
              F.col('C3').alias('M3')))
Sign up to request clarification or add additional context in comments.

1 Comment

coalesce() was very helpful. I have been doing when(df.c.isNull(),...); but this is much simpler. ++ for that.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.