3

I have two dataframes that look like this:

df1

A    B    C
5    1    5
4    2    8
2    5    3

df2

A    B    C    D
4    3    4    1
3    5    1    2
1    2    5    4

df1 and df2 share the same column names except "D", which is only found in df2. What I would like to do is add D to df1 and fill all of its rows with 0.

In other words, if a column exists in df2 but not in df1, add that column to df1 with every value set to 0 (below).

df1

A    B    C    D
5    1    5    0
4    2    8    0
2    5    3    0

I realize it would be very easy to add a single column called "D" to df1, but this is just a dummy example; in reality I am dealing with many more, much larger dataframes. So I am looking for a way to do this with code I could run in a loop or iteratively.

2
  • I think the word add is getting confusing here. Seems like you simply want to create a column 'D' with the static value of 0 in df1, leaving everything else unchanged? Commented Dec 20, 2021 at 19:04
  • @ALollz that's correct. I will edit it to sound that way Commented Dec 20, 2021 at 19:06

4 Answers

2

You can find the missing columns with Index.difference.

There are many ways to assign multiple columns with a static value to a DataFrame. Here is one: unpack a dictionary whose keys are the missing column names and whose values are the static value you want to assign.

df1 = df1.assign(**{x: 0 for x in df2.columns.difference(df1.columns)})

   A  B  C  D
0  5  1  5  0
1  4  2  8  0
2  2  5  3  0
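
On the column-order question: Index.difference returns its result sorted by default, so when several columns are missing they will not necessarily come back in df2's order. A minimal sketch of an order-preserving variant, using the example data from the question; the names missing_sorted, missing_ordered and result are illustrative and not part of the original answer:

```python
import pandas as pd

# Example data from the question
df1 = pd.DataFrame({"A": [5, 4, 2], "B": [1, 2, 5], "C": [5, 8, 3]})
df2 = pd.DataFrame({"A": [4, 3, 1], "B": [3, 5, 2], "C": [4, 1, 5], "D": [1, 2, 4]})

# Index.difference sorts its result by default, which can reorder the missing columns
missing_sorted = df2.columns.difference(df1.columns)

# A list comprehension walks df2's columns in their original order instead
missing_ordered = [col for col in df2.columns if col not in df1.columns]

# assign(**dict) needs string column names; each new column is filled with integer 0
result = df1.assign(**{col: 0 for col in missing_ordered})
print(result)
```

Because assign appends the new columns after the existing ones, df1's own column order is left as it was.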

7 Comments

This works! Thank you. However, is there any way to preserve the order of the columns? They seem to be added quite randomly
@Melderon check my answer. I believe it does that for you, too.
@richardec for some reason, it turned everything into a float. I just need "0" as the above answer provides. Thank you, though
@Melderon that's easy to fix. Just add .astype(int), like in my answer.
@Melderon which columns are getting mis-ordered? The ones in DF1 or the ones in DF2 that are missing?
1

You can use DataFrame.add with fill_value:

print(df1.add(df2, fill_value=0))

Output:

   A  B  C    D
0  9  4  9  1.0
1  7  7  9  2.0
2  3  7  8  4.0

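Note: DataFrame.add sums the two dataframes element-wise, so the shared columns hold sums and D holds df2's values (as floats) rather than 0, as the output above shows. fill_value=0 also treats an existing NaN in one dataframe as 0 before adding.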


1

Try this:

df3 = df1.add(df2).fillna(0).astype(int)

Output:

>>> df3
   A  B  C  D
0  9  4  9  0
1  7  7  9  0
2  3  7  8  0


0

You can reindex one dataframe using the columns of another. reindex returns a new dataframe, so assign the result back:

df1 = df1.reindex(df2.columns, axis=1, fill_value=0)
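
Since the question asks for something that works in a loop over many dataframes, the reindex idea could be extended roughly as follows. This is only a sketch under assumptions: the frames dictionary, the all_columns list and the aligned result are hypothetical names, and the union of columns is built in first-seen order so that no dataframe loses a column.

```python
import pandas as pd

# Hypothetical collection of dataframes; the data is the example from the question
frames = {
    "df1": pd.DataFrame({"A": [5, 4, 2], "B": [1, 2, 5], "C": [5, 8, 3]}),
    "df2": pd.DataFrame({"A": [4, 3, 1], "B": [3, 5, 2], "C": [4, 1, 5], "D": [1, 2, 4]}),
}

# Collect every column name across all frames, keeping first-seen order
all_columns = []
for frame in frames.values():
    for col in frame.columns:
        if col not in all_columns:
            all_columns.append(col)

# Reindex each frame to the full column list; fill_value=0 only fills
# the newly created columns and leaves existing values untouched
aligned = {
    name: frame.reindex(columns=all_columns, fill_value=0)
    for name, frame in frames.items()
}

print(aligned["df1"])
```

Reindexing against the union rather than against df2.columns alone avoids dropping columns that exist in df1 but not in df2.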
