
This is my original dataframe.

df1

This is my second dataframe containing one column.

df2

I want to append the column from the second dataframe to the end of the original dataframe. The two dataframes have different indices. This is what I tried:

df1['RESULT'] = df2['RESULT']

It doesn't raise an error and the column is added, but all of its values are NaN. How do I add the column together with its values?


2 Answers


Assuming both dataframes have the same number of rows, you can assign RESULT_df['RESULT'].values to your original dataframe. The values are then assigned by position, so the differing indices no longer matter.

# pandas < 0.24
feature_file_df['RESULT'] = RESULT_df['RESULT'].values
# pandas >= 0.24
feature_file_df['RESULT'] = RESULT_df['RESULT'].to_numpy()
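
The positional assignment only works when the row counts match; otherwise pandas raises a ValueError. A minimal sketch that checks the lengths first, using made-up frames `left` and `right` and a helper `add_column_by_position` (all hypothetical names, not from the question):

```python
import pandas as pd

# Hypothetical data: same number of rows, unrelated index labels.
left = pd.DataFrame({"A": [1, 2, 3]}, index=[0, 1, 2])
right = pd.DataFrame({"RESULT": [10, 20, 30]}, index=[7, 8, 9])


def add_column_by_position(target, source, column):
    """Return a copy of `target` with `source[column]` appended by row position."""
    if len(target) != len(source):
        raise ValueError(f"row counts differ: {len(target)} vs {len(source)}")
    out = target.copy()
    # to_numpy() drops the index, so pandas cannot align on labels.
    out[column] = source[column].to_numpy()
    return out


combined = add_column_by_position(left, right, "RESULT")
print(combined)
```

Returning a copy rather than modifying `target` in place is just one possible design choice; assigning directly to the original dataframe works the same way.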

Minimal Code Sample

df
          A         B
0 -1.202564  2.786483
1  0.180380  0.259736
2 -0.295206  1.175316
3  1.683482  0.927719
4 -0.199904  1.077655

df2

           C
11 -0.140670
12  1.496007
13  0.263425
14 -0.557958
15 -0.018375

Let's try direct assignment first.

df['C'] = df2['C']
df

          A         B   C
0 -1.202564  2.786483 NaN
1  0.180380  0.259736 NaN
2 -0.295206  1.175316 NaN
3  1.683482  0.927719 NaN
4 -0.199904  1.077655 NaN
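
The NaNs come from index alignment: when a Series is assigned as a column, pandas matches rows by index label, not by position. A small sketch with stand-in data (rounded values, same index labels as the frames above) shows that the two indexes share no labels, so no row can be matched:

```python
import pandas as pd

# Stand-in data with the same index labels as the sample frames.
df = pd.DataFrame({"A": [-1.20, 0.18, -0.30, 1.68, -0.20]})  # index 0..4
df2 = pd.DataFrame(
    {"C": [-0.14, 1.50, 0.26, -0.56, -0.02]},
    index=[11, 12, 13, 14, 15],
)

# Labels present in both indexes; here there are none.
common = df.index.intersection(df2.index)
print(len(common))

# Series assignment aligns on labels; rows without a match become NaN.
df["C"] = df2["C"]
print(df["C"].isna().all())
```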

Now, assign the array returned by .values (or .to_numpy() for pandas versions >= 0.24). Both return a NumPy array, which has no index, so the values are assigned by position.

df2['C'].values 
array([-0.141,  1.496,  0.263, -0.558, -0.018])

df['C'] = df2['C'].values
df

          A         B         C
0 -1.202564  2.786483 -0.140670
1  0.180380  0.259736  1.496007
2 -0.295206  1.175316  0.263425
3  1.683482  0.927719 -0.557958
4 -0.199904  1.077655 -0.018375

You can also call set_axis() to replace the index of a dataframe or column. If the lengths are the same, set_axis() lets you relabel one dataframe's rows with the other dataframe's index, so that the assignment aligns on matching labels.

df1['A'] = df2['A'].set_axis(df1.index)
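
As a sketch of what set_axis() does here, with made-up data: it returns a new Series carrying the same values under the target's index labels, after which ordinary label-aligned assignment keeps every value. The frames below are illustrative only, and the behavior shown assumes a reasonably recent pandas (1.0 or later), where set_axis() returns a new object by default.

```python
import pandas as pd

# Illustrative frames with unrelated index labels.
df1 = pd.DataFrame({"X": [1, 2, 3]}, index=["a", "b", "c"])
df2 = pd.DataFrame({"A": [0.1, 0.2, 0.3]}, index=[10, 11, 12])

# Relabel df2['A'] with df1's index; df2 itself is left unchanged.
relabeled = df2["A"].set_axis(df1.index)

# The labels now match, so alignment keeps every value.
df1["A"] = relabeled
print(df1)
```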

If you get a SettingWithCopyWarning, you can avoid it by building a new dataframe with join() or assign() instead of assigning to the existing one.

df1 = df1.join(df2['A'].set_axis(df1.index))
# or
df1 = df1.assign(new_col = df2['A'].set_axis(df1.index))

set_axis() is especially useful if you want to add multiple columns from another dataframe. Call set_axis() on the selected columns of the other dataframe, then pass the result to join().

df1 = df1.join(df2[['A', 'B', 'C']].set_axis(df1.index))
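
A runnable sketch of the multi-column case, using made-up frames. Note one assumption built into the example: the joined column names do not already exist in df1, since join() raises an error on overlapping names unless a suffix is supplied.

```python
import pandas as pd

# Illustrative frames with unrelated index labels.
df1 = pd.DataFrame({"X": [1, 2, 3]}, index=["a", "b", "c"])
df2 = pd.DataFrame(
    {"A": [0.1, 0.2, 0.3], "B": [4, 5, 6], "C": ["p", "q", "r"]},
    index=[10, 11, 12],
)

# join() aligns on the index, so relabel df2's rows with df1's index first.
joined = df1.join(df2[["A", "B", "C"]].set_axis(df1.index))
print(joined)
```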
