Adding a dataframe to an existing dataframe at specific rows and columns

Question

I have a loop that creates a dataframe (DF) on each iteration, with this form:

DF

  ID        LCAR        RCAR  ...     LPCA1     LPCA2     RPCA2
0 d0129  312.255859  397.216797  ...  1.098888  1.101905  1.152332

I then want to add that dataframe to an existing dataframe (main_exl_df) with this form:

main_exl_df

         ID  Date  ...  COGOTH3  COGOTH3X  COGOTH3F
0     d0129   NaN  ...      NaN       NaN       NaN
1     d0757   NaN  ...      0.0       NaN       NaN
2     d2430   NaN  ...      NaN       NaN       NaN
3     d3132   NaN  ...      0.0       NaN       NaN
4     d0371   NaN  ...      0.0       NaN       NaN
...     ...   ...  ...      ...       ...       ...
2163  d0620   NaN  ...      0.0       NaN       NaN
2164  d2410   NaN  ...      0.0       NaN       NaN
2165  d0752   NaN  ...      NaN       NaN       NaN
2166  d0407   NaN  ...      0.0       NaN       NaN

At each iteration, main_exl_df is saved and then loaded again for the next iteration.

I tried

main_exl_df = pd.concat([main_exl_df, DF], axis=1)

but this adds the columns to the right side of main_exl_df each time and does not match rows on the 'ID' column.

How can I specify that the new dataframe (DF) should be added at the row with the correct ID and in the right columns?

I have also tried main_exl_df = pd.merge(main_exl_df, DF, on=main_exl_df.columns[0]) to match on the correct ID, but when I save main_exl_df, only one row is saved and the rest of the columns and rows are lost.
– Rei Rei, Commented Aug 20, 2020 at 15:25

Dimple Singhania · Accepted Answer · 2020-08-20 19:58:59Z

2

Merge is the way to go for combining columns in such cases. When you use pd.merge, you need to specify whether the merge is inner, left or right. Assuming that in this case, you want to keep all the rows in main_exl_df, you should merge using:

main_exl_df = main_exl_df.merge(DF, how='left', on='ID')

If you want to keep rows from both dataframes, pass 'outer' as the value of how:

main_exl_df = main_exl_df.merge(DF, how='outer', on='ID')
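
As a rough illustration of the difference between the two calls above, here is a minimal sketch on toy data. The frames are cut-down stand-ins for main_exl_df and DF, and the extra ID 'd9999' and its value are made up for the example; they are not from the question.

```python
import pandas as pd

# Toy stand-in for main_exl_df: only the 'ID' column is kept.
main_exl_df = pd.DataFrame({"ID": ["d0129", "d0757", "d2430"]})

# Toy stand-in for DF. 'd9999' is a hypothetical ID that does not
# exist in main_exl_df, added to show where 'left' and 'outer' differ.
DF = pd.DataFrame({"ID": ["d0129", "d9999"], "LCAR": [312.255859, 1.0]})

# how='left' keeps exactly the rows of main_exl_df;
# IDs that appear only in DF are discarded.
left = main_exl_df.merge(DF, how="left", on="ID")

# how='outer' keeps the union of IDs from both frames.
outer = main_exl_df.merge(DF, how="outer", on="ID")

print(left)
print(outer)
```

In this sketch the left merge keeps the three original rows, with LCAR filled in only for d0129, while the outer merge also gains a row for the ID that exists only in DF.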


3 Comments

Rei Rei Over a year ago

Thank you. This fixed the problem of merge not saving all the rows, and it correctly matches the two dataframes on the 'ID' column. However, how='left' and how='outer' both gave the same output: each merge created new columns (from main_exl_df with an _x suffix and from DF with a _y suffix). To fix it I tried main_exl_df = main_exl_df.merge(DF, how='outer', on=columns_label), where columns_label is the list of all column labels shared by both dataframes, but this didn't fix the problem either.

Dimple Singhania Over a year ago

@ReiRei This means that you have other common columns in the dataframes too. To fix this, you can merge on all the common columns and not just on 'ID' column. Also, check out (stackoverflow.com/questions/19125091/…) to remove duplicate columns while merging.

Rei Rei Over a year ago

Thank you very much for your answer. I used the link you sent to solve the problem. I upvoted your answer, but unfortunately it won't show publicly because my reputation is less than 15 right now.

Rei Rei · Accepted Answer · 2020-08-24 19:50:38Z

1

This is what solved the problem in the end (with the help of this answer):

I used the merge function; however, merge created duplicate columns with _x and _y suffixes. To get rid of the _x columns, I used this function:

    def drop_x(df):
        # list comprehension of the cols that end with '_x'
        to_drop = [x for x in df if x.endswith('_x')]
        df.drop(to_drop, axis=1, inplace=True)

I then merged the two dataframes, replacing the _y suffix with an empty string:

    # Index.drop_duplicates() accepts only a `keep` option, not another dataframe
    col_to_use = DF.columns.drop_duplicates()
    main_exl_df = main_exl_df.merge(DF[col_to_use], on='ID', how='outer', suffixes=('_x', ''))
    drop_x(main_exl_df)
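
To see how the steps above behave across iterations, here is a minimal sketch that runs them twice on toy data. The helper name merge_iteration, the frames DF_first and DF_second, and the value 100.0 are hypothetical, introduced only for illustration, and drop_x is written here to return a new frame rather than modify in place.

```python
import pandas as pd


def drop_x(df):
    """Return df without the columns whose names end in '_x'."""
    to_drop = [col for col in df.columns if col.endswith("_x")]
    return df.drop(columns=to_drop)


def merge_iteration(main_df, new_df):
    """Hypothetical helper: one loop iteration of the merge-and-drop steps."""
    # Overlapping columns from main_df get '_x'; those from new_df keep their name.
    merged = main_df.merge(new_df, on="ID", how="outer", suffixes=("_x", ""))
    return drop_x(merged)


# Toy stand-ins for main_exl_df and for the DF produced in two iterations.
main_exl_df = pd.DataFrame({"ID": ["d0129", "d0757", "d2430"],
                            "Date": [None, None, None]})
DF_first = pd.DataFrame({"ID": ["d0129"], "LCAR": [312.255859]})
DF_second = pd.DataFrame({"ID": ["d0757"], "LCAR": [100.0]})  # made-up value

after_first = merge_iteration(main_exl_df, DF_first)
after_second = merge_iteration(after_first, DF_second)

print(after_first)
print(after_second)
```

One thing worth checking with this approach: in this toy run, the LCAR value written for d0129 in the first iteration is NaN again after the second, because it lived in the LCAR_x column that gets dropped. If earlier values need to survive, setting 'ID' as the index and using DataFrame.combine_first may be a better fit, though that is a different technique from the one in this answer.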

edited Aug 24, 2020 at 19:50

answered Aug 24, 2020 at 19:40

