1

I have many dataframes from 0 to N that need to merge. They all has one common column called 'NAME'. I tried different ways on merge and concat but dataframes doesn't return the format I need

These are some samples of my data

d1 = {'col1': [1,1], 'col2': [1,1], 'NAME': ["A","A"]}
d2 = {'col3': [1,1], 'col4': [1,1], 'NAME': ["A","A"]}
d3 = {'col1': [2,2], 'col2': [2,2], 'NAME': ["B","B"]}
d4 = {'col3': [2,2], 'col4': [2,2], 'NAME': ["B","B"]}
d5 = {'col5': []}
d6 = {'col5': []}

df_list = [d1, d2, d3, d4, d5, d6]

test_df = []
for item in df_list:
    temp_df = pd.DataFrame(data=item)
    test_df.append(temp_df)

merged = reduce(lambda left, right: pd.merge(left, right, on="NAME", how='outer'),test_df)

The above code also give me an error: KeyError: 'NAME'

I expect my merged result to be

   col1 col2 col3 col4 NAME
0     1    1    1    1    A
1     1    1    1    1    A
0     2    2    2    2    B
1     2    2    2    2    B

2 Answers 2

1

Assuming unique NAME values per dataframe, I would use pandas.concat here:

(pd
 .concat(test_df).rename_axis('index')
 .groupby(['NAME', 'index']).first()
 .reset_index('NAME')
)

output:

      NAME  col1  col2  col3  col4
index                             
0        A   1.0   1.0   1.0   1.0
1        A   1.0   1.0   1.0   1.0
0        B   2.0   2.0   2.0   2.0
1        B   2.0   2.0   2.0   2.0
Sign up to request clarification or add additional context in comments.

Comments

0
d1 = {'col1': [1,1], 'col2': [1,1], 'NAME': ["A","A"]}
d2 = {'col3': [1,1], 'col4': [1,1], 'NAME': ["A","A"]}
d3 = {'col1': [2,2], 'col2': [2,2], 'NAME': ["B","B"]}
d4 = {'col3': [2,2], 'col4': [2,2], 'NAME': ["B","B"]}
d5 = {'col5': []}
d6 = {'col5': []}

df_list = [d1, d2, d3, d4]

test_df = []
for item in df_list:
    temp_df = pd.DataFrame(data=item)
    test_df.append(temp_df)

merged = reduce(lambda left, right: pd.merge(left, right, on="NAME", how='outer'),test_df)
merged["col5"] = None
col1_x col2_x NAME col3_x col4_x col1_y col2_y col3_y col4_y col5
0 1 1 A 1 1 nan nan nan nan
1 1 1 A 1 1 nan nan nan nan
2 1 1 A 1 1 nan nan nan nan
3 1 1 A 1 1 nan nan nan nan
4 nan nan B nan nan 2 2 2 2
5 nan nan B nan nan 2 2 2 2
6 nan nan B nan nan 2 2 2 2
7 nan nan B nan nan 2 2 2 2

pandas.DataFrame.merge

If you just want to expand one column, then you can add a separate column without merge

KeyError: 'NAME'

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.