I'm trying to merge two Excel sheets on their common field Serial, but the script throws an error. My program is below:

 (user1_env)root@ubuntu:~/user1/test/compare_files# cat compare.py
import pandas as pd

source1_df = pd.read_excel('a.xlsx', sheetname='source1')
source2_df = pd.read_excel('a.xlsx', sheetname='source2')
joined_df = source1_df.join(source2_df, on='Serial')

joined_df.to_excel('/root/user1/test/compare_files/result.xlsx')

I get the following error:

    (user1_env)root@ubuntu:~/user1/test/compare_files# python3.5 compare.py
Traceback (most recent call last):
  File "compare.py", line 5, in <module>
    joined_df = source1_df.join(source2_df, on='Serial')
  File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/core/frame.py", line 4385, in join
    rsuffix=rsuffix, sort=sort)
  File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/core/frame.py", line 4399, in _join_compat
    suffixes=(lsuffix, rsuffix), sort=sort)
  File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/tools/merge.py", line 39, in merge
    return op.get_result()
  File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/tools/merge.py", line 223, in get_result
    rdata.items, rsuf)
  File "/home/user1/miniconda3/envs/user1_env/lib/python3.5/site-packages/pandas/core/internals.py", line 4445, in items_overlap_with_suffix
    to_rename)
ValueError: columns overlap but no suffix specified: Index(['Serial'], dtype='object')

I'm following this SO question for reference: python compare two excel sheet and append correct record

2 Answers

A small modification worked for me: use pd.merge instead of DataFrame.join.

import pandas as pd

source1_df = pd.read_excel('a.xlsx', sheetname='source1')
source2_df = pd.read_excel('a.xlsx', sheetname='source2')
joined_df = pd.merge(source1_df, source2_df, on='Serial', how='outer')
joined_df.to_excel('/home/gk/test/result.xlsx')
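
To make the effect of how='outer' easier to see, here is a minimal sketch that uses two small in-memory frames instead of the Excel sheets. The frame contents and the Name and Price columns are made up for illustration; only Serial comes from the question.

```python
import pandas as pd

# Hypothetical stand-ins for the 'source1' and 'source2' sheets.
left = pd.DataFrame({"Serial": [1, 2, 3], "Name": ["a", "b", "c"]})
right = pd.DataFrame({"Serial": [2, 3, 4], "Price": [20, 30, 40]})

# Outer merge: keeps every Serial found in either frame and
# fills the missing side with NaN.
outer = pd.merge(left, right, on="Serial", how="outer")

# Default (inner) merge: keeps only the Serial values present in both frames.
inner = pd.merge(left, right, on="Serial")

print(outer)
print(inner)
```

Because merge matches the Serial column of one frame against the Serial column of the other, the result has a single Serial column and no suffix is needed.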

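The error occurs because both frames contain a column named Serial, so the column names overlap after the join. DataFrame.join matches the on column of the calling frame against the index of the other frame, not against the other frame's Serial column, so both Serial columns would end up in the result. You can either set the index of both frames to Serial and then join, or pass rsuffix= or lsuffix= to join so that the suffix is appended to the overlapping column names.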
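
Both options can be sketched as follows, using small in-memory frames rather than the Excel sheets. The data and the Name and Price columns are invented for illustration, as are the "_left" and "_right" suffix strings.

```python
import pandas as pd

# Hypothetical stand-ins for the 'source1' and 'source2' sheets.
source1_df = pd.DataFrame({"Serial": [1, 2, 3], "Name": ["a", "b", "c"]})
source2_df = pd.DataFrame({"Serial": [2, 3, 4], "Price": [20, 30, 40]})

# Option 1: make Serial the index of both frames, then join on the index.
# join defaults to how='left', so only Serials from source1_df are kept.
joined_by_index = source1_df.set_index("Serial").join(source2_df.set_index("Serial"))

# Option 2: keep the default integer index and disambiguate the
# overlapping Serial column with suffixes.
joined_with_suffix = source1_df.join(source2_df, lsuffix="_left", rsuffix="_right")

print(joined_by_index)
print(joined_with_suffix)
```

Note that Option 2 silences the error but pairs rows by their position in the index rather than by Serial value, so it is probably not what you want when the goal is to match records on Serial.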
