I am comparing two CSV files, but the resulting update.csv is the same as new.csv.

import csv

with open('old.csv', 'r') as t1:
    old_csv = t1.readlines()

with open('new.csv', 'r') as t2:
    new_csv = t2.readlines()

with open('update.csv', 'w') as out_file:
    line_in_new = 0
    line_in_old = 0
    while line_in_new < len(new_csv) and line_in_old < len(old_csv):
        if old_csv[line_in_old] != new_csv[line_in_new]:
            out_file.write(new_csv[line_in_new])
        else:
            line_in_old += 1
        line_in_new += 1

I want the output to match the sample below.

Sample:

Input:

old.csv

a,b,c
1,2,3
4,5,6
8,9,9

new.csv

a,b,c
1,2,3
5,6,7
8,9,7

Output:

update.csv

4,5,6,deleted
5,6,7,new added 
8,9,9,change

Please help me write only the differences to update.csv.

  • What do you mean by difference? Please post a clear input sample and the desired output. Commented Apr 29, 2018 at 7:06

1 Answer

A solution using pandas:

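import pandas as pd

df1 = pd.read_csv('old.csv')
df2 = pd.read_csv('new.csv')

# Tag each row with the file it came from.
df1['flag'] = 'old'
df2['flag'] = 'new'

df = pd.concat([df1, df2])

# Compare on every column except 'flag'. keep=False drops all copies of a
# row that appears in both files, so only the differing rows remain.
dups_dropped = df.drop_duplicates(subset=df.columns.difference(['flag']), keep=False)
dups_dropped.to_csv('update.csv', index=False)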

Input:

old.csv

a,b,c
1,2,3
4,5,6

new.csv

a,b,c
1,2,3
5,6,7

Output:

update.csv

a,b,c,flag
4,5,6,old
5,6,7,new
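
The pandas snippet above only flags rows as old or new. To get the deleted / new added / change labels from the question's sample, the rows need some notion of identity. Here is a minimal sketch using only the standard csv module. It assumes that the first column is a unique key for each row, which the question does not state. The names read_rows, diff_rows, write_update and key_index are illustrative and not part of any library.

```python
import csv


def read_rows(path):
    """Return the data rows of a CSV file, skipping the header line."""
    with open(path, newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header (a,b,c in the sample)
        return [row for row in reader if row]


def diff_rows(old_rows, new_rows, key_index=0):
    """Label differing rows as 'deleted', 'new added' or 'change'.

    Assumption: the value at key_index identifies a row uniquely.
    """
    old_by_key = {row[key_index]: row for row in old_rows}
    new_by_key = {row[key_index]: row for row in new_rows}

    labelled = []
    # Keys are strings, so this sort is lexicographic, not numeric.
    for key in sorted(old_by_key.keys() | new_by_key.keys()):
        if key not in new_by_key:
            labelled.append(old_by_key[key] + ['deleted'])
        elif key not in old_by_key:
            labelled.append(new_by_key[key] + ['new added'])
        elif old_by_key[key] != new_by_key[key]:
            # Same key, different values: report the old version of the row.
            labelled.append(old_by_key[key] + ['change'])
    return labelled


def write_update(old_path='old.csv', new_path='new.csv', out_path='update.csv'):
    """Write only the labelled differences to out_path."""
    rows = diff_rows(read_rows(old_path), read_rows(new_path))
    with open(out_path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
```

With the question's sample files, calling write_update() would write the three requested lines (4,5,6,deleted / 5,6,7,new added / 8,9,9,change). If no column can serve as a key, a changed row cannot be told apart from one deleted row plus one added row.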

1 Comment

Thanks, Ashish. If I want to show only the differences, meaning old & new, how can I get that?
