I have two CSV files and I need to check for creations, updates and deletions. Take the following example files:
ORIGINAL FILE
sku1,A
sku2,B
sku3,C
sku4,D
sku5,E
sku6,F
sku7,G
sku8,H
sku9,I
sku10,J
UPDATED FILE
sku1,A
sku2,B-UPDATED
sku3,C
sku5,E
sku6,F
sku7,G-UPDATED
sku11, CREATED
sku8,H
sku9,I
sku4,D-UPDATED
I am using the Linux comm command as follows:
comm -23 --nocheck-order updated_file.csv original_file.csv > diff_file.csv
This gives me all newly created and updated rows:
sku2,B-UPDATED
sku7,G-UPDATED
sku11, CREATED
sku4,D-UPDATED
That is great, but if you look closely, "sku10,J" has been deleted, and I'm not sure of the best command or approach to detect that. The data above is only a demo: the text "sku" does not appear in the real data, but column one of each CSV file is a unique 5-character identifier. Any advice is appreciated.
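To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution. Because column one is a unique key, the files can be compared by key instead of by whole line. The helper names deleted_rows, created_rows and changed_rows are hypothetical, and the sketch assumes plain comma-separated rows with no quoted fields containing commas. It uses awk rather than comm because comm expects sorted input, and these files are not sorted.

```shell
# Hypothetical helpers; each takes ORIGINAL then UPDATED as arguments.
# Assumption: column 1 is a unique key and fields contain no embedded commas.
# Caveat: the NR == FNR idiom misbehaves if the first file awk reads is empty.

# Rows whose key exists in ORIGINAL but not in UPDATED (deletions).
deleted_rows() {
    # Read UPDATED first to collect its keys, then filter ORIGINAL.
    awk -F, 'NR == FNR { seen[$1]; next } !($1 in seen)' "$2" "$1"
}

# Rows whose key exists in UPDATED but not in ORIGINAL (creations).
created_rows() {
    awk -F, 'NR == FNR { seen[$1]; next } !($1 in seen)' "$1" "$2"
}

# Rows whose key exists in both files but whose full line differs (updates).
changed_rows() {
    awk -F, 'NR == FNR { old[$1] = $0; next } ($1 in old) && old[$1] != $0' "$1" "$2"
}
```

With the example files above, deleted_rows original_file.csv updated_file.csv prints sku10,J, created_rows prints the sku11 row, and changed_rows prints the three -UPDATED rows in the order they appear in the updated file.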