
I have two CSV files that share similar headers. sample_scv_1.csv is:

Transaction_date,Name,Payment_Type,Product
1/2/09 6:17,NA,Mastercard,NA
1/2/09 4:53,NA,Visa,NA
1/2/09 13:08,Nick,Mastercard,NA
1/3/09 14:44,Larry,Visa,Goods
1/4/09 12:56,Tina,Visa,Services
1/4/09 13:19,Harry,Visa,Goods

Similarly, sample_scv_2.csv is:

Transaction_date,Product,Name
1/2/09 6:17,Goods,Janis
1/2/09 4:53,Services,Nicola
1/2/09 13:08,Materials,Asuman

In these two files the columns/fields Transaction_date, Product, and Name are common, and I want to replace the Product and Name fields in sample_scv_1.csv iff the transaction date matches in both files.

This is a toy example; my real file is big. For this example I can split off the rows whose dates match and replace the columns by index using csvtool:

head -4 sample_scv_1.csv > temp1.csv    # header + the 3 rows whose dates match sample_scv_2.csv
tail -3 sample_scv_1.csv > temp1_1.csv  # the remaining rows
#sudo apt-get install csvtool
csvtool pastecol 2,4 3,2 temp1.csv sample_scv_2.csv > temp1_2.txt  # paste cols 3,2 of file2 into cols 2,4 of file1
cat temp1_2.txt temp1_1.csv > sample_scv_1.csv

My required output is:

Transaction_date,Name,Payment_Type,Product
1/2/09 6:17,Janis,Mastercard,Goods
1/2/09 4:53,Nicola,Visa,Services
1/2/09 13:08,Asuman,Mastercard,Materials
1/3/09 14:44,Larry,Visa,Goods
1/4/09 12:56,Tina,Visa,Services
1/4/09 13:19,Harry,Visa,Goods

I can determine up to which line the transaction dates match, but I cannot know the indexes where the two columns overlap (like Name and Product in the first file). One issue is easy: all columns of sample_scv_2.csv will be present in sample_scv_1.csv. Is there any way to do this efficiently?

  • Please let us know what you have tried. Most of us here are happy to help you improve your craft, but are less happy acting as short order unpaid programming staff. Show us your work so far in an MCVE, the result you were expecting and the results you got from the attempt you made to solve this yourself, and we'll help you figure it out. Commented Oct 7, 2016 at 2:04
  • "your file is big": how big? Commented Oct 7, 2016 at 2:57
  • @ghoti : Thanks. However, I have shown an example of what I have tried with the csvtool above. I haven't mentioned others for brevity. Commented Oct 7, 2016 at 3:21
  • @JamesBrown : My data has around 350 columns and 500k rows. Commented Oct 7, 2016 at 3:22
  • Both files are the same size? Commented Oct 7, 2016 at 3:22

1 Answer


Since the file with fewer columns is small enough to fit in memory, here is a solution in awk:

$ cat program.awk
BEGIN {FS=OFS=","}         # set the file separators
NR==FNR {                  # for the first file
    p[$1]=$2               # store the product, use date as key
    n[$1]=$3               # name
    next                   # no more processing for the first file
} 
$1 in p {                  # if date found in first processed file
    if($2=="NA") $2=n[$1]  # replace NA with name
    if($4=="NA") $4=p[$1]  # replace NA with product
} 1                        # print the record

Run it:

awk -f program.awk file2 file1
Transaction_date,Name,Payment_Type,Product
1/2/09 6:17,Janis,Mastercard,Goods
1/2/09 4:53,Nicola,Visa,Services
1/2/09 13:08,Nick,Mastercard,Materials
1/3/09 14:44,Larry,Visa,Goods
1/4/09 12:56,Tina,Visa,Services
1/4/09 13:19,Harry,Visa,Goods

2 Comments

Thanks! I want to replace everything, not only the cases where there are NAs. Also, the solution assumes we know the indexes where replacements are needed, but that is a little difficult in a file with 350 columns. Can it be generalized?
You can store every value from file2 in memory and use that to replace fields in file1. You need to know the indexes to match the records; there has to be something to compare. I mostly use awk, magic not so often.
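The generalization suggested in the comment above can be sketched by matching columns by *header name* instead of by position: read file2's header once, key every value by date plus column name, then replace any field of file1 whose column name and date both appear in file2. This is a minimal sketch, assuming both files have a header row, Transaction_date is the first column of each, and no field contains an embedded comma (the here-docs just recreate the sample data):

```shell
# Recreate the sample inputs
cat > sample_scv_1.csv <<'EOF'
Transaction_date,Name,Payment_Type,Product
1/2/09 6:17,NA,Mastercard,NA
1/2/09 4:53,NA,Visa,NA
1/2/09 13:08,Nick,Mastercard,NA
1/3/09 14:44,Larry,Visa,Goods
1/4/09 12:56,Tina,Visa,Services
1/4/09 13:19,Harry,Visa,Goods
EOF
cat > sample_scv_2.csv <<'EOF'
Transaction_date,Product,Name
1/2/09 6:17,Goods,Janis
1/2/09 4:53,Services,Nicola
1/2/09 13:08,Materials,Asuman
EOF

awk -F, -v OFS=, '
NR==FNR {                                    # first file: sample_scv_2.csv
    if (FNR==1) { for (i=2; i<=NF; i++) name[i]=$i; next }  # remember its column names
    for (i=2; i<=NF; i++) val[$1,name[i]]=$i # value keyed by (date, column name)
    next
}
FNR==1 { for (i=2; i<=NF; i++) col[i]=$i }   # file1 header: remember column names
{
    for (i=2; i<=NF; i++)                    # replace every field file2 has a value for
        if (($1,col[i]) in val) $i=val[$1,col[i]]
    print
}' sample_scv_2.csv sample_scv_1.csv
```

For the sample data above this reproduces the required output, and because nothing is hard-coded except the key column, it scales to 350 columns unchanged; only sample_scv_2.csv has to fit in memory.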
