I have a large file (~10 GB) that I want to duplicate 10 times, each time adding a different offset to the first column:

for i in range(1, 11):
    var = (i-1) * 1000
    # add var to the first column of the file and save the file as file(i).csv

So far I have tried:

#!/bin/bash
for i in {1..10}
do
   t=1
   j=$(( $i - t ))
   s=1000
   person_id=$(( j * add ))
   awk -F"," 'BEGIN{OFS=","} NR>1{$1=$1+$person_id} {print $0}' file.csv > file$i.csv
done

but the values in the first column do not change.

1 Answer

Awk variables are separate from shell variables, and the shell does not expand $person_id inside a single-quoted awk program.

Replace:

awk -F"," 'BEGIN{OFS=","} NR>1{$1=$1+$person_id} {print $0}' file.csv > file$i.csv

With:

awk -F"," -v id="$person_id" 'BEGIN{OFS=","} NR>1{$1=$1+id} {print $0}' file.csv > "file$i.csv"

This uses the -v option to define an awk variable id whose value is the value of the shell variable person_id.
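
As a quick check of the -v approach, here is a minimal sketch run against a tiny stand-in file. The file name sample.csv, its contents, and the fixed offset of 1000 are assumptions made for illustration; they are not from the question.

```shell
#!/bin/bash
# Work in a scratch directory so no existing file is touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: a header row followed by two records.
printf 'person_id,name\n1,alice\n2,bob\n' > sample.csv

# Hypothetical offset; in the loop this would be computed per iteration.
person_id=1000

# -v copies the shell value into the awk variable id before the program runs.
out=$(awk -F"," -v id="$person_id" 'BEGIN{OFS=","} NR>1{$1=$1+id} {print $0}' sample.csv)
printf '%s\n' "$out"
# Prints:
# person_id,name
# 1001,alice
# 1002,bob
```

The header row is left alone because of the NR>1 condition; only the data rows get the offset.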

Because , has no special meaning to the shell, the quotes around it can be dropped. Moving the OFS assignment to the command line, after the program, shortens the code further:

awk -F, -v id="$person_id" 'NR>1{$1+=id} 1' OFS=, file.csv > "file$i.csv"

Lastly, we replaced {print $0} with the cryptic shorthand 1. (This works because awk interprets 1 as a logical condition which it evaluates to true and, since no action was supplied, awk will perform the default action which is to print the line.)
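
Putting the pieces together, the whole loop might look like the sketch below. One further thing to watch in the script from the question: it multiplies by add, which is never set (the value 1000 is stored in s), so the offset would evaluate to 0 even with the awk fix. The variable name step, the scratch directory, and the three-line file.csv are assumptions for illustration only.

```shell
#!/bin/bash
# Work in a scratch directory with a tiny stand-in for the real 10 GB file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'person_id,name\n1,alice\n2,bob\n' > file.csv

step=1000   # hypothetical name; the question stores this value in s

for i in {1..10}
do
   # Offset for copy i: 0, 1000, 2000, ..., 9000.
   person_id=$(( (i - 1) * step ))

   # NR>1 skips the header; the trailing 1 prints every line.
   awk -F, -v id="$person_id" 'NR>1{$1+=id} 1' OFS=, file.csv > "file$i.csv"
done
```

With this sample, file1.csv is identical to file.csv (offset 0) and file10.csv holds 9001 and 9002 in the first column. For the real data, drop the scratch-directory lines and run the loop next to the actual file.csv.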

1 Comment

Thank you so much!
