I have a comma-separated file with 5 columns and a varying number of lines. I want to append three columns populated from variables. The variable values are the same for every line.
At the moment I am doing it in the following way:
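#!/bin/bash
newvar1="abcd6"
newvar2="abcd7"
newvar3="abcd8"
# Remove temporary files left over from a previous run
rm -f -- *.txtyy
# Count the input lines so each temporary file gets one value per line
number_of_lines=$(wc -l < smallsample.txt)
for i in $(seq "$number_of_lines"); do
    echo "$newvar1" >> paste1.txtyy
    echo "$newvar2" >> paste2.txtyy
    echo "$newvar3" >> paste3.txtyy
done
# Join the original file and the three generated columns with commas
paste -d "," smallsample.txt paste1.txtyy paste2.txtyy paste3.txtyy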
Script output is:
# bash paste.sh
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
Execution time on 1,000,000 lines on my machine is:
time bash paste.sh
real 0m24.257s
user 0m14.668s
sys 0m9.380s
Input:
abcd1,abcd2,abcd3,abcd4,abcd5
abcd1,abcd2,abcd3,abcd4,abcd5
abcd1,abcd2,abcd3,abcd4,abcd5
abcd1,abcd2,abcd3,abcd4,abcd5
...
abcd1,abcd2,abcd3,abcd4,abcd5
Required output:
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
...
abcd1,abcd2,abcd3,abcd4,abcd5,abcd6,abcd7,abcd8
I believe that what I am doing here is overkill and wastes resources. Can I do this better and faster on Debian 9.4 using the tools available in that distribution?
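For illustration, one direction that might avoid both the loop and the temporary files is a single awk pass over the input. This is only a sketch under stated assumptions: `append_columns` is a hypothetical helper name, the input is plain comma-separated text without quoted fields, and the variable values contain no backslashes (awk processes escape sequences in `-v` assignments). It has not been timed against the script above.

```shell
#!/bin/bash
# Hypothetical helper: print every line of a file with three constant
# values appended as extra comma-separated columns.
append_columns() {
    local file=$1 v1=$2 v2=$3 v3=$4
    # -v copies the shell values into awk variables; OFS="," makes print
    # join its arguments with commas. $0 is the unmodified input line.
    awk -v a="$v1" -v b="$v2" -v c="$v3" \
        'BEGIN { OFS = "," } { print $0, a, b, c }' "$file"
}

newvar1="abcd6"
newvar2="abcd7"
newvar3="abcd8"

# Run against the sample file only if it is present in the current directory
if [ -f smallsample.txt ]; then
    append_columns smallsample.txt "$newvar1" "$newvar2" "$newvar3"
fi
```

The file is read once and nothing is written to disk, so the per-line `echo` redirections and the final `paste` are no longer needed.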