the answers to the "related question" you linked (and the one posted in the comments) actually solve your problem;
they just need to be adapted to your specific case.
cat original.list | awk -F" " ' {`/homes/script.py $1`}'
- cat is useless here because awk can open and read the file by itself
- you don't need -F" " because awk already splits fields on whitespace by default
- backticks `` won't run your script: that's a shell feature (and a discouraged one) and it doesn't work in awk
we can use command | getline var to execute a command and store its
(first line of) output in a variable. from man awk:
command | getline var
pipes a record from command into var.
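as a minimal sketch of that construct in isolation (the echo commands are just a stand-in for any command line, and first_line is a name made up for this example), note that only the first line of the command's output ends up in the variable:

```shell
# run a command from inside awk and capture the first line of its output
first_line=$(awk 'BEGIN {
    cmd = "echo one; echo two"   # any shell command line
    cmd | getline var            # var receives only the first record
    close(cmd)                   # release the pipe once we are done with it
    print var
}')
echo "$first_line"
```

to read every line of the output rather than just the first, the usual pattern is a loop such as while ((cmd | getline var) > 0).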
using your example file:
$ cat original
Col1 Col2
d 2
e 4
f 6
$
and a dummy script.py:
$ cat script.py
#!/bin/python
print("output")
$
we can do something like this:
$ awk '
NR == 1 { print $0, "Col3" }
NR > 1 { cmd="./script.py " $1; cmd | getline out; close(cmd); print $0, out }
' original
Col1 Col2 Col3
d 2 output
e 4 output
f 6 output
$
the first action runs only on the first line of input and appends Col3 to the
header; the NR > 1 condition on the second action keeps the header's Col1 from
being passed to the python script.
in the second action, we first build the command by concatenating $1 to the
script's path, then we run it and store its first line of output in the out
variable (I'm assuming your python script prints just one line). close(cmd) is
important because after getline the pipe reading from cmd's output would
otherwise remain open, and doing this for many records could lead to errors
like "too many open files". finally, we print $0 followed by cmd's output.
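two things the action above does not guard against are a value in $1 that contains characters special to the shell, and a run of the script that prints nothing (in which case out would silently keep the previous row's value). a more defensive sketch follows; note that script.sh is only a stand-in for your python script so the example has no dependencies, shquote is a helper name I made up, and "NA" is an arbitrary placeholder:

```shell
# scratch copy of the example input plus a stand-in for the real script
dir=$(mktemp -d)
printf 'Col1 Col2\nd 2\ne 4\nf 6\n' > "$dir/original"
printf '#!/bin/sh\necho "got:$1"\n' > "$dir/script.sh"
chmod +x "$dir/script.sh"

result=$(cd "$dir" && awk '
# wrap a string in single quotes so the shell treats it as one literal word
function shquote(s) {
    gsub(/\047/, "\047\\\047\047", s)
    return "\047" s "\047"
}
NR == 1 { print $0, "Col3"; next }
{
    cmd = "./script.sh " shquote($1)
    out = ""                        # do not carry over the previous row value
    if ((cmd | getline out) <= 0)   # 0 = no output, -1 = error
        out = "NA"
    close(cmd)
    print $0, out
}' original)

echo "$result"
rm -rf "$dir"
```

checking the return value of getline is what makes a silent failure visible; whether a placeholder or aborting is the better reaction depends on your data.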
the third column's formatting looks a bit off; you can improve it either from
awk using printf or with an external program like column, e.g.:
$ awk '
NR == 1 { print $0, "Col3" }
NR > 1 { cmd="./script.py " $1; cmd | getline out; close(cmd); print $0, out }
' original | column -t
Col1  Col2  Col3
d     2     output
e     4     output
f     6     output
$
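for the printf route, a rough sketch could look like this; the field width of 6 is an assumption picked to fit the sample data, and the input is fed through a pipe only to keep the example self-contained:

```shell
# left-align the first two columns in fixed-width fields using awk printf
formatted=$(printf 'Col1 Col2 Col3\nd 2 output\ne 4 output\n' |
    awk '{ printf "%-6s%-6s%s\n", $1, $2, $3 }')
echo "$formatted"
```

unlike column -t, printf does not adapt to the data, so the widths have to be large enough for the longest value in each column.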
lastly, doing all this on a 150k-row file means starting the python script 150k
times, which will probably be slow. I think performance could be
improved by doing everything directly in the python script, as already
suggested in the comments, but whether that is applicable to your specific case goes
beyond the scope of this question/answer.