Assigning system command's output to variable

Question

I want to run a system command from an awk script and store its output in a variable. I've been trying to do this, but the command's output always goes to the shell and I can't capture it. Any ideas on how this can be done?

Example:

$ date | awk --field-separator='!' '{
    $1 = system("strip " $1)   # more processing follows
}'

This should call the strip system command and, instead of sending the output to the shell, assign the output back to $1 for more processing. Right now, it sends the output to the shell and assigns the command's return code to $1.

Nit: The output isn't going to the shell, it's going to the terminal/console. The shell doesn't read any of the output of its children; they just share file descriptors that are associated with the same tty.
– William Pursell, Commented Dec 25, 2009 at 16:54

Skippy le Grand Gourou · Accepted Answer · 2019-05-17 19:37:48Z

80

Note: Coprocesses are specific to GNU awk. A portable alternative is to read the command's output with getline:

cmd = "strip "$1
while ( ( cmd | getline result ) > 0 ) {
  print  result
} 
close(cmd)

Calling close(cmd) prevents awk from throwing this error after a number of calls:

fatal: cannot open pipe `…' (Too many open files)
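As a minimal sketch of the loop above, the following collects every output line of a command into an awk array and then closes the pipe. The command string and the names lines, n and result are made up for illustration.

```shell
result=$(awk 'BEGIN {
    cmd = "echo alpha; echo beta; echo gamma"   # hypothetical command with three output lines
    n = 0
    while ((cmd | getline line) > 0)            # one iteration per line of output
        lines[++n] = line
    close(cmd)                                  # release the pipe before any later reuse
    printf "%d lines, last one: %s\n", n, lines[n]
}')
printf '%s\n' "$result"
```

Storing the lines in an array keeps the whole output available after the loop, rather than only the last line read.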

edited May 17, 2019 at 19:37

Skippy le Grand Gourou

answered Dec 25, 2009 at 10:34

ghostdog74

5 Comments

Sahas Over a year ago

Thanks. This way, I can remove the & from my answer. Looks cooler. But I'm only writing for Linux, so the unavailability of gawk shouldn't be an issue?

ghostdog74 Over a year ago

Yes, it shouldn't be an issue. Still, you should check the documentation to see whether coprocesses are only available in certain versions of gawk. I can't remember off the top of my head.

Sahas Over a year ago

From version 3.1. RedHat has 3.1.5. Anyways I'll use the way you suggested, unless I want to send something to stdin of the command, in which case coprocess is helpful.

Dan Moulding Over a year ago

Awk never ceases to amaze me.

champost Over a year ago

Note that if you run the code above inside a for loop, close(cmd) is necessary. I discovered the hard way that awk breaks out after 1018 iterations (this may depend on your system).

Community · Accepted Answer · 2017-05-23 12:17:58Z

54

To run a system command in awk you can either use system() or cmd | getline.

I prefer cmd | getline because it allows you to catch the value into a variable:

$ awk 'BEGIN {"date" |  getline mydate; close("date"); print "returns", mydate}'
returns Thu Jul 28 10:16:55 CEST 2016

More generally, you can store the command in a variable:

awk 'BEGIN {
       cmd = "date -j -f %s"
       cmd | getline mydate
       close(cmd)
     }'

Note it is important to use close() to prevent getting a "makes too many open files" error if you have multiple results (thanks mateuscb for pointing this out in comments).

Using system(), the command output is printed automatically and the value you can catch is its return code:

$ awk 'BEGIN {d=system("date"); print "returns", d}'
Thu Jul 28 10:16:12 CEST 2016
returns 0
$ awk 'BEGIN {d=system("ls -l asdfasdfasd"); print "returns", d}'
ls: cannot access asdfasdfasd: No such file or directory
returns 2
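When the same pattern is needed for every input record, it can be wrapped in a small function. This is only a sketch: first_line is a hypothetical helper name, and the two input paths are arbitrary examples (basename and dirname do not need them to exist).

```shell
result=$(printf '%s\n' "/usr/bin/awk" "/etc/hosts" | awk '
# Hypothetical helper: run cmd and return the first line of its output.
function first_line(cmd,    line) {
    line = ""
    cmd | getline line
    close(cmd)      # without this, a repeated identical cmd is not re-run
    return line
}
{ print first_line("basename " $0) "\t" first_line("dirname " $0) }
')
printf '%s\n' "$result"
```

Because the helper closes the pipe on every call, it can be called any number of times without running out of file descriptors.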

edited May 23, 2017 at 12:17

CommunityBot

answered Jul 28, 2016 at 8:21

fedorqui

5 Comments

mateuscb Over a year ago

+1 for adding close(), if you don't add it, and have multiple results, you may get "makes too many open files". If you have a longer command, you can do cmd = "date -j -f %s"; cmd | getline mydate; close(cmd)

fedorqui Over a year ago

@mateuscb many thanks for your feedback. I updated the answer to include your useful comments.

csu007 Over a year ago

Thanks for the reminder about close(). It helps a lot. Without close(), I sometimes get a wrong date when there are multiple results. With close(), all my date results are displayed correctly.

one-liner Over a year ago

close(cmd) was crucial for me when doing cmd | getline var in an awk function that was called several times. The second time the function was called and getline ran, var was no longer populated.

Olivier Dulac Over a year ago

close(cmd) helps a lot. First, it frees the file descriptor. Second, it also "flushes" stdout and thus makes the display better. Calling close for each operation does cost a little time, but that cost is worth paying.

Qben · Accepted Answer · 2012-09-18 10:33:49Z

36

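Figured it out.

We use awk's two-way I/O:

{
  cmd = "strip " $1     # build the command in awk; a $1 inside the quotes is not expanded by awk
  cmd |& getline $1     # gawk only: read one line of the command's output into $1
  close(cmd)
}

This passes $1 to strip as an argument, and getline reads the output of strip back into $1.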

edited Sep 18, 2012 at 10:33

Qben

answered Dec 25, 2009 at 10:08

Sahas

3 Comments

mcoolive Over a year ago

If you need to call the same command several times, you have to close the command (staff.science.uu.nl/~oostr102/docs/nawk/nawk_26.html#SEC29).

Olivier Dulac Over a year ago

This is specific to gawk (GNU awk), not awk in general: "with gawk, it is possible to open a two-way pipe to another process".

Pal Over a year ago

close("strip $1" ); afterwards is important for large files (probably small as well)

Ryan Liu · Accepted Answer · 2013-06-07 13:46:57Z

6

gawk '{dt=substr($4,2,11); gsub(/\//," ",dt); "date -d \""dt"\" +%s"|getline ts; print ts}'
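Spelled out step by step, the one-liner appears to take a bracketed timestamp from field 4, turn it into a date string, and ask date for the epoch seconds. The sketch below assumes GNU date (for -d) and an Apache-style log line invented for illustration. It also adds the close() call, and sets TZ=UTC only so that the result does not depend on the local time zone.

```shell
result=$(echo '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326' | TZ=UTC awk '{
    dt = substr($4, 2, 11)            # "[10/Oct/2000:13:55:36" becomes "10/Oct/2000"
    gsub(/\//, " ", dt)               # slashes to spaces: "10 Oct 2000"
    cmd = "date -d \"" dt "\" +%s"    # GNU date: parse the string, print epoch seconds
    ts = ""
    cmd | getline ts                  # capture the single output line
    close(cmd)                        # needed when many records produce commands
    print dt, ts
}')
printf '%s\n' "$result"
```

Keeping the command in cmd makes it possible to pass exactly the same string to close().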

answered Jun 7, 2013 at 13:46

Ryan Liu

2 Comments

t.niese Over a year ago

If you post an answer, you should explain its different parts (what you did and why it works) so that others can learn from it. For some people this line is self-explanatory, but for others it's hard to follow exactly what you did.

Devaroop Over a year ago

CAUTION: You should use close(cmd) along with getline, else the results are wrong if run for bulk data.

Dmitry · Accepted Answer · 2019-06-09 15:00:12Z

5

You can use this when you need to process a grep output:

echo "some/path/exex.c:some text" | awk -F: '{ "basename "$1"" |& getline $1; print $1 " ==> " $2}'

The option -F: tells awk to use : as the field separator.

"basename "$1"" builds the shell command basename with the first field as its argument.

|& getline $1 runs that command and reads its output back into $1 (the |& operator is a gawk extension).

output:
exex.c ==> some text

edited Jun 9, 2019 at 15:00

answered May 1, 2018 at 8:19

Dmitry

Mihir Luthra · Accepted Answer · 2020-03-12 14:04:36Z

3

I am using macOS's awk and I also needed exit status of the command. So I extended @ghostdog74's solution to get the exit status too:

Exit if non-zero exit status:

cmd = "your command goes here"   # placeholder: replace with the shell command to run
cmd = cmd" ; printf \"\n$?\""

last_res = ""
value = ""        

while ( ( cmd | getline res ) > 0 ) {

    if (value == "") {
        value = last_res
    } else {
        value = value"\n"last_res
    }

    last_res = res
}

close(cmd)

# Now `res` has the exit status of the command
# and `value` has the complete output of command

if (res != 0) {
    exit 1
} else {
    print value
}

So basically I just changed cmd to print exit status of the command on a new line. After the execution of the above while loop, res would contain the exit status of the command and value would contain the complete output of the command.

Honestly not a very neat way and I myself would like to know if there is some better way.
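One possible alternative, sketched here under assumptions, is to redirect the command's output to a temporary file so that system() can hand back the exit status directly. The command string is a made-up example, mktemp is assumed to be available, and the temporary path is assumed to contain no spaces.

```shell
result=$(awk 'BEGIN {
    cmd = "echo one; echo two; exit 3"   # hypothetical command: two output lines, exit status 3

    "mktemp" | getline tmpfile           # ask the shell for a temporary file name
    close("mktemp")

    # Run the command in a subshell with stdout sent to the file.
    # system() returns the exit status of that subshell.
    status = system("( " cmd " ) > " tmpfile)

    n = 0
    output = ""
    while ((getline line < tmpfile) > 0) # read the captured output back
        output = (n++ ? output "\n" : "") line
    close(tmpfile)
    system("rm -f " tmpfile)             # clean up the temporary file

    print "status=" status
    print output
}')
printf '%s\n' "$result"
```

This keeps the status and the output separate, at the price of one extra file and two extra processes per command.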

edited Mar 12, 2020 at 14:04

answered Feb 23, 2020 at 15:53

Mihir Luthra

3 Comments

Olivier Dulac Over a year ago

Nice trick to add the return value as the last line. But maybe simpler: tmpfile="somename"; cmd="thingyouwant >" tmpfile; res=system(cmd); close(cmd), and then use a simple getline to read tmpfile and get the output of thingyouwant? (And delete the file afterwards with another cmd="rm " tmpfile that you system(cmd) and close(cmd) as well.)

Mihir Luthra Over a year ago

Yes, that's much cleaner. I would suggest you add a new answer for that as well. I can't test it right now for speed and correctness, but I will try it that way if it suits my code whenever I get back to it.

Andrew Kay Over a year ago

I believe the exit status is returned by the "close(cmd)"

jeremysprofile · Accepted Answer · 2024-12-11 19:46:27Z

Using GNU awk, I wanted to grab the output of the function call and store it so I could format everything with printf. You can't do that with system(), but you can with myCmd | getline myVar:

#!/usr/bin/env bash

hrbytes() {  # human readable bytes. numfmt is cool.
  local num;
  if [[ $# -lt 1 ]]; then
    read num;
  else
    num="$1"
  fi
  local from
  if [[ "$num" =~ [KMGTPEZY]i$ ]]; then
    from="--from=iec-i"
  elif [[ "$num" =~ [KMGTPEZY]$ ]]; then
    from="--from=si"
  fi
  # purposefully not quoting from to avoid empty string issues
  numfmt --to=iec-i --suffix=B --format="%.1f" $from "${num//,}"
}
export -f hrbytes

command time -l helm ls 2>&1 | 
  awk '/peak memory/ {"hrbytes " $1 | getline mem}; /[0-9.] real / {time=$1} END {printf "%ss; %s\n", time, mem}'

This calls hrbytes with the first field of each line matching that regex as its argument and stores the output in mem, which I can then reference with my printf command at the END of reading the input.

This printed 1.49s; 152.1MiB, which was what I wanted to see.
