Here is a minimal test case that fails:

#!/bin/tcsh

#here is some code in tcsh I did not write which spawns many processes.
#let us pretend that it spawns 100 instances of stupid_test which the user kills
#manually after an indeterminate period

/bin/bash <<EOF
#!/bin/bash
while true
do
if [[ `ps -e | grep stupid_test | wc -l` -gt 0 ]]
then
  echo 'test program is still running'
  echo `ps -e | grep stupid_test | wc -l`
  sleep 10
else
  break
fi
done
EOF

echo 'test program finished'

The stupid_test program consists of:

#!/bin/bash
while true; do sleep 10; done

The intended behavior is to run until stupid_test is killed (in this case manually by the user), and then terminate within the next ten seconds. The observed behavior is that the script never terminates: ps -e | grep stupid_test | wc -l still evaluates to 1 even after the program has been killed (and it no longer shows up under ps).

If the bash script is run directly, rather than in a here document, the intended behavior is recovered.

I feel like I am doing something very stupid; I am not the most experienced shell hacker. Why is it doing this?

  • If you want to call bash, then you need /bin/bash << \EOF; the #!/bin/bash line inside the here document is not a shebang then. The \ is needed because you don't want the command substitutions to be executed by the tcsh script. Plus, the command substitution after the echo is useless. Commented Dec 13, 2013 at 20:45
  • Also, you have some wild useless backticks there. Commented Dec 13, 2013 at 20:58
  • Use <<'EOF' to prevent expansions from happening in the parent shell. Commented Dec 13, 2013 at 21:13
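
The quoting difference these comments point at can be seen in isolation. The sketch below uses bash as the outer shell purely for illustration (in the question the outer shell is tcsh, which also skips substitution when the delimiter is quoted); the variable name `origin` is made up for the demo.

```shell
#!/bin/bash
# Illustration only: "origin" is a hypothetical variable for this demo.
origin=parent

# Unquoted delimiter: the outer shell expands $origin before the inner bash runs.
unquoted=$(bash <<EOF
origin=child
echo "$origin"
EOF
)

# Quoted delimiter: the text reaches the inner bash untouched.
quoted=$(bash <<'EOF'
origin=child
echo "$origin"
EOF
)

echo "unquoted: $unquoted"   # prints "unquoted: parent"
echo "quoted: $quoted"       # prints "quoted: child"
```

In the question's script the same thing happens to the backticks: with an unquoted delimiter they are evaluated once by the outer shell, so the inner loop keeps comparing a fixed value.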

1 Answer

Usually when you try to grep the name of a process, you get an extra matching line for grep itself, for example:

$ ps xa | grep something
57386 s002  S+     0:00.01 grep something

So even when there is no matching process, you will get one matching line. You can fix that by adding a grep -v grep in the pipeline:

ps -e | grep stupid_test | grep -v grep | wc -l

As tripleee suggested, an even better fix is writing the grep like this:

ps -e | grep '[s]tupid_test'

The meaning of the pattern is exactly the same, but this way it won't match grep itself anymore, because the string "grep [s]tupid_test" doesn't match the regular expression /[s]tupid_test/.
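
To see why the bracket trick works, the check can be reproduced without ps at all. The sketch below simply feeds grep the relevant command lines as plain text; it assumes nothing beyond a standard grep, and the variable names are illustrative.

```shell
#!/bin/bash
# The command line a plain grep would show in a full process listing.
plain_hits=$(echo 'grep stupid_test' | grep -c 'stupid_test')

# The command line of the bracketed grep contains "[s]tupid_test",
# which the regex [s]tupid_test (equivalent to "stupid_test") does not match.
bracket_hits=$(echo 'grep [s]tupid_test' | grep -c '[s]tupid_test')

# A real process name is still matched by the bracketed pattern.
real_hits=$(echo 'stupid_test' | grep -c '[s]tupid_test')

echo "plain: $plain_hits, bracketed: $bracket_hits, real: $real_hits"
# prints "plain: 1, bracketed: 0, real: 1"
```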

By the way, I would rewrite your script like this, which is cleaner:

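/bin/bash <<'EOF'
# The quoted delimiter keeps tcsh from expanding $(...) and $s itself.
while :; do
  # [s] stops grep from matching its own command line.
  s=$(ps -e | grep '[s]tupid_test')
  test "$s" || break
  echo 'test program is still running'
  echo "$s"
  sleep 10
done
EOF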

Or a lazier but perhaps sufficient variant (hinted at by bryn):

/bin/bash <<'EOF'
# grep's exit status drives the loop; matching lines are printed as a side effect.
while ps -e | grep '[s]tupid_test'
do
  echo 'test program is still running'
  sleep 10
done
EOF
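
As one of the comments on this answer notes, pgrep is handy for this sort of polling because it never reports itself. A minimal sketch, assuming a procps-style pgrep is installed and that the processes really are named stupid_test; the function name `wait_until_gone` and its optional interval argument are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: block until no process with exactly this name remains.
wait_until_gone() {
  local name=$1 interval=${2:-10}
  # pgrep -x matches the process name exactly and exits non-zero when nothing matches.
  while pgrep -x "$name" >/dev/null; do
    echo "$name is still running"
    sleep "$interval"
  done
}
```

Calling `wait_until_gone stupid_test` followed by `echo 'test program finished'` would reproduce the intended behavior; when the call is embedded in a tcsh script, it would still need to sit inside a here document with a quoted delimiter.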
8 Comments

pgrep is a pretty useful tool for this sort of situation.
I observe that behavior of ps if I use ps -x, but not with ps -e.
@aestrivex which OS are you on?
OK, on Linux ps -e works as you say for me too. Meanwhile, if you print the matching processes with ps -e | grep stupid_test instead of the line count, you should see the lines for the processes that keep the loop from exiting.
As a further optimization, grep '[s]tupid_test' avoids matching itself in the first place.