Here is a minimal test case which fails
#!/bin/tcsh
#here is some code in tcsh I did not write which spawns many processes.
#let us pretend that it spawns 100 instances of stupid_test which the user kills
#manually after an indeterminate period
/bin/bash <<EOF
#!/bin/bash
while true
do
if [[ `ps -e | grep stupid_test | wc -l` -gt 0 ]]
then
echo 'test program is still running'
echo `ps -e | grep stupid_test | wc -l`
sleep 10
else
break
fi
done
EOF
echo 'test program finished'
The stupid_test program is consists of
#!/bin/bash
while true; do sleep 10; done
The intended behavior is to run until stupid_test is killed (in this case manually by the user), and then terminate within the next ten seconds. The observed behavior is that the script does not terminate, and evaluates ps -e | grep stupid_test | wc -l == 1 even after the program has been killed (and it no longer shows up under ps)
If the bash script is run directly, rather than in a here document, the intended behavior is recovered.
I feel like I am doing something very stupidly wrong, I am not the most experienced shell hacker at all. Why is it doing this?
bash, then you need/bin/bash << \EOF, it is not a shebang then. The` because you don't want the subshells to be executed in thetcsh` script. Plus the subshell after theechois useless<<'EOF'to prevent expansions from happening in the parent shell.