
I wonder if someone can help explain what is happening?

I run 2 subprocesses, 1 for ffprobe and 1 for ffmpeg.

popen = subprocess.Popen(ffprobecmd, stderr=subprocess.PIPE, shell=True)

And

popen = subprocess.Popen(ffmpegcmd, shell=True, stdout=subprocess.PIPE)

On both Windows and Linux the ffprobe command starts, finishes, and disappears from Task Manager/htop. Only on Windows does the same happen to ffmpeg; on Linux the command remains in htop...

[Screenshot: htop with the ffmpeg process still listed]

Can anyone explain what is going on, if it matters and how I can stop it from happening please?

EDIT: Here are the commands...

ffprobecmd = 'ffprobe' + \
' -user_agent "' + request.headers['User-Agent'] + '"' + \
' -headers "Referer: ' + request.headers['Referer'] + '"' + \
' -timeout "5000000"' + \
' -v error -select_streams v -show_entries stream=height -of default=nw=1:nk=1' + \
' -i "' + request.url + '"'

and

ffmpegcmd = 'ffmpeg' + \
' -re' + \
' -user_agent "' + r.headers['User-Agent'] + '"' + \
' -headers "Referer: ' + r.headers['Referer'] + '"' + \
' -timeout "10"' + \
' -i "' + r.url + '"' + \
' -c copy' + \
' -f mpegts' + \
' pipe:'

EDIT: Here is an example that behaves as described...

import flask
from flask import Response
import subprocess

app = flask.Flask(__name__)

@app.route('/', methods=['GET'])
def go():
    def stream(ffmpegcmd):
        popen = subprocess.Popen(ffmpegcmd, stdout=subprocess.PIPE, shell=True)
        try:
            for stdout_line in iter(popen.stdout.readline, ""):
                yield stdout_line
        except GeneratorExit:
            raise

    url = "https://bitdash-a.akamaihd.net/content/MI201109210084_1/m3u8s/f08e80da-bf1d-4e3d-8899-f0f6155f6efa.m3u8"

    ffmpegcmd = 'ffmpeg' + \
                ' -re' + \
                ' -timeout "10"' + \
                ' -i "' + url + '"' + \
                ' -c copy' + \
                ' -f mpegts' + \
                ' pipe:'
    return Response(stream(ffmpegcmd))

if __name__ == '__main__':
    app.run(host= '0.0.0.0', port=5000)
  • Note that this code has some significant security problems. What do you expect to happen if a Referer header contains $(rm -rf ~)'$(rm -rf ~)', for example? (The possibility of single quotes being part of the data itself means you can't just add literal single quotes during the concatenation process to rule out issues.) Commented May 23, 2021 at 15:54
  • ...particularly for UNIX-y systems, you'd be a lot better off forming a list and using shell=False instead of going the string route at all. Commented May 23, 2021 at 15:56
  • As for your ffmpeg command not exiting... are you actually reading to the end of the FIFO? I'd also suggest stdin=subprocess.DEVNULL so it can't be blocking on stdin. However, this can't really be given a canonical answer without a minimal reproducible example -- code that can be run without any changes whatsoever to let someone else see the problem themselves and test whether the issue is resolved. Commented May 23, 2021 at 15:57
  • So, when I run that minimal reproducible example, what I see is that the copy of ffmpeg is actually exiting, but Python isn't reaping its PID from the process table, so it's stuck there as a zombie. Commented May 23, 2021 at 16:35
  • But that's different from what your screenshot shows, because your screenshot has S state instead of Z state. Commented May 23, 2021 at 16:36

2 Answers


You have the extra sh process because of shell=True, and your copies of ffmpeg are allowed to try to attach to the original terminal's stdin because you aren't overriding that file handle. To fix both issues, and also some security bugs, switch to shell=False and set stdin=subprocess.DEVNULL. To stop zombies from potentially being left behind, note the finally: block below, which calls popen.poll() to see whether the child has exited and popen.terminate() to tell it to exit if it hasn't:

#!/usr/bin/env python

import flask
from flask import Response
import subprocess

app = flask.Flask(__name__)

@app.route('/', methods=['GET'])
def go():
    def stream(ffmpegcmd):
        popen = subprocess.Popen(ffmpegcmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE)
        try:
            # NOTE: consider reading fixed-sized blocks (4kb at least) at a time
            # instead of parsing binary streams into "lines".
            # The pipe is binary, so the end-of-stream sentinel must be b"", not "".
            for stdout_line in iter(popen.stdout.readline, b""):
                yield stdout_line
        finally:
            if popen.poll() is None:
                popen.terminate()
                popen.wait() # yes, this can cause things to actually block

    url = "https://bitdash-a.akamaihd.net/content/MI201109210084_1/m3u8s/f08e80da-bf1d-4e3d-8899-f0f6155f6efa.m3u8"

    ffmpegcmd = [
        'ffmpeg',
        '-re',
        '-timeout', '10',
        '-i', url,
        '-c', 'copy',
        '-f', 'mpegts',
        'pipe:'
    ]
    return Response(stream(ffmpegcmd))

if __name__ == '__main__':
    app.run(host= '127.0.0.1', port=5000)

Mind, it's not appropriate to be parsing a binary stream as a series of lines at all. It would be much more appropriate to use blocks (and to change your response headers so the browser knows to parse the content as a video).
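As a rough illustration of the block-based approach, here is a minimal sketch of a generator that reads fixed-size binary blocks and reaps the child when the stream ends or the consumer stops early. The name stream_blocks and the 4096-byte default are assumptions made for this example, and the demo uses a stand-in Python child rather than ffmpeg so that it runs anywhere.

```python
import subprocess
import sys


def stream_blocks(cmd, block_size=4096):
    """Yield the child's stdout in fixed-size binary blocks, then reap the child.

    Hypothetical helper for illustration; not part of Flask or subprocess.
    """
    popen = subprocess.Popen(cmd, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE)
    try:
        # read() returns b"" at end of file, which is the iter() sentinel.
        for block in iter(lambda: popen.stdout.read(block_size), b""):
            yield block
    finally:
        # Runs when the stream ends or when the generator is closed early
        # (for example, because the client disconnected).
        if popen.poll() is None:
            popen.terminate()
        popen.stdout.close()
        popen.wait()  # collect the exit status so no zombie is left behind


if __name__ == "__main__":
    # Stand-in child process (assumption: used instead of ffmpeg for the demo).
    demo_cmd = [sys.executable, "-c",
                "import sys; sys.stdout.buffer.write(b'x' * 10000)"]
    print(sum(len(block) for block in stream_blocks(demo_cmd)))
```

In the Flask route this would presumably be used as Response(stream_blocks(ffmpegcmd), mimetype="video/mp2t"), video/mp2t being the MIME type registered for MPEG transport streams; whether a given browser plays that directly is a separate question.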


10 Comments

The HTTP header that gets the mime type is called Content-Type, so yes.
If you want to be extra sure, you might keep a list of subprocess objects to occasionally poll() until they finish dying (if you end up in the popen.terminate() case, you can still get a zombie, since the terminate() takes some time to take effect, and the poll() happened before it).
BTW, the usual way to make sure you don't have zombies is to have a SIGCHLD handler that calls wait() to reap them; but doing that can break other things if you have code that assumes it'll be able to successfully wait() for a dead child (outside that signal handler) and find out how it died. Of course, if you have the signal handler update an internal table of jobs with how they died... then you just reinvented half of bash's job control.
I ended up just taking out the except GeneratorExit: raise; I'm unconvinced that it ever does anything useful.
@Chris, [..., '-headers', f"Referer: {request.headers['Referer']}", ...] is one of the many possible ways to do it.
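The suggestion in the comments above to keep a list of subprocess objects and poll() them occasionally could be sketched roughly as follows. The names _children, track and reap_finished are hypothetical, and how often to call reap_finished (per request, on a timer, and so on) is left open.

```python
import subprocess
import sys
import time

# Hypothetical module-level registry of children that may still be running.
_children = []


def track(popen):
    """Remember a Popen object so it can be polled later."""
    _children.append(popen)
    return popen


def reap_finished():
    """Poll every tracked child and forget those that have exited.

    poll() collects the exit status of a finished child, which removes its
    zombie entry from the process table. Returns the number still running.
    """
    _children[:] = [p for p in _children if p.poll() is None]
    return len(_children)


if __name__ == "__main__":
    # Stand-in child (assumption: a short-lived Python process, not ffmpeg).
    child = track(subprocess.Popen([sys.executable, "-c", "pass"]))
    while reap_finished():
        time.sleep(0.05)
    print(child.returncode)
```

This avoids a signal handler altogether, at the cost that a finished child stays a zombie until the next call to reap_finished.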

What type is the ffmpegcmd variable? Is it a string or a list/sequence?

Note that Windows and Linux/POSIX behave differently with the shell=True parameter enabled or disabled. It matters whether ffmpegcmd is a string or a list.

Direct excerpt from the documentation:

On POSIX with shell=True, the shell defaults to /bin/sh. If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself. That is to say, Popen does the equivalent of:

Popen(['/bin/sh', '-c', args[0], args[1], ...])

On Windows with shell=True, the COMSPEC environment variable specifies the default shell. The only time you need to specify shell=True on Windows is when the command you wish to execute is built into the shell (e.g. dir or copy). You do not need shell=True to run a batch file or console-based executable.
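To illustrate the string-versus-list distinction, here is a minimal sketch that builds the question's ffmpeg command as a list for use with the default shell=False. The helper names build_ffmpeg_args and echo_argument are made up for this example; the ffmpeg flags are the ones from the question. Because no shell parses the arguments, quotes and $(...) in a header value reach the child as literal text.

```python
import subprocess
import sys


def build_ffmpeg_args(url, user_agent, referer):
    """Return the question's ffmpeg command as an argument list (no shell quoting)."""
    return [
        "ffmpeg",
        "-re",
        "-user_agent", user_agent,
        "-headers", f"Referer: {referer}",
        "-timeout", "10",
        "-i", url,
        "-c", "copy",
        "-f", "mpegts",
        "pipe:",
    ]


def echo_argument(value):
    """Pass value to a child without a shell and return what the child received."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", value],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        check=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The shell metacharacters are not interpreted; they arrive unchanged.
    print(echo_argument("$(echo pwned)"))
```

The resulting list would then be passed as subprocess.Popen(build_ffmpeg_args(...), stdin=subprocess.DEVNULL, stdout=subprocess.PIPE), with no shell=True.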

1 Comment

I have edited my post to include the commands. I have been creating and testing my python code on Windows, but I deploy it on my Linux server inside a Docker container.
