How to use linux command in python with python functions (sys.argv)

Question

In my Python script, I need to use awk, but I want to pass the file name via sys.argv. My current code looks like this:

import sys
import os

cmd="awk '/regex/ {print}' sys.argv[1] | sed 's/old/new/g'"
x=os.popen(cmd).read()

The problem is that sys.argv is a Python object, but cmd is a string containing a Linux command, so the text sys.argv[1] is passed to awk literally. Is there any way to include the value of sys.argv[1] in my Linux command?

If you are already programming in Python, are you sure you need to call awk? Python probably can do everything you need from awk just fine.
– ParthS007, Commented Jun 24, 2019 at 18:24
Just wrap it in quotes: python somefile.py 'some commands here'
– C.Nivs, Commented Jun 24, 2019 at 18:25
Simply use string formatting: cmd=f"awk '/regex/ {{print}}' %s | sed 's/old/new/g'" % (sys.argv[1])
– Tomerikoo, Commented Jun 24, 2019 at 18:26
cmd="awk '/regex/ {print}' %s | sed 's/old/new/g'" % sys.argv[1]
– Jay M, Commented Jun 24, 2019 at 18:35
@JasonMorgan, from a security perspective that's a horrid idea. What if sys.argv[1] contains $(rm -rf ~)? Or even $(rm -rf ~)'$(rm -rf ~)', so you can't escape it with literal single quotes?
– Charles Duffy, Commented Jun 24, 2019 at 19:20

tripleee · Accepted Answer · 2019-06-25 08:10:00Z

2

You really don't need Awk or sed for this. Python can do these things natively, elegantly, flexibly, robustly, and naturally.

import sys
import re

r = re.compile(r'regex')
s = re.compile(r'old')

with open(sys.argv[1]) as lines:
    for line in lines:
        if r.search(line):
            # line already ends with a newline, so don't let print() add another
            print(s.sub('new', line), end='')

If you really genuinely want to use subprocesses for something, simply use Python's general string interpolation functions where you need to insert the value of a Python variable into a string.

import subprocess
import sys
import shlex

result = subprocess.run(
    # literal braces must be doubled inside a str.format() template
    """awk '/regex/ {{print}}' {} |
    sed 's/old/new/g'""".format(shlex.quote(sys.argv[1])),
    stdout=subprocess.PIPE, universal_newlines=True,
    shell=True, check=True)
print(result.stdout, end='')

But really, don't do this. If you really can't avoid a subprocess, keep it as simple as possible (avoid shell=True and peel off all the parts which can be done in Python).
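As a rough sketch of that last piece of advice, assuming awk is available on the PATH: only the awk filter runs as a subprocess (given as an argument list, so no shell ever parses the file name), and the sed substitution is done in Python with re.sub. The function name awk_filter_then_replace is made up for illustration and is not part of the original answer.

```python
import re
import subprocess
import sys


def awk_filter_then_replace(path):
    """Hypothetical helper: awk does the filtering, Python does the rest."""
    # An argument list with the default shell=False: the file name reaches
    # awk as a single argument and is never interpreted by a shell.
    result = subprocess.run(
        ['awk', '/regex/ {print}', path],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout to str
        check=True,
    )
    # Python equivalent of: sed 's/old/new/g'
    return re.sub(r'old', 'new', result.stdout)


if __name__ == '__main__' and len(sys.argv) > 1:
    print(awk_filter_then_replace(sys.argv[1]), end='')
```

One caveat: awk itself treats an operand of the form name=value as a variable assignment, so a file name containing = is still better passed on stdin.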

edited Jun 25, 2019 at 8:10

answered Jun 24, 2019 at 19:04


3 Comments

Charles Duffy Over a year ago

Your second approach, using format(), has security vulnerabilities as currently formulated. Please either use shlex.quote()/pipes.quote(), or (better) pass a list to subprocess.run() so the value travels out-of-band from the code. In that usage mode with shell=True, only the first list element is parsed as code; the second becomes $0, the third $1, and so on.

tripleee Over a year ago

Thanks, adding shlex.quote() as an easy fix. Rearranging the pipeline so as to avoid shell=True is cumbersome, though there is some limited support via the Python pipes module.

tripleee Over a year ago

Another easy workaround would be to pass an open file handle from Python: with open(sys.argv[1]) as input: subprocess.run(..., stdin=input)

Jvol Jvolizka · Accepted Answer · 2019-06-24 18:36:03Z

0

Try it like this:

cmd="awk '/regex/ {print}' " + str(sys.argv[1]) + " | sed 's/old/new/g'"
x=os.popen(cmd).read()

answered Jun 24, 2019 at 18:36


2 Comments

cdarke Over a year ago

sys.argv[1] is already a string, you don't need to use str().

Charles Duffy Over a year ago

From a security perspective this is horrible. Think about what happens if ./yourpythonprog '$(rm -rf ~)' is invoked.
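To make that concern concrete without running anything dangerous, here is a small sketch that only builds and prints the command strings; nothing is executed. The variable hostile is a stand-in for sys.argv[1], and the names unsafe_cmd and safe_cmd are made up for this illustration.

```python
import shlex

# Stand-in for sys.argv[1]: the hostile value from the comment above.
hostile = "$(rm -rf ~)"

# Plain concatenation: a shell would see a command substitution and run it.
unsafe_cmd = "awk '/regex/ {print}' " + hostile + " | sed 's/old/new/g'"

# shlex.quote() wraps the value in single quotes, so a POSIX shell treats it
# as one literal word (a file name) rather than as code.
safe_cmd = "awk '/regex/ {print}' " + shlex.quote(hostile) + " | sed 's/old/new/g'"

# Only printed for inspection; neither string is passed to a shell here.
print(unsafe_cmd)
print(safe_cmd)
```

Quoting keeps the shell from executing the value, but avoiding the shell altogether removes the need to quote at all.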

Charles Duffy · Accepted Answer · 2019-06-25 16:18:12Z

Your best choice is to implement your logic as pure Python logic, as described in the first part of the answer by @tripleee. Your second best choice is to keep the external tools, but eliminate the need for a shell in invoking them and connecting them together.

See the Python documentation section Replacing Shell Pipelines.

import sys
from subprocess import Popen, PIPE

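with open(sys.argv[1]) as infile:
    p1 = Popen(['awk', '/regex/ {print}'], stdin=infile, stdout=PIPE)
    p2 = Popen(['sed', 's/old/new/g'], stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # let awk receive SIGPIPE if sed exits first
    x = p2.communicate()[0]  # bytes; decode if you need str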

Your third best choice is to keep the shell, but pass the data out-of-band from the code:

import subprocess

p = subprocess.run([
  """awk '/regex/ {print}' <"$1" | sed 's/old/new/g'""",  # code to run
  '_',                                                     # $0 in context of that code
  sys.argv[1]                                              # $1 in context of that code
], shell=True, check=True, stdout=subprocess.PIPE)
print(p.stdout)
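A minimal sketch of why the out-of-band form is safe, assuming a POSIX system where shell=True runs /bin/sh -c: the shell code only refers to "$1", and the value substituted for it is not parsed as code again. The hostile value here is made up and deliberately harmless.

```python
import subprocess

# Made-up hostile argument; if it were interpolated into the command string,
# the shell would run `echo INJECTED` and substitute its output.
hostile = "$(echo INJECTED)"

p = subprocess.run(
    [
        r'printf "%s\n" "$1"',  # code to run: print $1 verbatim
        '_',                    # becomes $0
        hostile,                # becomes $1
    ],
    shell=True, check=True,
    stdout=subprocess.PIPE, universal_newlines=True,
)
print(p.stdout, end='')
```

The output is the argument exactly as given, which shows that the command substitution inside it was never evaluated.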
