
In my Python script, I need to call awk, and I want to pass it the file name from sys.argv. My current code looks like this:

import sys
import os

cmd="awk '/regex/ {print}' sys.argv[1] | sed 's/old/new/g'"
x=os.popen(cmd).read()

The problem is that sys.argv is a Python object, but the cmd variable holds a Linux command string, so the text sys.argv[1] is passed to the shell literally. Is there any way to include the value of sys.argv[1] in my Linux command?

  • If you are already programming in Python, are you sure you need to call awk? Python probably can do everything you need from awk just fine. Commented Jun 24, 2019 at 18:24
  • Just wrap it in quotes: python somefile.py 'some commands here' Commented Jun 24, 2019 at 18:25
  • Simply use string formatting: cmd=f"awk '/regex/ {{print}}' %s | sed 's/old/new/g'" % (sys.argv[1]) Commented Jun 24, 2019 at 18:26
  • cmd="awk '/regex/ {print}' %s | sed 's/old/new/g'" % sys.argv[1] Commented Jun 24, 2019 at 18:35
  • @JasonMorgan, from a security perspective that's a horrid idea. What if sys.argv[1] contains $(rm -rf ~)? Or even $(rm -rf ~)'$(rm -rf ~)', so you can't escape it with literal single quotes? Commented Jun 24, 2019 at 19:20

3 Answers


You really don't need Awk or sed for this. Python can do these things natively, elegantly, flexibly, robustly, and naturally.

import sys
import re

r = re.compile(r'regex')
s = re.compile(r'old')

with open(sys.argv[1]) as infile:
    for line in infile:
        if r.search(line):
            # line still ends with its own newline, so suppress print's
            print(s.sub('new', line), end='')

If you genuinely want to use a subprocess for something, use Python's ordinary string formatting to insert the value of a Python variable into the command string, and quote that value with shlex.quote() so the shell treats it as a single literal argument.

import subprocess
import sys
import shlex

result = subprocess.run(
    # {{print}} is doubled so that format() leaves awk's braces alone
    """awk '/regex/ {{print}}' {} |
    sed 's/old/new/g'""".format(shlex.quote(sys.argv[1])),
    stdout=subprocess.PIPE,
    shell=True, check=True)
print(result.stdout)

But really, don't do this. If you really can't avoid a subprocess, keep it as simple as possible (avoid shell=True and peel off all the parts which can be done in Python).
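As a rough sketch of what "peeling off" could look like here: only the awk step stays external, it runs without a shell, and the sed substitution moves into Python. The function name filter_and_replace is made up for illustration, and text=True assumes Python 3.7 or later.

```python
import re
import subprocess
import sys


def filter_and_replace(path):
    """Return the lines of `path` matching /regex/, with 'old' replaced by 'new'."""
    # No shell is involved: awk receives the file name as a plain argument,
    # so shell metacharacters in it are never interpreted.
    result = subprocess.run(
        ["awk", "/regex/ {print}", path],
        stdout=subprocess.PIPE, text=True, check=True)
    # The sed step is done in Python instead of in a second process.
    return re.sub(r"old", "new", result.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(filter_and_replace(sys.argv[1]), end="")
```

Because the command is a list and shell=True is absent, no quoting of the file name is needed at all.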


3 Comments

Your second approach, using format(), has security vulnerabilities as currently formulated. Please either use shlex.quote()/pipes.quote(), or (better) pass a list to subprocess.run() and pass the value out-of-band from the code. In that usage mode with shell=True, only the first list element is parsed as code; the second becomes $0, the third $1, and so on.
Thanks, adding shlex.quote() as an easy fix. Rearranging the pipeline so as to avoid shell=True is cumbersome, though there is some limited support via the Python pipes module.
Another easy workaround would be to pass an open file handle from Python: with open(sys.argv[1]) as input: subprocess.run(..., stdin=input)
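The file-handle workaround from the last comment might look roughly like this. It is only a sketch: the function name run_pipeline is hypothetical, and text=True assumes Python 3.7 or later. The point is that the command string is a constant, and the file reaches awk through stdin rather than through the command line.

```python
import subprocess
import sys


def run_pipeline(path):
    """Feed `path` to a fixed awk | sed pipeline through stdin."""
    # The command string is a constant: nothing user-supplied is spliced in.
    cmd = "awk '/regex/ {print}' | sed 's/old/new/g'"
    # Python opens the file, so the shell never sees the file name.
    with open(path) as infile:
        result = subprocess.run(
            cmd, shell=True, stdin=infile,
            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    print(run_pipeline(sys.argv[1]), end="")
```

Note that check=True only reflects the exit status of the last command in the pipeline (sed), not that of awk.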

Just try it like this:

cmd="awk '/regex/ {print}' " + str(sys.argv[1]) + " | sed 's/old/new/g'"
x=os.popen(cmd).read()

2 Comments

sys.argv[1] is already a string, you don't need to use str().
From a security perspective this is horrible. Think about what happens if ./yourpythonprog '$(rm -rf ~)' is invoked.
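To illustrate the concern in the comment above, here is a small sketch of the same concatenation with shlex.quote() applied to the file name. The helper name build_cmd is made up, and a harmless echo stands in for the destructive command. Nothing is executed; the command is only split the way a POSIX shell would split it, to check that the hostile name stays one literal word.

```python
import shlex


def build_cmd(path):
    """Build the awk | sed command line with `path` quoted for the shell."""
    return "awk '/regex/ {print}' " + shlex.quote(path) + " | sed 's/old/new/g'"


# Hypothetical hostile file name, including embedded single quotes.
hostile = "$(echo injected)'$(echo injected)'"
cmd = build_cmd(hostile)

# shlex.split() applies POSIX shell word-splitting rules without running anything.
words = shlex.split(cmd)
print(cmd)
print(words[2] == hostile)
```

Without shlex.quote(), the $(...) part would be expanded by the shell before awk ever ran.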

Your best choice is to implement your logic as pure Python logic, as described in the first part of the answer by @tripleee. Your second best choice is to keep the external tools, but eliminate the need for a shell in invoking them and connecting them together.

See the Python documentation section Replacing Shell Pipelines.

import sys
from subprocess import Popen, PIPE

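with open(sys.argv[1]) as infile:  # close the input file when done
    p1 = Popen(['awk', '/regex/ {print}'], stdin=infile, stdout=PIPE)
    p2 = Popen(['sed', 's/old/new/g'], stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # lets p1 receive SIGPIPE if p2 exits early
    x = p2.communicate()[0]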

Your third best choice is to keep the shell, but pass the data out-of-band from the code:

import subprocess
import sys

p = subprocess.run([
  """awk '/regex/ {print}' <"$1" | sed 's/old/new/g'""",  # code to run
  '_',                                                    # $0 in context of that code
  sys.argv[1]                                             # $1 in context of that code
], shell=True, check=True, stdout=subprocess.PIPE)
print(p.stdout)
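A quick way to see the $0/$1 mapping for yourself is the sketch below, which assumes a POSIX system where shell=True runs /bin/sh. The value is a hypothetical hostile string, again with a harmless echo in place of anything destructive, and text=True assumes Python 3.7 or later.

```python
import subprocess

# Hypothetical hostile value; it is passed as data, never as shell code.
value = "$(echo injected)"

p = subprocess.run(
    ['printf "%s\\n" "$0" "$1"',  # code: the only element the shell parses
     "_",                          # becomes $0
     value],                       # becomes $1
    shell=True, check=True, stdout=subprocess.PIPE, text=True)
print(p.stdout, end="")
```

The second line of output is the value exactly as given, which shows that the shell did not expand the $(...) inside it.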
