
I wanted to call an awk command-line script from Python:

os.system('''awk 'BEGIN{FS="\t";OFS="\n"} {a[$1]=a[$1] OFS $2 FS $3 FS $4} END{for (i in a) {print i a[i]}}' 2_lcsorted.txt > 2_locus_2.txt''')

It gives the following error:

awk: cmd. line:1: BEGIN{FS="    ";OFS="
awk: cmd. line:1:                     ^ unterminated string
awk: cmd. line:1: BEGIN{FS="    ";OFS="
awk: cmd. line:1:                     ^ syntax error
256

When I use subprocess.call instead, a different error appears:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/subprocess.py", line 493, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib64/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1249, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

It runs fine in the shell. All I want to do is combine all the steps in a single Python script, and for obvious reasons awk is better for certain processing steps. Can someone please explain the cause of these errors?

  • What does it do with an r before the '''? (i.e., r'''awk 'BEGIN{FS="\t";OFS="\n"} {...) The \n is being interpreted a step too early. Commented Mar 7, 2013 at 19:32
  • What are you actually trying to accomplish? I can see shelling out to awk to run a pre-written awk script, but why call awk with a hard-coded script when you can just do the same thing in Python? Commented Mar 7, 2013 at 19:38
  • @chepner: I preferred awk because I don't know whether Python can work on stream input. I also believe that parsing is faster in awk. Commented Mar 8, 2013 at 12:41
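
On the question of whether Python can work on stream input: iterating over a file object (or sys.stdin) reads one line at a time. The sketch below is one possible pure-Python version of the awk script from the question, not a tested replacement for it. The function names are made up for illustration, and it keeps keys in first-seen order, whereas awk's `for (i in a)` order is unspecified.

```python
import sys
from collections import OrderedDict


def group_by_first_field(lines):
    """Collect fields 2-4 of each tab-separated line under its first field."""
    groups = OrderedDict()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Mirror awk: fields that are missing behave like empty strings.
        fields += [""] * (4 - len(fields))
        groups.setdefault(fields[0], []).append("\t".join(fields[1:4]))
    return groups


def format_groups(groups):
    """Yield each key followed by its rows, one per line, like the awk END block."""
    for key, rows in groups.items():
        yield "\n".join([key] + rows)


def main():
    # sys.stdin is consumed line by line, so input can be piped in.
    for block in format_groups(group_by_first_field(sys.stdin)):
        print(block)
```

Calling main() in a script would let it be used as `python script.py < 2_lcsorted.txt > 2_locus_2.txt` (the script name is hypothetical).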

2 Answers


You do not want Python to convert \n to a newline character (or \t to a tab) before feeding the string to os.system. Use a raw string, r"""....""", as jwpat7 suggested. Another possibility is to double the backslashes and write something like ... OFS="\\n" ... in the string.
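
A small sketch of the difference, using a shortened version of the command (the variable names are only for illustration):

```python
# Ordinary triple-quoted string: Python itself turns \t and \n into a real
# tab and a real newline before the shell or awk ever sees the text.
plain = '''awk 'BEGIN{FS="\t";OFS="\n"}' '''

# Raw string: the backslashes survive, so awk receives the two-character
# sequences \t and \n and interprets them itself.
raw = r'''awk 'BEGIN{FS="\t";OFS="\n"}' '''

# Doubling the backslashes produces exactly the same text as the raw string.
escaped = '''awk 'BEGIN{FS="\\t";OFS="\\n"}' '''

print(repr(plain))
print(repr(raw))
print(raw == escaped)
```

The real newline inside `plain` is what splits the awk program in two and produces the "unterminated string" error shown in the question.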


Just to add, you might be better off using PyAwk: pyawk.sourceforge.net. Also, if you're using subprocess, the problem is that your command should be split. subprocess works a bit differently from os.system: unless you pass shell=True, subprocess expects the command as a list of arguments, not a single string. For example,

`os.system('''awk 'BEGIN {FS="\t";OFS="\n"} {a[$1]=a[$1] OFS $2 FS $3 FS $4} 
END {for (i in a) {print i a[i]}}' 2_lcsorted.txt > 2_locus_2.txt''')`

shouldn't simply become

`subprocess.call('''awk 'BEGIN {FS="\t";OFS="\n"} {a[$1]=a[$1] OFS $2 FS $3 FS $4} 
END {for (i in a) {print i a[i]}}' 2_lcsorted.txt > 2_locus_2.txt''')`

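That won't work. If you feed subprocess a single string without shell=True, it assumes the whole string is the path to the command you want to execute, which is why you get OSError: [Errno 2] No such file or directory. Without a shell, the command needs to be a list. Check out www.gossamer-threads.com/lists/python/python/724330. Also, because you're using file redirection (>), which only a shell understands, you should use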

`subprocess.call(cmd, shell=True)`
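
If you would rather avoid the shell, here is a rough sketch of the list form, assuming awk is on the PATH. The awk program goes in as one list element (so no shell quoting is needed), and Python opens the output file instead of using >. The names AWK_PROGRAM, build_argv and run_awk are invented for this example.

```python
import subprocess

# The awk program as a raw string, so \t and \n reach awk untouched.
AWK_PROGRAM = r'''BEGIN{FS="\t";OFS="\n"} {a[$1]=a[$1] OFS $2 FS $3 FS $4} END{for (i in a) {print i a[i]}}'''


def build_argv(input_path):
    """Return the argument list: program name, awk script, input file."""
    return ["awk", AWK_PROGRAM, input_path]


def run_awk(input_path, output_path):
    """Run awk without a shell; Python takes care of the output redirection."""
    with open(output_path, "w") as out:
        # stdout=out replaces the shell's "> output_path".
        return subprocess.call(build_argv(input_path), stdout=out)
```

With the files from the question this would be called as `run_awk("2_lcsorted.txt", "2_locus_2.txt")`, and it returns awk's exit status.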
