My work involves instrumenting code fragments in Python code. I am writing a Python script that takes another Python file as input and inserts the necessary code at the required places.

The following is a sample of a file I would be instrumenting:

A.py #normal un-instrumented code

statements
....
....

def move(self,a):
    statements
    ......
    print "My function is defined" 
    ......

statements 
......

My script checks each line of A.py, and if a line contains "def", it inserts a code fragment immediately above that function definition.

The following example shows what the final output should be:

A.py #instrumented code

statements
....
....

@decorator    #<------ inserted code
def move(self,a):
    statements
    ......
    print "My function is defined" 
    ......

statements 
......

But I am getting a different result. The following is the output I actually get:

A.py #instrumented code

statements
....
....

@decorator    #<------ inserted code
def move(self,a):
    statements
    ......
    @decorator #<------ inserted code [this should not occur]
    print "My function is defined" 
    ......

statements 
......

I understand that the script recognizes "def" inside the word "defined" and so inserts the code above that line.

In reality the instrumented code has many of these problems, and I have not been able to instrument the given Python file properly. Is there another way to distinguish an actual "def" from one inside a string?

Thank you

  • How are you finding def in the instrumentation? If you are using a regular expression, try r'\bdef\b'. The \b marks a word boundary. Commented May 29, 2013 at 8:50
  • Will it work even when there is a statement like print "This is a def"? Commented May 29, 2013 at 8:55
  • No. To deal with text embedded in quotes you will need negative look-arounds. Commented May 29, 2013 at 9:05
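
A small sketch of what the word-boundary pattern from these comments does and does not catch. The sample lines are taken from the question; the names WORD_DEF, samples and matches are only illustrative, and the snippet is written for Python 3.

```python
import re

# \b only matches between a word character and a non-word character,
# so "def" inside "defined" is rejected but a standalone "def" is not.
WORD_DEF = re.compile(r"\bdef\b")

samples = [
    "def move(self, a):",
    'print "My function is defined"',
    'print "This is a def"',
]

matches = [bool(WORD_DEF.search(s)) for s in samples]
for found, text in zip(matches, samples):
    print(found, text)
```

The second line is no longer matched, but the third still is, which is the case the last comment refers to.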

2 Answers

3

Use the ast module to parse the file properly.

This code prints the line number and column offset of each def statement:

import ast
with open('mymodule.py') as f:
    tree = ast.parse(f.read())
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print node.lineno, node.col_offset

2 Comments

How do I use the column offset value when inserting a statement above the def, so that it is aligned with the def?
@karthik I'm not sure how tab characters affect col_offset; try it out. I think you need to copy col_offset characters from beginning of the line and use that string to indent @decorator.
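
One way the indentation idea from this comment could look, as a minimal sketch rather than a definitive implementation. It assumes Python 3 and that the input file parses under the same interpreter (so the Python 2 print statement from the question would have to be written as a call). The helper name add_decorators and its decorator parameter are hypothetical.

```python
import ast


def add_decorators(source, decorator="@decorator"):
    """Return source with a decorator line inserted above every def."""
    tree = ast.parse(source)

    # Map each def's 1-based line number to its column offset.
    targets = {
        node.lineno: node.col_offset
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }

    out = []
    for number, line in enumerate(source.splitlines(keepends=True), start=1):
        if number in targets:
            # Copy the leading whitespace of the def line verbatim, so
            # tabs and spaces are reproduced exactly as in the original.
            indent = line[:targets[number]]
            out.append(indent + decorator + "\n")
        out.append(line)
    return "".join(out)


sample = (
    "class Robot:\n"
    "    def move(self, a):\n"
    '        print("My function is defined")\n'
    "\n"
    "def stop():\n"
    "    pass\n"
)
instrumented = add_decorators(sample)
print(instrumented)
```

Because the leading characters of the def line are copied rather than regenerated, the inserted line uses the same mix of tabs and spaces as the def it sits above.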
0

You could use a regular expression. To avoid matching def inside quotes, you can use negative look-arounds:

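import re

with open('A.py') as f:
    for line in f:
        # Match def as a whole word with no quote character directly
        # before it (look-behind) or directly after it (look-ahead).
        m = re.search(r"(?<![\"'])\bdef\b(?![\"'])", line)
        if m:
            print r'@decorator    #<------ inserted code'

        # Trailing comma: the line already ends with a newline.
        print line,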

However, there might be other occurrences of def that neither you nor I have thought of, and if we are not careful we end up writing the Python parser all over again. @Janne Karila's suggestion of using ast.parse is probably safer in the long term.

2 Comments

Then there are multi-line strings.
@JanneKarila: yup, that's one I didn't think of. Which only goes to show that your answer is preferred.
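
The multi-line string case can be illustrated with a short comparison, given here as a sketch under the assumption of Python 3. SOURCE is a made-up example, and regex_hits and ast_hits are illustrative names; the pattern is a look-around regular expression of the kind this answer describes.

```python
import ast
import re

# A function whose docstring contains a line starting with the word "def".
SOURCE = '''\
def move(self, a):
    """Move the robot.

    def is only a word in this docstring
    """
    return a
'''

PATTERN = re.compile(r"(?<![\"'])\bdef\b(?![\"'])")

# Line-by-line regex search: has no idea it is inside a string literal.
regex_hits = [
    number
    for number, line in enumerate(SOURCE.splitlines(), start=1)
    if PATTERN.search(line)
]

# Parsing the file: only real function definitions are reported.
ast_hits = [
    node.lineno
    for node in ast.walk(ast.parse(SOURCE))
    if isinstance(node, ast.FunctionDef)
]

print(regex_hits, ast_hits)
```

The regex reports the docstring line as well as the real definition, while the parsed tree contains only the one function.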
