pyparsing to parse a python function call in its most general form

Question

I would like to use the excellent pyparsing package to parse a python function call in its most general form. I read one post that was somewhat useful here but still not general enough.

I would like to parse the following expression:

f(arg1,arg2,arg3,...,kw1=var1,kw2=var2,kw3=var3,...)

where

arg1,arg2,arg3 ... are any kind of valid python objects (integer, real, list, dict, function, variable name ...)
kw1, kw2, kw3 ... are valid python keyword names
var1,var2,var3 are valid python objects

I was wondering whether a grammar could be defined for such a general template. Perhaps I am asking too much... Would you have any ideas?

Thank you very much for your help.

Eric

Sure it's possible, there is a full python grammar implementation in pyparsing examples (though it looks quite cryptic, but might help).
– bereal, Commented Jan 23, 2013 at 9:39
Depending on what you're doing with it - another option is to use Python's standard library to parse it: see ast.parse and related NodeVisitor and NodeTransformer classes.
– Jon Clements, Commented Jan 23, 2013 at 10:32
That's not the most general form. The most general form is f(arg1, arg2, arg3, ..., kwarg1=val1, kwarg2=val2, ..., *args, **kwargs).
– Bakuriu, Commented Jan 23, 2013 at 11:45
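
The standard-library route mentioned in the comments can be sketched roughly as follows. This is only an illustration, not code from any of the commenters: the helper name split_call is made up for this example, and it assumes Python 3.9 or later for ast.unparse.

```python
import ast

def split_call(source):
    """Split a call expression into its name, positional and keyword parts.

    Hypothetical helper for illustration. Each argument is returned as
    source text; a keyword name of None marks a **mapping argument.
    """
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError("not a function call: %r" % source)
    name = ast.unparse(node.func)
    # *iterable arguments show up in node.args as ast.Starred nodes
    args = [ast.unparse(arg) for arg in node.args]
    keywords = [(kw.arg, ast.unparse(kw.value)) for kw in node.keywords]
    return name, args, keywords

name, args, keywords = split_call("f(1, [2, 3], *rest, x=a.b, **opts)")
print(name, args, keywords)
```

Since Python's own parser does the work, every valid argument form (lambdas, comprehensions, arithmetic) is accepted without any extra grammar.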

Vincent Wen · Accepted Answer · 2020-01-12 05:50:05Z

8

Is that all? Let's start with a simple informal BNF for this:

func_call ::= identifier '(' func_arg [',' func_arg]... ')'
func_arg ::= named_arg | arg_expr
named_arg ::= identifier '=' arg_expr
arg_expr ::= identifier | real | integer | dict_literal | list_literal | tuple_literal | func_call
identifier ::= (alpha|'_') (alpha|num|'_')*
alpha ::= some letter 'a'..'z' 'A'..'Z'
num ::= some digit '0'..'9'

Translating to pyparsing, work bottom-up:

from pyparsing import (Word, alphas, alphanums, Forward, Group,
                       delimitedList, quotedString, unicodeString)

identifier = Word(alphas+'_', alphanums+'_')

# definitions of real, integer, dict_literal, list_literal, tuple_literal go here
# see further text below

# define a placeholder for func_call - we don't have it yet, but we need it now
func_call = Forward()

string = quotedString | unicodeString

# '|' takes the first alternative that matches, so func_call and string must be
# tried before identifier - otherwise identifier would match just the name in
# cos(x), or just the u in u'abc', and the rest would be left unparsed
arg_expr = func_call | string | real | integer | dict_literal | list_literal | tuple_literal | identifier

named_arg = identifier + '=' + arg_expr

# to define func_arg, must first see if it is a named_arg
# why do you think this is?
func_arg = named_arg | arg_expr

# now define func_call using '<<=' instead of '=', to "inject" the definition
# into the previously declared Forward
#
# Group each arg to keep its set of tokens separate, otherwise you just get one
# continuous list of parsed strings, which is almost as worthless as the original
# string
func_call <<= identifier + '(' + delimitedList(Group(func_arg)) + ')'
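
To try the grammar before the literal definitions are filled in, here is a minimal self-contained sketch. The real, integer and list_literal expressions are simplified stand-ins written for this example (assumptions, not the definitions from parsePythonValue), dict and tuple literals are left out, and the argument list is wrapped in Optional so that a call with no arguments also parses.

```python
from pyparsing import (Forward, Group, Optional, Regex, Suppress, Word,
                       alphanums, alphas, delimitedList, quotedString)

identifier = Word(alphas + "_", alphanums + "_")

# Simplified stand-ins (assumptions) for the literal definitions
real = Regex(r"[+-]?\d+\.\d*")
integer = Regex(r"[+-]?\d+")

func_call = Forward()
arg_expr = Forward()

# A list literal may itself contain any argument expression
list_literal = Group(Suppress("[") + Optional(delimitedList(arg_expr)) + Suppress("]"))

# Alternatives that begin with an identifier or digits are ordered longest first
arg_expr <<= func_call | real | integer | quotedString | list_literal | identifier
named_arg = identifier + "=" + arg_expr
func_arg = named_arg | arg_expr
func_call <<= identifier + "(" + Optional(delimitedList(Group(func_arg))) + ")"

result = func_call.parseString("hypot(a, abs(b), scale=2.5)", parseAll=True)
print(result.asList())
```

Each argument comes back as its own sublist, so the nested call abs(b) and the keyword argument scale=2.5 stay separate from the plain argument a.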

Those arg_expr elements could take a while to work through, but fortunately, you can get them off the pyparsing wiki's Examples page: http://pyparsing.wikispaces.com/file/view/parsePythonValue.py

from parsePythonValue import (integer, real, dictStr as dict_literal, 
                              listStr as list_literal, tupleStr as tuple_literal)

You still might get args passed using *list_of_args or **dict_of_named_args notation. Expand arg_expr to support these:

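deref_list = '*' + (identifier | list_literal | tuple_literal)
deref_dict = '**' + (identifier | dict_literal)

# this replaces the earlier arg_expr, so it must be defined before named_arg,
# func_arg and func_call are built from it; identifier stays last so that it
# cannot shadow func_call
arg_expr = func_call | string | real | integer | dict_literal | list_literal | tuple_literal | deref_dict | deref_list | identifier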

Write yourself some test cases now - start simple and work your way up to complicated:

sin(30)
sin(a)
hypot(a,b)
len([1,2,3])
max(*list_of_vals)
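
One way to run these cases is a small loop around parseString. The sketch below is self-contained, so it uses a compact stand-in grammar (an assumption for this example: numbers, lists, identifiers, nested calls and the * and ** forms only) in place of the full set of literals; the helper name check is made up here.

```python
from pyparsing import (Forward, Group, Optional, ParseException, Regex,
                       Suppress, Word, alphanums, alphas, delimitedList)

identifier = Word(alphas + "_", alphanums + "_")
number = Regex(r"[+-]?\d+(\.\d*)?")   # stand-in for real | integer
func_call = Forward()
arg_expr = Forward()
list_literal = Group(Suppress("[") + Optional(delimitedList(arg_expr)) + Suppress("]"))
deref_dict = "**" + identifier
deref_list = "*" + (identifier | list_literal)
arg_expr <<= func_call | number | list_literal | deref_dict | deref_list | identifier
func_arg = (identifier + "=" + arg_expr) | arg_expr
func_call <<= identifier + "(" + Optional(delimitedList(Group(func_arg))) + ")"

tests = ["sin(30)", "sin(a)", "hypot(a,b)", "len([1,2,3])", "max(*list_of_vals)"]

def check(text):
    """Return the parsed tokens as a nested list, or None if parsing fails."""
    try:
        return func_call.parseString(text, parseAll=True).asList()
    except ParseException:
        return None

for text in tests:
    print(text, "->", check(text))
```

Passing parseAll=True makes a partially matched string count as a failure, which is usually what a test case should report.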

Additional argument types that will need to be added to arg_expr (left as further exercise for the OP):

indexed arguments: dictval['a'], divmod(10,3)[0], range(10)[::2]
object attribute references: a.b.c
arithmetic expressions: sin(30), sin(a+2*b)
comparison expressions: sin(a+2*b) > 0.5, 10 < a < 20
boolean expressions: a or b and not (d or c and b)
lambda expressions: lambda x: sin(x+math.pi/2)
list comprehensions
generator expressions
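
As a starting point for two of these items, attribute references and arithmetic expressions, here is a rough sketch built on pyparsing's infixNotation helper. It is one possible approach rather than part of the original answer, and the names dotted_name, operand and arith_expr are chosen for this example; indexing, comparisons, boolean operators, lambdas and comprehensions are not covered.

```python
from pyparsing import (Forward, Group, Optional, Regex, Word, alphanums,
                       alphas, delimitedList, infixNotation, oneOf, opAssoc)

identifier = Word(alphas + "_", alphanums + "_")
# A dotted name such as math.pi is combined into a single token
dotted_name = delimitedList(identifier, delim=".", combine=True)
number = Regex(r"\d+(\.\d*)?")

func_call = Forward()
# func_call is tried before dotted_name so that sin(x) is not read as just sin
operand = func_call | number | dotted_name

# Operators are listed from highest to lowest precedence
arith_expr = infixNotation(operand, [
    (oneOf("* /"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
])

func_call <<= dotted_name + "(" + Optional(delimitedList(Group(arith_expr))) + ")"

parsed = func_call.parseString("sin(a+2*b)", parseAll=True)
print(parsed.asList())
```

infixNotation nests the tokens according to operator precedence, so 2*b ends up grouped inside the addition.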

edited Jan 12, 2020 at 5:50 by Vincent Wen

answered Jan 24, 2013 at 9:50 by PaulMcG


4 Comments

Eurydice Over a year ago

I tried some of the examples you gave, and the grammar failed to parse the len([1,2,3]) string. It gives a TypeError: object of type 'int' has no len() because the [1,2,3] list is concatenated into the integer 123. I tried to figure out why. I thought it might be due to the Optional(Suppress(",")) used in defining listStr, tupleStr and dictStr, but removing them still produces the same error.

PaulMcG Over a year ago

I don't understand why [1,2,3] concatenates to an integer - listStr should parse and convert that to the 3-element list [1,2,3].

Eurydice Over a year ago

That's what puzzles me too, Paul. If I do type(eval(func_call.transformString('[1,2,3]'))) I get <type 'list'>, but when I do func_call.transformString('len([1,2,3])') I get len(123) and not len([1,2,3]) as expected. I will try to dig into this further. If you find the problem in the meantime, you're welcome to share it!

PaulMcG Over a year ago

Pyparsing is no longer hosted on wikispaces.com. Go to github.com/pyparsing/pyparsing
