How to get the names of the named variables from the python string

Question

Is there a graceful way to get the names of the named %s-style variables in a string object? Like this:

string = '%(a)s and %(b)s are friends.'
names = get_names(string)  # ['a', 'b']

Known alternative ways:

Parse names using regular expression, e.g.:

import re
names = re.findall(r'%\((\w+)\)[sdf]', string)  # ['a', 'b']

Use .format()-compatible formatting and Formatter().parse(string).

How to get the variable names from the string for the format() method
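For comparison, the second alternative can be sketched like this. It assumes the string has already been written with {}-style fields; get_format_names is a hypothetical helper name, not something from the string module.

```python
from string import Formatter


def get_format_names(format_string):
    """Return the field names of a {}-style format string, in order."""
    # Formatter().parse() yields (literal_text, field_name, format_spec,
    # conversion) tuples; field_name is None for trailing literal text.
    return [field for _, field, _, _ in Formatter().parse(format_string)
            if field is not None]


names = get_format_names('{a} and {b} are friends.')
```

Here names comes out as a list of the two field names, 'a' and 'b'.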

But what about a string with %s-like variables?

PS: python 2.7

The method you're describing seems to work well. It returns ['a','b']. So what is missing now?
– Adi Levin, Commented Jan 19, 2016 at 13:03
@AdiLevin Way no. 1 requires an additional import. Way no. 2 requires a different string format. I am just curious whether there is a way to get the same result using only the string object's own methods and properties or, maybe, some string module functions.
– hackprime, Commented Jan 19, 2016 at 13:12
What is preventing you from using format() for formatting? This seems like one of those cases where it is simply more powerful.
– Joost, Commented Jan 19, 2016 at 13:15
If you're asking, "Does Python, in the course of performing percent-style formatting, ever produce an intermediary data structure that one could inspect and extract the named parameters from?", it does not. The formatting code is all C, so there's no native method you could invoke; and it basically operates directly on the final string object, so there's no intermediary object to look at.
– Kevin, Commented Jan 19, 2016 at 13:34

J. Beattie · Accepted Answer · 2018-08-13 15:26:25Z

In order to answer this question, you need to define "graceful". Several factors might be worth considering:

Is the code short, easy to remember, easy to write, and self-explanatory?
Does it reuse the underlying logic (i.e. follow the DRY principle)?
Does it implement exactly the same parsing logic?

Unfortunately, the "%" formatting for strings is implemented in the C routine "PyString_Format" in stringobject.c. This routine does not provide an API or hooks that allow access to a parsed form of the format string. It simply builds up the result as it is parsing the format string. Thus any solution will need to duplicate the parsing logic from the C routine. This means DRY is not followed and exposes any solution to breaking if a change is made to the formatting specification.

The parsing algorithm in PyString_Format includes a fair bit of complexity, including handling nested parentheses in key names, so it cannot be fully implemented with a regular expression or with string "split()". Short of copying the C code from PyString_Format and converting it to Python code, I do not see any remotely easy way of correctly extracting the names of the mapping keys under all circumstances.

So my conclusion is that there is no "graceful" way to obtain the names of the mapping keys for a Python 2.7 "%" format string.
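A workaround that sidesteps re-implementing the parser, at the cost of actually running the format operation, is to observe which keys the formatter asks for. The sketch below is an assumption on my part rather than an inspection API: KeyRecorder and get_names_by_formatting are hypothetical names, and it relies on dict subclasses having __missing__ called for absent keys. It only works for strings whose specifiers are all of the %(name)x form; a bare %d mixed in would make the format operation itself fail.

```python
class KeyRecorder(dict):
    """Hypothetical helper: a dict that records every key it is asked for."""

    def __init__(self):
        dict.__init__(self)
        self.seen = []

    def __missing__(self, key):
        # Invoked by the dict lookup for absent keys. Remember the key and
        # return 0, a placeholder accepted by the s, r, d, f, x, c, ...
        # conversions alike.
        self.seen.append(key)
        return 0


def get_names_by_formatting(format_string):
    recorder = KeyRecorder()
    format_string % recorder  # result discarded; only the lookups matter
    return recorder.seen
```

Keys are reported once per use, so a name that appears twice in the string appears twice in the result.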

The following code uses a regular expression to provide a partial solution that covers most common usage:

import re
class StringFormattingParser(object):
    __matcher = re.compile(r'(?<!%)%\(([^)]+)\)[-# +0-9.hlL]*[diouxXeEfFgGcrs]')
    @classmethod
    def getKeyNames(klass, formatString):
        return klass.__matcher.findall(formatString)

# Demonstration of use with some sample format strings
for value in [
    '%(a)s and %(b)s are friends.',
    '%%(nomatch)i',
    '%%',
    'Another %(matched)+4.5f%d%% example',
    '(%(should_match(but does not))s',
    ]:
    print StringFormattingParser.getKeyNames(value)

# Note the following prints out "really does match"!
print '%(should_match(but does not))s' % {'should_match(but does not)': 'really does match'}
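For the nested-parentheses case that the regular expression misses, a hand-written scanner is one option. The following is only a sketch written from the behaviour described above, not a translation of the C source: iter_mapping_keys is a hypothetical name, and the code skips flags, width, precision and length modifiers without validating them or the conversion character.

```python
def iter_mapping_keys(fmt):
    """Yield the mapping keys of a %-style format string (sketch)."""
    i, n = 0, len(fmt)
    while i < n:
        if fmt[i] != '%':
            i += 1
            continue
        i += 1  # step past '%'
        if i < n and fmt[i] == '(':
            # Scan to the matching ')', counting nested parentheses.
            depth, start = 1, i + 1
            i += 1
            while i < n and depth:
                if fmt[i] == '(':
                    depth += 1
                elif fmt[i] == ')':
                    depth -= 1
                i += 1
            if depth:
                raise ValueError('incomplete format key')
            yield fmt[start:i - 1]
        # Skip flags, width, precision and length modifier (not validated).
        while i < n and fmt[i] in '-+ #0':
            i += 1
        while i < n and (fmt[i].isdigit() or fmt[i] == '*'):
            i += 1
        if i < n and fmt[i] == '.':
            i += 1
            while i < n and (fmt[i].isdigit() or fmt[i] == '*'):
                i += 1
        if i < n and fmt[i] in 'hlL':
            i += 1
        i += 1  # step past the conversion character (or the second '%' of '%%')
```

Because the second '%' of an escaped pair is consumed as a conversion character, '%%(nomatch)i' yields nothing, while '%(should_match(but does not))s' yields the full key including its inner parentheses.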

P.S. DRY = Don't Repeat Yourself (https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)

Ilia w495 Nikitin · Accepted Answer · 2016-02-18 04:33:39Z

Alternatively, you can reduce this %-formatting task to the Formatter-based solution.

>>> import re
>>> from string import Formatter
>>> 
>>> string = '%(a)s and %(b)s are friends.'
>>> 
>>> string = re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', string)
>>> 
>>> tuple(fn[1] for fn in Formatter().parse(string) if fn[1] is not None)
('a', 'b')
>>>

In this case you can use both variants of formatting, I suppose.

The regular expression in it depends on what you want.

>>> re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', '%(a)s and %(b)s are %(c)s friends.')
'{a} and {b} are {c} friends.'
>>> re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', '%(a)s and %(b)s are %%(c)s friends.')
'{a} and {b} are %%(c)s friends.'
>>> re.sub(r'((?<!%)%(\((\w+)\)s))', r'{\g<3>}', '%(a)s and %(b)s are %%%(c)s friends.')
'{a} and {b} are %%%(c)s friends.'
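Note that the lookbehind leaves '%%%(c)s' untouched, although %-formatting reads it as a literal '%' followed by the field %(c)s. One way around this, sketched here as an assumption rather than as part of the answer above, is to let the pattern consume '%%' pairs first, so that whatever '%(' remains really starts a field. get_simple_names is a hypothetical name, and only \w+ keys are covered.

```python
import re

# Alternation order matters: an escaped '%%' is consumed before '%(name)'
# gets a chance to match at the second '%'.
_TOKEN = re.compile(r'%%|%\((\w+)\)')


def get_simple_names(format_string):
    """Return \\w+ mapping keys, treating '%%' as an escape (sketch)."""
    return [m.group(1) for m in _TOKEN.finditer(format_string)
            if m.group(1) is not None]
```

With this, '%%(c)s' contributes no name and '%%%(c)s' contributes 'c'.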

Adi Levin · Accepted Answer · 2016-01-19 13:15:39Z

You could also do this:

[y[0] for y in [x.split(')') for x in s.split('%(')] if len(y)>1]

2 Comments

BlackJack Over a year ago

Just like the regex in the question this fails on '%%(a)s'.

Adi Levin Over a year ago

What's the exact requirement then? Besides %(a)s, what are the other kinds of expressions we need to be able to parse? %%(a)s? Anything else?

RootTwo · Accepted Answer · 2016-01-23 02:53:33Z

Don't know if this qualifies as graceful in your book, but here's a short function that parses out the names. No error checking, so it will fail for malformed format strings.

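def get_names(s):
    # Walk the string from one '%' to the next.
    i = s.find('%')
    while 0 <= i < len(s) - 3:
        if s[i+1] == '(':
            # '%(' starts a named field; the name runs up to the next ')'.
            yield s[i+2:s.find(')', i)]
        # Skip the character after '%' so that '%%' is treated as an escape.
        i = s.find('%', i+2)

string = 'abd %(one) %%(two) 99 %%%(three)'
list(get_names(string))  # => ['one', 'three']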
