I'm trying to write a regexp to use in a parser. Specifically, I'd like to be able to parse strings formatted like so:

[SOME-WORD "Quoted string"]

Currently, I'm trying the following expression:

(?P<capital-item>
\[SOME-WORD(\ ?)\"
  (?P<quoted-string>\w+)
\"(\ ?)\])

I'm using Python and re.compile to get a scanner. Once compiled, the regexp doesn't match the example string I gave above. What am I messing up here?

  • Sorry, typo. That's not it. Commented May 3, 2013 at 13:47
  • \w does not match the space between "Quoted" and "string". \w matches only word characters, [a-zA-Z0-9_] in ASCII mode (no space). Commented May 3, 2013 at 13:49
  • Group names must be valid Python identifiers. Neither capital-item nor quoted-string is. Commented May 3, 2013 at 13:49
  • What do you wish to get out of it? An example result would be nice. Is "SOME-WORD" a static part of this? Commented May 3, 2013 at 13:49
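
The two problems named in the comments above can be checked in isolation. This is only a small sketch; the variable names (hyphen_name_ok, word_only, with_space) are made up for illustration and are not from the question.

```python
import re

# Problem 1: a hyphen is not allowed in a named group, so compiling fails.
try:
    re.compile(r'(?P<quoted-string>\w+)')
    hyphen_name_ok = True
except re.error:
    hyphen_name_ok = False

# Problem 2: \w+ cannot cross the space inside the quoted text.
word_only = re.fullmatch(r'\w+', 'Quoted string')

# Allowing a space in the character class lets the whole text match.
with_space = re.fullmatch(r'[\w ]+', 'Quoted string')

print(hyphen_name_ok)      # False
print(word_only)           # None
print(with_space.group())  # Quoted string
```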

1 Answer

>>> import re
>>> text = '[SOME-WORD "Quoted string"]'
>>> pat = r'\[(?P<capitalitem>SOME-WORD)(\ ?)\"(?P<quotedstring>[\w\s]+)\"(\ ?)\]'
>>> re.search(pat, text).groupdict()
{'capitalitem': 'SOME-WORD', 'quotedstring': 'Quoted string'}
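
If the multi-line layout from the question is worth keeping, something like the following might work. It is a sketch under a few assumptions: the pattern is compiled with re.VERBOSE (the question does not say which flags were used), the group names capital_item and quoted_string are underscore variants chosen here, and [^"]* is swapped in for [\w\s]+ so that punctuation inside the quotes is accepted too. ITEM is a hypothetical name.

```python
import re

# re.VERBOSE ignores unescaped whitespace in the pattern, so a literal
# space has to be written as "\ ". Text after "#" is a pattern comment.
ITEM = re.compile(r'''
    \[ (?P<capital_item> SOME-WORD )   # opening bracket and keyword
    \ ? "                              # optional space, opening quote
    (?P<quoted_string> [^"]* )         # anything up to the closing quote
    " \ ? \]                           # closing quote, optional space, bracket
''', re.VERBOSE)

text = '[SOME-WORD "Quoted string"]'
m = ITEM.search(text)
print(m.groupdict())
# {'capital_item': 'SOME-WORD', 'quoted_string': 'Quoted string'}
```

Without re.VERBOSE, the newlines and indentation in a pattern laid out this way are treated as literal characters to match, which by itself would stop the example string from matching.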