I would like to create a dict by parsing a string such as:

<brns ret = "Herld" other = "very">
<brna name = "ame1">

I would like to create a dict that has the following key-value pairs:

dict = {'brnsret': 'Herld', 
        'brnsother':'very',
        'brnaname':'ame1'}

I have a working script that can handle this:

<brns ret = "Herld">
<brna name = "ame1">

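My code to generate the dict:

import re

# A raw string keeps the backslashes in \w and \s from being treated as string escapes.
match_tag = re.search(r'<(\w+)\s(\w+) = "(\w+)">', each_par_line)
if match_tag is not None:
    # Key is tag name + attribute name; value is the attribute value.
    dict_tag[match_tag.group(1) + match_tag.group(2)] = match_tag.group(3)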

But how should I tweak my script to handle more than one attribute pair in a tag?

Thanks

  • Why not split the input on spaces and use list.index() to find each '='? With that index (say, a), you can build the dict from the operands on either side of the equals sign, i.e. the list items at a-1 and a+1. (I suggest this because I'm not very familiar with regex.) Commented Jun 21, 2016 at 1:42
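
A minimal sketch of the split-based idea from the comment above, under some assumptions: the function name parse_tag_line is hypothetical, each line holds exactly one tag, and attribute values contain no spaces. It walks the tokens with enumerate() instead of calling list.index(), so that every '=' in a tag is handled rather than only the first.

```python
def parse_tag_line(line):
    """Pair the tokens on either side of each '=' in a single tag line."""
    tokens = line.split()
    if not tokens:
        return {}
    tag = tokens[0].lstrip("<")  # '<brns' -> 'brns'
    result = {}
    for i, token in enumerate(tokens):
        # Only use an '=' that has a token on both sides.
        if token == "=" and 0 < i < len(tokens) - 1:
            name = tokens[i - 1]
            # Drop the closing '>' (last attribute only) and the quotes.
            value = tokens[i + 1].rstrip(">").strip('"')
            result[tag + name] = value
    return result


print(parse_tag_line('<brns ret = "Herld" other = "very">'))
```

This breaks as soon as a value contains whitespace, which is one reason a real parser or a regex is usually the safer choice.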

1 Answer

An alternative, probably mostly of educational interest: you can pass this kind of string to a lenient HTML parser such as BeautifulSoup:

from bs4 import BeautifulSoup

data = """
<brns ret = "Herld" other = "very">
<brna name = "ame1">
"""

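# Calling the soup object with no arguments is shorthand for find_all(),
# so this iterates over every tag in the parsed input.
d = {tag.name + attr: value
     for tag in BeautifulSoup(data, "html.parser")()
     for attr, value in tag.attrs.items()}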
print(d)

Prints:

{'brnaname': 'ame1', 'brnsother': 'very', 'brnsret': 'Herld'}
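
For comparison, here is a regex-only sketch that stays closer to the code in the question. It is not part of the BeautifulSoup approach above; the names TAG_RE, ATTR_RE, and parse_tags are hypothetical, and it assumes that attribute values are double-quoted and contain no '>' character. The idea is to match each tag once, then scan its attribute text for every name = "value" pair.

```python
import re

# Matches one tag: the tag name, then everything up to the closing '>'.
TAG_RE = re.compile(r'<(\w+)\s+([^>]*)>')
# Matches one attribute pair, allowing optional spaces around '='.
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_tags(text):
    """Build {tag name + attribute name: value} for every tag in text."""
    result = {}
    for tag_match in TAG_RE.finditer(text):
        tag_name, attr_text = tag_match.groups()
        for attr_name, value in ATTR_RE.findall(attr_text):
            result[tag_name + attr_name] = value
    return result


data = """
<brns ret = "Herld" other = "very">
<brna name = "ame1">
"""

print(parse_tags(data))
```

Splitting the work into two patterns avoids trying to capture a variable number of attributes with a single expression, since a repeated capture group in Python's re module keeps only its last match.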