Python: parse a string with regex to build a dictionary

Question

I need to parse the following string in Python to build a dictionary:

2014:02:02-12:24:17 NAMETEST ulogd[4834]: id="xxxx" severity="xxxx" sys="xxxx" sub="xxxx" name="xxxx aaaa" action="xxxx" fwrule="xxxx" outitf="xxxx" srcmac="xxxx" srcip="xxxx" dstip="xxxx" proto="x" length="xxxx" tos="xxxx" prec="xxxx" ttl="xx" srcport="xxxx" dstport="xxxx" tcpflags="xxxx"

I can't use split(' ') because some fields, for example name="xxxx aaaa", can contain a space.

First, I extracted only the values with the following regex:

re.findall('"([^"]*)"', line)

But now I need the result as a dictionary, so that, for example, line['id'] = 1111.

What regex would do that? Any ideas?

alecxe · Accepted Answer · 2015-06-09 12:23:26Z

You can use re.findall() to find the key-value pairs:

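>>> import re
>>> # s is the log line from the question
>>> s = '2014:02:02-12:24:17 NAMETEST ulogd[4834]: id="xxxx" severity="xxxx" sys="xxxx" sub="xxxx" name="xxxx aaaa" action="xxxx" fwrule="xxxx" outitf="xxxx" srcmac="xxxx" srcip="xxxx" dstip="xxxx" proto="x" length="xxxx" tos="xxxx" prec="xxxx" ttl="xx" srcport="xxxx" dstport="xxxx" tcpflags="xxxx"'
>>> groups = re.findall(r'(\w+)="(.*?)"', s)
>>> line = dict(groups)
>>>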
>>> from pprint import pprint
>>> pprint(line)
{'action': 'xxxx',
 'dstip': 'xxxx',
 'dstport': 'xxxx',
 'fwrule': 'xxxx',
 'id': 'xxxx',
 'length': 'xxxx',
 'name': 'xxxx aaaa',
 'outitf': 'xxxx',
 'prec': 'xxxx',
 'proto': 'x',
 'severity': 'xxxx',
 'srcip': 'xxxx',
 'srcmac': 'xxxx',
 'srcport': 'xxxx',
 'sub': 'xxxx',
 'sys': 'xxxx',
 'tcpflags': 'xxxx',
 'tos': 'xxxx',
 'ttl': 'xx'}

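(\w+)="(.*?)" matches one or more word characters (the \w+ part: letters, digits, and underscores), followed by =", followed by any characters up to the next " (.*?, non-greedy), followed by that closing ". The parentheses define capturing groups, so re.findall() returns a list of (key, value) tuples, which dict() turns into a dictionary.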
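If many lines need parsing, the same pattern can be wrapped in a small helper. This is only a sketch: the names PAIR_RE and parse_pairs and the sample values are made up for illustration and do not come from the question. It also shows that every value comes back as a string, so a number such as the one in line['id'] = 1111 has to be converted explicitly.

```python
import re

# Hypothetical helper; the pattern itself is the one explained above.
PAIR_RE = re.compile(r'(\w+)="(.*?)"')


def parse_pairs(text):
    """Return a dict built from every key="value" pair found in text."""
    return dict(PAIR_RE.findall(text))


# Made-up sample line in the same shape as the one in the question.
sample = 'ulogd[4834]: id="2001" name="Packet dropped" ttl="64"'
fields = parse_pairs(sample)

# A value containing a space is kept whole.
print(fields["name"])

# Values are always strings; convert when a number is needed.
ttl = int(fields["ttl"])
```

Compiling the pattern once is optional, but it keeps the regex in a single place when the helper is called for every line of a log file.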

edited Jun 9, 2015 at 12:23

answered Jun 9, 2015 at 12:16
