Python: regex parse text to create dict

Question

I have a problem with one task. I have the following output from Cisco hardware:

IP access list 100  
        10 permit igmp any any  
        20 deny any any  
IP access list 200  
        10 permit ip 192.168.1.1/32   
        20 permit ip 192.168.2.1/32 any  
        30 permit ip 192.168.3.3/32 any  
        40 deny any any

The task is to build a dict with the access list number as the key and the access list rule numbers as the value.

acl_dict = {'100' : '10', '100' : '20','200': '10', '200': '20', '200': '30', '200': '40'}

I have written a regex:

import re

rx = re.compile(r"""
    list\s(.*)[\n\r]
    \s{4}(\d{1,3}).+$
""", re.MULTILINE | re.VERBOSE)
for match in rx.finditer(text):
    print(match.group(1))
    print(match.group(2))

But it shows only the numbers from the first two lines (100 and 10). I need to modify the regex somehow so that it matches all the numbers and lets me build the dict I need. Can anyone help?

Alberto Re · Accepted Answer · 2016-10-26 11:32:34Z

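It's possible to do it in a single pass with the third-party regex module (installable from PyPI), whose match objects keep every capture of a repeated group and expose them through captures():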

import regex

text = """
IP access list 100  
    10 permit igmp any any  
    20 deny any any  
IP access list 200  
    10 permit ip 192.168.1.1/32   
    20 permit ip 192.168.2.1/32 any  
    30 permit ip 192.168.3.3/32 any  
    40 deny any any
"""

acl_dict = {}
rx = regex.compile(r"list\s(\d+)\s*[\n\r](\s{4}(\d{1,3}).+[\n\r])*")
for match in rx.finditer(text):
    # captures(3) returns every value group 3 matched, not just the last one
    acl_dict[match.group(1)] = match.captures(3)

print(acl_dict)

Output:

$ python3 match.py 
{'100': ['10', '20'], '200': ['10', '20', '30', '40']}
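
For comparison, here is a minimal sketch of why captures() is needed for this approach. It assumes a pattern of the same shape as the one in this answer, compiled with the standard-library re module instead, and a shortened sample with 4-space indentation. With re, a repeated group keeps only its last repetition, so a single match cannot return all rule numbers.

```python
import re

# Shortened sample (assumption: rule lines are indented with 4 spaces).
text = (
    "IP access list 100\n"
    "    10 permit igmp any any\n"
    "    20 deny any any\n"
)

# Group 3 sits inside a repeated group, so it is matched once per rule line.
rx = re.compile(r"list\s(\d+)\s*[\n\r](\s{4}(\d{1,3}).+[\n\r])*")
match = rx.search(text)

acl_number = match.group(1)
last_rule = match.group(3)  # re keeps only the final repetition of group 3

print(acl_number, last_rule)
```

Here rule 10 is matched but then overwritten by rule 20, which is why the stdlib approach needs either a second regex pass per block or a line-by-line loop.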

Wiktor Stribiżew · Accepted Answer · 2016-10-26 11:20:04Z

You may extract full blocks first, and then get the leading numbers from the inner parts (that can be captured).

Use

r'(?sm)IP access list\s+(\d+)(.*?)(?=^IP access list|\Z)'

Details:

(?sm) - enable the DOTALL and MULTILINE modes
IP access list - a literal string IP access list (can be prepended with ^ if it is always at the line start)
\s+ - 1 or more whitespaces
(\d+) - Group 1: one or more digits
(.*?) - Group 2: any 0+ chars as few as possible up to the first...
(?=^IP access list|\Z) - IP access list at the start of a line or end of string (\Z).

Python sample code:

import re
input_str = "IP access list 100  \n        10 permit igmp any any  \n        20 deny any any  \nIP access list 200  \n        10 permit ip 192.168.1.1/32   \n        20 permit ip 192.168.2.1/32 any  \n        30 permit ip 192.168.3.3/32 any  \n        40 deny any any"
results = {}
for match in re.finditer(r"(?sm)IP access list\s+(\d+)(.*?)(?=^IP access list|\Z)", input_str):
    fields = re.findall(r"(?m)^\s*(\d+)", match.group(2))
    results[match.group(1)] = fields
print(results) # => {'100': ['10', '20'], '200': ['10', '20', '30', '40']} (insertion order on Python 3.7+)
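
As an alternative that is not taken from either answer, the same result can be sketched with a plain line-by-line loop. The helper name parse_acls and the two patterns are hypothetical, and the sketch assumes every header line starts with "IP access list" and every rule line starts with whitespace followed by the rule number, as in the sample output. Since a dict cannot hold the same key twice, the layout requested in the question is not possible as written, so the sketch also flattens the result into (acl, rule) pairs.

```python
import re

# Assumed line shapes, taken from the sample output in the question.
HEADER = re.compile(r"^IP access list\s+(\d+)")
RULE = re.compile(r"^\s+(\d+)\s")


def parse_acls(text):
    """Map each ACL number to the list of its rule numbers (hypothetical helper)."""
    acls = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # A new access list starts; later rule lines belong to it.
            current = header.group(1)
            acls[current] = []
            continue
        rule = RULE.match(line)
        if rule and current is not None:
            acls[current].append(rule.group(1))
    return acls


sample = (
    "IP access list 100  \n"
    "        10 permit igmp any any  \n"
    "        20 deny any any  \n"
    "IP access list 200  \n"
    "        10 permit ip 192.168.1.1/32   \n"
    "        40 deny any any"
)

acls = parse_acls(sample)
# Flatten to (acl, rule) pairs when one entry per rule is needed.
pairs = [(acl, rule) for acl, rules in acls.items() for rule in rules]
print(acls)
print(pairs)
```

The loop trades the lookahead for explicit state (the current ACL number), which can be easier to extend if other line types need handling later.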
