I am trying to parse a command output, which looks like:

2.437 GHz (Channel 6)
Quality=39/70  Signal level=-71 dBm
Encryption key:on
ESSID:"testssid"
IE: IEEE 802.11i/WPA2 Version 1
IE: WPA Version 1

...and essentially convert it to:

channel = 6
quality = "39/70"
signal = -71
encryption = true
essid = "testssid"
wpa = true

I am not particularly good with regular expressions but here's my attempt at extracting these fields:

    m = re.search('Channel (.+)\)', n)
    if m:
            print m.group(1)

    m = re.search('Quality\=(.{5})', n)
    if m:
            print m.group(1)

    m = re.search('level\=(.+)', n)
    if m:
            print m.group(1)

    m = re.search('key\:(.+)', n)
    if m:
            print m.group(1)

    m = re.search('ESSID\:\"(.+?)\"', n)
    if m:
            print m.group(1)

This outputs:

6
39/70
-71 dBm
on
testssid

There are two problems. The first is the 'Quality' value: the pattern hardcodes a length of 5 characters, so it will break if the value is shorter or longer than that. The second is the 'Signal level' value, which I'd rather have without the "dBm" portion. I guess in both cases I'd like to match up to the next whitespace character, but I couldn't get that working with \s.

Also, having several separate re.search calls looks cluttered and messy. Is there a way to combine them or tidy this up in general?

Thanks.

  • The problem is that you always use the dot . instead of an appropriate character class (for example [0-9/] for the quality item). A more specific pattern is both faster and safer. Second, if the information always comes in the same order and format, you can try to extract everything you want with a single pattern (use named captures). Or you can read your string line by line. (The idea is to avoid searching the full string for each field you need.) Commented May 21, 2014 at 16:16
  • @CasimiretHippolyte Thanks for the pointer, I've fixed the code snippet to use appropriate character classes now. Commented May 21, 2014 at 17:10
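
The single-pattern idea from the first comment could look roughly like the sketch below. It assumes the fields always appear in the order shown in the question; the names SAMPLE and CELL_RE and the group names are made up for illustration.

```python
import re

# Sample text copied from the question.
SAMPLE = '''2.437 GHz (Channel 6)
Quality=39/70  Signal level=-71 dBm
Encryption key:on
ESSID:"testssid"
IE: IEEE 802.11i/WPA2 Version 1
IE: WPA Version 1
'''

# One verbose pattern with named groups. Whitespace inside the pattern is
# ignored (re.VERBOSE), and ".*?" may cross line breaks (re.DOTALL).
# Assumption: the fields always appear in exactly this order.
CELL_RE = re.compile(r'''
    Channel\s+(?P<channel>\d+)\)   .*?
    Quality=(?P<quality>\d+/\d+)   .*?
    level=(?P<signal>[+-]?\d+)     .*?
    key:(?P<key>on|off)            .*?
    ESSID:"(?P<essid>[^"]*)"
''', re.VERBOSE | re.DOTALL)

m = CELL_RE.search(SAMPLE)
# groupdict() maps each group name to its matched text (all strings).
fields = m.groupdict() if m else {}
print(fields)
```

The trade-off is that one missing or reordered field makes the whole match fail, whereas separate searches fail one field at a time.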

1 Answer

    re.search(r'Quality=(\d+/\d+)', n)   # matches number/number, e.g. 39/70
    re.search(r'level=([+-]?\d+)', n)    # matches an optionally signed integer, so the trailing "dBm" is left out

To clean it up, you could do:

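    import re

    # Field name -> regex with one capture group holding the value
    patterns = {'quality': r'Quality=(\d+/\d+)',
                'level': r'level=([+-]?\d+)',
                'key': r'key:(.+)',
                'channel': r'Channel (.+)\)'}
    with open("somefile.txt") as f:
        body_of_text = f.read()
    # Search each pattern once; a field that does not match is stored as None
    matches = {key: re.search(regex, body_of_text) for key, regex in patterns.items()}
    results = {key: m.group(1) if m else None for key, m in matches.items()}
    print(results)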
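
To get all the way to the typed values the question asks for (integers for channel and signal, booleans for encryption and wpa), the dictionary approach can be extended with a converter per field. This is only a sketch: parse_cell, PATTERNS, and CONVERTERS are hypothetical names, and treating any "IE:" line that mentions WPA as wpa = true is an assumption about what the asker wants.

```python
import re

# Sample text copied from the question.
SAMPLE = '''2.437 GHz (Channel 6)
Quality=39/70  Signal level=-71 dBm
Encryption key:on
ESSID:"testssid"
IE: IEEE 802.11i/WPA2 Version 1
IE: WPA Version 1
'''

# Field name -> regex with a single capture group.
PATTERNS = {
    'channel': r'Channel (\d+)\)',
    'quality': r'Quality=(\d+/\d+)',
    'signal': r'level=([+-]?\d+)',
    'encryption': r'key:(on|off)',
    'essid': r'ESSID:"([^"]*)"',
}

# Optional per-field conversion; fields not listed stay as strings.
CONVERTERS = {
    'channel': int,
    'signal': int,
    'encryption': lambda value: value == 'on',
}

def parse_cell(text):
    """Return a dict of parsed fields; fields that are not found map to None."""
    result = {}
    for name, regex in PATTERNS.items():
        m = re.search(regex, text)
        if m is None:
            result[name] = None
            continue
        convert = CONVERTERS.get(name, str)
        result[name] = convert(m.group(1))
    # Assumption: any "IE:" line mentioning WPA means wpa = true.
    result['wpa'] = re.search(r'^\s*IE:.*WPA', text, re.MULTILINE) is not None
    return result

print(parse_cell(SAMPLE))
```

Keeping the patterns and converters in separate dictionaries means a new field only needs one extra entry in each, not another block of search-and-print code.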

2 Comments

Thanks, with your fixes the regex matches are what I originally wanted. Regarding the second bit, I am getting "ValueError: too many values to unpack" when I run it. Any ideas?
Yeah, I screwed up. I've edited the answer; it should work now.
