
Basically, I have a file like this:

Url/Host:   www.example.com
Login:     user
Password:   password
Data_I_Dont_Need:    something_else

How can I use RegEx to separate the details to place them into variables?

Sorry if this is a terrible question, I can just never grasp RegEx. So another question would be, can you provide the RegEx, but kind of explain what each part of it is for?

    Using str.split(":") is not an option? Commented May 16, 2010 at 19:02

5 Answers


You should put the entries in a dictionary, not in so many separate variables -- clearly, the keys you're using need NOT be acceptable as variable names (that slash in 'Url/Host' would be a killer!-), but they'll be just fine as string keys into a dictionary.

import re

there = re.compile(r'''(?x)      # verbose flag: allows comments & whitespace
                       ^         # anchor to the start
                       ([^:]+)   # group with 1+ non-colons, the key
                       :\s*      # colon, then arbitrary whitespace
                       (.*)      # group everything that follows
                       $         # anchor to the end
                    ''')

and then

configdict = {}
with open('thefile.txt') as f:
    for aline in f:
        mo = there.match(aline)
        if not mo:
            print("Skipping invalid line %r" % aline)
            continue
        k, v = mo.groups()
        configdict[k] = v

The ability to make RE patterns "verbose" (by starting them with (?x) or passing re.VERBOSE as the second argument to re.compile) is very useful: it lets you clarify your REs with comments and nicely aligned whitespace. I think it's sadly underused;-).
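For comparison, here is a minimal sketch of the same pattern with re.VERBOSE passed as the second argument instead of the inline (?x) prefix. The name line_re and the sample line are only illustrative; the sample is copied from the question.

```python
import re

# Same pattern as above, with the verbose flag passed as an argument
# instead of the inline (?x) prefix.
line_re = re.compile(r'''
    ^          # anchor to the start
    ([^:]+)    # group 1: one or more non-colons, the key
    :\s*       # colon, then optional whitespace
    (.*)       # group 2: everything that follows, the value
    $          # anchor to the end
''', re.VERBOSE)

# Sample line taken from the question (with its trailing newline).
mo = line_re.match("Url/Host:   www.example.com\n")
key, value = mo.groups()
print(key, value)
```

Both spellings should compile to an equivalent pattern; which one to use is mostly a matter of taste.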


5 Comments

Nice answer and great explanation. I think I'd like potential whitespace on the value removed. I believe that could be done by adding \s* between the value group and the end-of-line anchor '$'?
AttributeError: 'NoneType' object has no attribute 'group'
@Rob, you mean groups, not group. Yes, I forgot to add the continue obviously needed to do the skip, let me add it. BTW, your question doesn't mention that there can be lines that don't match this pattern, and what to do when such lines are found -- please edit your Q to add this crucial information!
@extraneon, if you want to remove trailing whitespace on the value, change the end of the RE's pattern to (.*?)\s*$. The ? here is crucial as it tells the RE to do the star-match non-greedily: without it, it would still match the trailing whitespace as part of this group!
Sorry, didn't realize it mattered. Edited it.
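To illustrate the greedy versus non-greedy point from the comments, here is a small sketch that compares the two pattern endings. The sample line with trailing spaces is made up for the demonstration.

```python
import re

# Hypothetical line with trailing spaces after the value.
line = "Login:     user   \n"

# Greedy value group: trailing spaces end up inside the group.
greedy = re.match(r'^([^:]+):\s*(.*)$', line)

# Non-greedy value group followed by \s*: trailing spaces are left out.
lazy = re.match(r'^([^:]+):\s*(.*?)\s*$', line)

print(repr(greedy.group(2)))
print(repr(lazy.group(2)))
```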

For a file as simple as this you don't really need regular expressions. String functions are probably easier to understand. This code:

def parse(data):
    parsed = {}
    for line in data.split('\n'):
        if not line.strip(): continue  # Blank line
        pair = line.split(':', 1)  # split on the first colon only
        parsed[pair[0].strip()] = pair[1].strip()
    return parsed

if __name__ == '__main__':
    test = """Url/Host:   www.example.com
    Login:     user
    Password:   password
"""
    print(parse(test))

Will do the job, and results in:

{'Url/Host': 'www.example.com', 'Login': 'user', 'Password': 'password'}
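A variant of the same string-based idea, sketched with str.partition: it splits on the first colon only, so a value that itself contains colons (a URL with a scheme or port, say) stays intact, and lines without a colon are skipped instead of raising an error. The function name parse_pairs and the sample text are assumptions made for this example.

```python
def parse_pairs(text):
    """Parse 'key: value' lines into a dict, splitting on the first colon only."""
    parsed = {}
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if not sep:
            continue  # no colon found: blank or malformed line
        parsed[key.strip()] = value.strip()
    return parsed

# Hypothetical input: a value with extra colons, a blank line, a malformed line.
sample = """Url/Host:   http://www.example.com:8080
Login:     user

not a key-value line
"""
result = parse_pairs(sample)
print(result)
```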


Well, if you don't know regex, simply change your file like this (ConfigParser requires a section header, so [data] is added here as an example name):

[data]
Host = www.example.com
Login = user
Password = password

And use the ConfigParser Python module: http://docs.python.org/library/configparser.html

1 Comment

ConfigParser supports : delimiter stackoverflow.com/questions/2845018/…

EDIT: Better Solution

import re

for line in input_lines:  # any iterable of lines, e.g. an open file
    key, val = re.search(r'(.*?):\s*(.*)', line).groups()
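Since the question asks for the details to end up in variables, here is a hedged sketch that applies the same pattern, skips lines where re.search returns None, and then pulls out the three wanted fields. The names PAIR_RE, lines and fields are illustrative, and the in-memory list stands in for the real file.

```python
import re

PAIR_RE = re.compile(r'(.*?):\s*(.*)')

# Hypothetical in-memory copy of the file from the question.
lines = [
    "Url/Host:   www.example.com",
    "Login:     user",
    "Password:   password",
    "Data_I_Dont_Need:    something_else",
    "",
]

fields = {}
for line in lines:
    mo = PAIR_RE.search(line)
    if mo is None:
        continue  # no colon on this line, nothing to extract
    key, val = mo.groups()
    fields[key] = val

# Pick out only the entries that are needed.
host = fields["Url/Host"]
login = fields["Login"]
password = fields["Password"]
print(host, login, password)
```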


The ConfigParser module (configparser in Python 3) supports the ':' delimiter.

import configparser
from io import StringIO

class Parser(configparser.RawConfigParser):
    def _read(self, fp, fpname):
        # Prepend a dummy section header, since the file has none
        data = StringIO("[data]\n" + fp.read())
        return super()._read(data, fpname)

p = Parser()
p.read("file.txt")
print(dict(p.items("data")))

Output:

{'url/host': 'www.example.com', 'login': 'user', 'password': 'password'}

Though a regex or manual parsing might be more appropriate in your case.
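On Python 3, a possible alternative that avoids overriding a private method is to prepend the dummy section header yourself and feed the text to read_string. This is only a sketch: the section name "data" and the inline sample text are assumptions, and in practice the text would come from reading the file.

```python
import configparser

# Hypothetical file contents; in practice, read these from the file.
text = """Url/Host:   www.example.com
Login:     user
Password:   password
"""

# interpolation=None keeps '%' characters in values (e.g. passwords) literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string("[data]\n" + text)  # prepend a dummy section header

settings = dict(parser.items("data"))
print(settings)
```

Note that option names are lowercased by default, so the keys come back as 'url/host', 'login' and 'password'.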
