I am using Python to parse Postfix log files. I need to match lines containing any of several patterns and extract the IP address if a line matches.

import re

# Try each pattern in turn; the IP address is always capture group 1.
ip = re.search(r'^warning: Connection rate limit exceeded: [0-9]* from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\] for service smtp', message)
if not ip:
    ip = re.search(r'^NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*: Relay access denied; .*', message)
    if not ip:
        ip = re.search(r'^NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*:  Recipient address rejected: .*', message)
...
...
print(ip.group(1))

Any line will only ever match one pattern. I know that normally I can use '(pattern1|pattern2|pattern3)' to match any of multiple patterns, but since I am already using parentheses () to capture the IP address I want to extract, I don't know how to do that.

I will have quite a lot of patterns to match. What would be the cleanest, most elegant way to do it?

  • Please include your input, current output and expected output. Commented Mar 7, 2016 at 13:55

1 Answer

You can use a non-capturing group:

import re

patterns = [
    r"warning: Connection rate limit exceeded: [0-9]* from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\] for service smtp",
    r"NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*: Relay access denied; .*",
    r"NOQUEUE: reject: RCPT from .*\[([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\]: .*:  Recipient address rejected: .*"
]
# (?:...) groups the alternatives without adding a capture group of its own.
# Each alternative still has its own IP group, numbered 1, 2, 3 in list order.
pattern = re.compile("^(?:" + "|".join(patterns) + ")")
ip = pattern.search(message)
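To see what the non-capturing group changes, here is a minimal sketch. The toy patterns and the input string are made up for illustration and are not taken from the Postfix logs; only `re.compile`, `search`, and `groups()` from the standard `re` module are used.

```python
import re

# Capturing group: the alternation itself becomes group 1,
# which shifts the number of every group after it.
capturing = re.compile(r"^(foo|bar): (\d+)")

# Non-capturing group: the alternation is grouped but gets no number,
# so the digits stay in group 1.
non_capturing = re.compile(r"^(?:foo|bar): (\d+)")

line = "bar: 42"  # hypothetical input

cap_groups = capturing.search(line).groups()
noncap_groups = non_capturing.search(line).groups()

print(cap_groups)     # ('bar', '42')
print(noncap_groups)  # ('42',)
```

Note that `(?:...)` only avoids adding an extra group for the alternation. Capture groups inside each alternative are still numbered separately, left to right.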

5 Comments

What does "^(?:" in the re.compile expression mean?
@MartinVegter ^ anchors the match at the beginning of the string, and (?:...) is a non-capturing group.
It almost works, but when I do if ip: print(ip.group(1)), it prints None. The pattern evidently matched (because if ip evaluated as true), but it does not print the matched IP.
I see what's happening: ip.group(1) is None when the second pattern matched, in which case the IP is in ip.group(2). But I don't know which of my patterns matched. How can I print ip.group(n), where n is the only group with a non-None value?
@MartinVegter Got it. Have you tried looking into groups()? Thanks.
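Following up on the groups() hint, here is one possible sketch of picking the only non-None group from the combined pattern. The helper name `extract_ip`, the shared `IP` sub-pattern, and the sample log line are assumptions made for this example; only two of the patterns are included to keep it short.

```python
import re

# IPv4 sub-pattern with one capture group, reused in every alternative.
IP = r"([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"

patterns = [
    r"warning: Connection rate limit exceeded: [0-9]* from .*\[" + IP + r"\] for service smtp",
    r"NOQUEUE: reject: RCPT from .*\[" + IP + r"\]: .*: Relay access denied; .*",
]
pattern = re.compile("^(?:" + "|".join(patterns) + ")")


def extract_ip(message):
    """Return the IP captured by whichever alternative matched, or None."""
    match = pattern.search(message)
    if match is None:
        return None
    # Only one alternative can match, so exactly one group is not None.
    return next(g for g in match.groups() if g is not None)


# Hypothetical sample line, made up for this example.
sample = ("NOQUEUE: reject: RCPT from unknown[192.0.2.10]: 554 5.7.1 "
          "<a@example.com>: Relay access denied; from=<b@example.com>")
print(extract_ip(sample))
```

Since each alternative contains a single capture group, `match.group(match.lastindex)` should give the same result. The generator over `groups()` just makes the intent more explicit.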
