I'm using regex library 're' in Python (2.7) to validate a flight number.

I've had no issues with expected outputs using a really helpful online editor here: http://regexr.com/

My results on regexr.com are: https://i.sstatic.net/YC7ra.jpg

My code is:

import re
test1 = 'ba116'
###Referencelink: http://academe.co.uk/2014/01/validating-flight-codes/
p = re.compile('/^([a-z][a-z]|[a-z][0-9]|[0-9][a-z])[a-z]?[0-9]{1,4}[a-z]?$/g')
m = p.search(test1)  # p.match() to find from start of string only
if m:
print 'It works!: ', m.group()  # group(1...n) for capture groups
else:
print 'Did not work'

I'm unsure why I get the 'Did not work' output when regexr shows one match (as expected).

I tried a much simpler regex and the results were correct, so it seems that either my regex string is invalid, or I'm using re.compile (or perhaps the if statement) incorrectly.

'ba116' is valid, and should match.

  • Is your code really indented like that? It should be throwing a syntax error. Commented Oct 13, 2016 at 17:00
  • Remove the / characters; you don't need them in Python. That's probably why it's not working. Commented Oct 13, 2016 at 17:05
  • Yes. As in: imgur.com/QqK3HsX - I don't get any errors; code executes and ends with 'Process finished with exit code 0' Commented Oct 13, 2016 at 17:06
  • Removed '/' resulting in: p = re.compile('^([a-z][a-z]|[a-z][0-9]|[0-9][a-z])[a-z]?[0-9]{1,4}[a-z]?$g') - but exact same result Commented Oct 13, 2016 at 17:08
  • ... So, in fact, your code is NOT indented the way it is shown in the text of the question. Python is an offside-rule language; it is critical to preserve its indentation when providing people with code samples. Commented Oct 13, 2016 at 17:09

1 Answer

Python's re.compile is treating your leading / and trailing /g as part of the regular expression, not as delimiters and modifiers. This produces a compiled RE that will never match anything, since you have ^ with stuff before it and $ with stuff after it.
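A minimal sketch of that behaviour, using a shortened stand-in pattern rather than the full flight-number regex (the pattern and sample strings here are illustrative assumptions, not taken from the question). It suggests that the slashes and the g are matched as ordinary characters:

```python
import re

# Python does not strip JavaScript-style delimiters: '/' and 'g' are literals.
with_delims = re.compile('/^[a-z]{2}[0-9]{1,4}$/g')
without_delims = re.compile('^[a-z]{2}[0-9]{1,4}$')

# None: a literal '/' would have to appear before the start-of-string anchor.
delims_match = with_delims.search('ba116')

# A match object covering the whole string.
plain_match = without_delims.search('ba116')

# Without the anchors, the same characters simply match themselves.
literal_match = re.search('/[0-9]+/g', 'x/116/g')

print(delims_match is None)
print(plain_match.group())
print(literal_match.group())
```

The last search is only there to show that '/' and 'g' have no special meaning to the re module.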

The first argument to re.compile should be a string containing only the stuff you would put inside the slashes in a language that had /.../ regex notation. The g modifier corresponds to calling the findall method on the compiled RE; in this case it appears to be unnecessary. (Some of the other modifiers, e.g. i, s, m, correspond to values passed to the second argument to re.compile.)
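As a rough illustration of how those modifiers translate, here is a sketch that uses a simplified stand-in pattern and made-up sample text (both are assumptions for the example, not part of the original code):

```python
import re

# The i modifier (/.../i) becomes the re.IGNORECASE flag (short alias: re.I),
# passed as the second argument to re.compile.
p = re.compile(r'^[a-z]{2}[0-9]{1,4}$', re.IGNORECASE)
upper_match = p.search('BA116')

# The g modifier (/.../g) becomes a call to findall (or finditer),
# which returns every non-overlapping match instead of just the first.
codes = re.findall(r'[a-z]{2}[0-9]{1,4}', 'ba116 then lh400 then qf1')

print(upper_match.group())
print(codes)
```

For validating a single string, as in the question, search or match on an anchored pattern is enough and findall is not needed.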

So this is what your code should look like:

import re
test1 = 'ba116'
###Referencelink: http://academe.co.uk/2014/01/validating-flight-codes/
p = re.compile(r'^([a-z][a-z]|[a-z][0-9]|[0-9][a-z])[a-z]?[0-9]{1,4}[a-z]?$')
m = p.search(test1)  # p.match() to find from start of string only
if m:
    print 'It works!: ', m.group()  # group(1...n) for capture groups
else:
    print 'Did not work'

The r immediately before the open quote makes no difference for this regular expression, but if you needed to use backslashes in the RE it would save you from having to double all of them.
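A small sketch of where the r prefix starts to matter (the word-boundary pattern and the sample sentence are assumptions chosen for illustration):

```python
import re

# With a raw string, one backslash is enough; without it, each must be doubled.
raw = r'\d{1,4}'
escaped = '\\d{1,4}'
same = (raw == escaped)

# In a normal string '\b' is a backspace character, so the pattern looks for
# literal backspaces; in a raw string it reaches re as a word boundary.
no_raw = re.search('\bba116\b', 'flight ba116 departs')
with_raw = re.search(r'\bba116\b', 'flight ba116 departs')

print(same)
print(no_raw is None)
print(with_raw.group())
```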
