0

I have the following regex:

(\b)(con)    

This matches:

.con
con

But I only want to match the second line, 'con', not '.con'.

This then needs expanding to enable me to match alternative words (CON|COM1|LPT1) etc. And in those scenarios, I need to match the dot afterwards and potentially file extensions too. I have regex for these. I am attempting to understand one part of the expression at a time.

How can I tighten what I've got to give me the specific match I require?

2
  • You need to add a ^ prefix to your regexp. Commented Dec 15, 2013 at 21:00
  • Maybe you should look at os.path.splitext. Commented Dec 16, 2013 at 16:24

2 Answers

8

Edit:

You can use non-capturing groups and re.match (which only matches at the start of the string):

>>> from re import match
>>> strs = ["CON.txt", "LPT1.png", "COM1.html", "CON.jpg"]
>>> # This can be customized to what you want
>>> # Right now, it is matching .jpg and .png files with the proper beginning
>>> [x for x in strs if match(r"(?:CON|COM1|LPT1)\.(?:jpg|png)$", x)]
['LPT1.png', 'CON.jpg']
>>>

Below is a breakdown of the Regex pattern:

(?:CON|COM1|LPT1)  # CON, COM1, or LPT1
\.                 # A period
(?:jpg|png)        # jpg or png
$                  # The end of the string

You may also want to add (?i) to the start of the pattern in order to have case-insensitive matching.
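
As a minimal sketch of the case-insensitive variant, assuming the same word list and extensions as above (the names RESERVED and names are only illustrative, not part of the original answer):

```python
import re

# Illustrative name; the alternatives and extensions are taken from the answer above.
# (?i) at the very start of the pattern makes the whole match case-insensitive.
RESERVED = re.compile(r"(?i)(?:CON|COM1|LPT1)\.(?:jpg|png)$")

names = ["con.JPG", "Lpt1.png", "COM1.html", ".con.jpg"]

# Pattern.match only succeeds at the start of the string, so ".con.jpg" is rejected.
matches = [n for n in names if RESERVED.match(n)]
print(matches)  # ['con.JPG', 'Lpt1.png']
```

Passing re.IGNORECASE as a flag to re.compile should behave the same way as the inline (?i), if you prefer to keep the pattern itself unchanged.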


5 Comments

I understand how this works but it doesn't answer my specific requirements. I have expanded my question to improve the context I gave initially. Let me know if you still think this adequately answers my question.
@RossSpencer - No. For your new requirements, I'd use Regex. See my edit. Is that what you want?
Thanks. The reason I asked for regex in the first place! :) I appreciate the revisit of your answer.
@RossSpencer - Happy to have helped! If you are all set, then I ask that you please don't forget to accept an answer. Otherwise, I'd be glad to help further.
+1, and for an obnoxious optimization: (?:CO(?:M1|N)|LPT1)\.(?:jp|pn)g$ :)
3

^ matches the start of a string, whereas \b matches at any word boundary, including the one between '.' and 'c' in '.con'. So:

^con

would work.
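
A rough sketch of the difference, assuming the two lines from the question arrive as one multi-line string (in which case re.MULTILINE is needed so that ^ also matches after each newline). The extra sample lines and the variable names are made up for illustration:

```python
import re

# The first two lines are from the question; the last two are illustrative additions.
text = ".con\ncon\nCOM1\n.LPT1"

# \b matches at any word boundary, including between "." and "c",
# so the "con" inside ".con" is found as well.
with_boundary = re.findall(r"\bcon", text)
print(with_boundary)  # ['con', 'con']

# ^ with re.MULTILINE anchors at the start of each line; the non-capturing
# group holds the alternative words.
anchored = re.compile(r"^(?:con|com1|lpt1)", re.IGNORECASE | re.MULTILINE)
found = anchored.findall(text)
print(found)  # ['con', 'COM1']
```

If each name is tested as its own string instead, re.match with the same group (and without re.MULTILINE) gives the same anchoring.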

1 Comment

How do I expand this to increase the word list, say CON|ABC|DCE ?
