I'm trying to write a regex pattern in Python to capture two groups, where the second group is optional, but I want the groups to remain distinct.

Here are examples of the patterns I want to match:

Case 1: 1.2.1 Mickey Mouse (3-400-1-Z)

  • Group 1: "Mickey Mouse"
  • Group 2: "3-400-1-Z"

There are also cases where the parentheses and their contents are missing:

Case 2: 1.2.1 Mickey Mouse

  • Group 1: "Mickey Mouse"
  • Group 2 should return None

Here's what I have for my current regex: 1\.2\.\d\s+(.*)(?:\s*\((\d+-\d+-\d+-[A-Z])\))

This creates my desired groups for Case 1, but it does not match Case 2 at all. Also, if I add a '?' to make the second group optional, my desired groups merge into one (i.e., Group 1 returns "Mickey Mouse (3-400-1-Z)" and Group 2 is empty).
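A minimal reproduction of the behavior described above may help. The variable names (`required`, `optional`, `case1`, `case2`) are made up for this sketch; the patterns and sample strings are the ones from the question.

```python
import re

# Pattern from the question: the parenthesized part is required.
required = re.compile(r"1\.2\.\d\s+(.*)(?:\s*\((\d+-\d+-\d+-[A-Z])\))")
# Same pattern with '?' appended so the trailing group is optional.
optional = re.compile(r"1\.2\.\d\s+(.*)(?:\s*\((\d+-\d+-\d+-[A-Z])\))?")

case1 = "1.2.1 Mickey Mouse (3-400-1-Z)"
case2 = "1.2.1 Mickey Mouse"

# Case 1 with the required group: two separate groups
# (group 1 keeps the space before the parenthesis).
print(required.search(case1).groups())

# Case 2 with the required group: no match at all.
print(required.search(case2))

# Case 1 with the optional group: the greedy (.*) takes the whole rest
# of the line, so the optional group matches nothing.
print(optional.search(case1).groups())
```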

So, how can I create a regex in Python to properly match these two groups while keeping group 2 optional and independent of group 1? Is there a way to make the optional group work correctly without merging into group 1?

1 Answer

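When you add a '?' to make the second group optional, group 1 still matches as much as possible because .* is greedy (see greedy vs. lazy quantifiers), so it swallows the parenthesized part and the optional group matches nothing. Changing group 1 to (.*?) makes it lazy, so it matches as little as possible.
Then add a $ so the match has to run to the end of the line; without it, the lazy group would stop immediately and capture an empty string.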

1\.2\.\d\s+(.*?)(?:\s*\((\d+-\d+-\d+-[A-Z])\))?$
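Here is a small sketch of that pattern against both cases from the question. The names `PATTERN` and `parse` are only for illustration.

```python
import re

# Lazy group 1, optional non-capturing wrapper around group 2, anchored with $.
PATTERN = re.compile(r"1\.2\.\d\s+(.*?)(?:\s*\((\d+-\d+-\d+-[A-Z])\))?$")

def parse(line):
    """Return (group 1, group 2) for a matching line, or None if no match."""
    m = PATTERN.search(line)
    return m.groups() if m else None

print(parse("1.2.1 Mickey Mouse (3-400-1-Z)"))  # ('Mickey Mouse', '3-400-1-Z')
print(parse("1.2.1 Mickey Mouse"))              # ('Mickey Mouse', None)
```

Group 2 comes back as None when the parentheses are missing, so a check like `if code is not None` is enough to tell the two cases apart.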

2 Comments

What about cases when the desired text runs onto a new line?
@user1142252 Use the re.DOTALL flag to allow . to match newlines.
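A rough sketch of the re.DOTALL suggestion, assuming each entry is handled as its own string (the sample text with a line break is invented for illustration). Note that without re.MULTILINE, $ only matches at the end of the whole string (or just before a final newline), so this form fits one entry per string rather than a whole document at once.

```python
import re

REGEX = r"1\.2\.\d\s+(.*?)(?:\s*\((\d+-\d+-\d+-[A-Z])\))?$"

# Hypothetical entry where the name wraps onto a second line.
wrapped = "1.2.1 Mickey\nMouse (3-400-1-Z)"

# Without DOTALL, '.' cannot cross the newline and '$' is not reached.
no_flag = re.search(REGEX, wrapped)

# With DOTALL, '.' also matches '\n', so group 1 can span both lines.
with_flag = re.search(REGEX, wrapped, re.DOTALL)

print(no_flag)
print(with_flag.groups())
```

Group 1 then contains the embedded newline, so it may need something like `" ".join(name.split())` afterwards if a single-line name is wanted.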
