2

I am trying to capture every List[int] (a bracketed list of integers, possibly separated by commas) in a string. However, I am not getting the expected result.

>>> txt = '''Automatic face localisation is the prerequisite step of 
facial image analysis for many applications such as facial attribute 
(e.g. expression [64] and age [38]) and facial identity
recognition [45, 31, 55, 11]. A narrow definition of face localisation 
may refer to traditional face detection [53, 62], '''

My attempt and its output:

>>> re.findall(r'[(\b\d{1,3}\b,)+]',txt)
['(', '6', '4', '3', '8', ')', '4', '5', ',', '3', '1', ',', '5', '5', ',', '1', '1', '5', '3', ',', '6', '2', ',']

What expression would capture the output below?

Expected output:

['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']

2 Answers

2

You may try:

\[[\d, ]*?]

Explanation of the above regex:

  • \[ Match a literal [
  • [\d, ]*? Match digits, commas and spaces, as few times as possible
  • ] Match a literal ]

Sample implementation in Python

import re

regex = r"\[[\d, ]*?]"

test_str = ("Automatic face localisation is the prerequisite step of facial image analysis for many applications such as facial attribute (e.g. expression [64] and age [38]) and facial identity\n"
    "recognition [45, 31, 55, 11]. A narrow definition of face localisation may refer to traditional face detection [53, 62]")

print(re.findall(regex, test_str))
# Outputs: ['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']
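One thing to be aware of: because every character inside the brackets is optional, this pattern also accepts brackets that contain no digits at all. The sketch below compares it with a stricter pattern that requires at least one number; the `sample` string and the variable names are made up for illustration.

```python
import re

# Permissive: any run of digits, commas and spaces between brackets.
loose = r"\[[\d, ]*?]"
# Stricter (illustrative): at least one 1-3 digit number, then ", number" repeats.
strict = r"\[\d{1,3}(?:, *\d{1,3})*]"

# Hypothetical input containing empty and malformed brackets.
sample = "see [] and [, ] but also [12, 7]"

loose_hits = re.findall(loose, sample)
strict_hits = re.findall(strict, sample)

print(loose_hits)   # ['[]', '[, ]', '[12, 7]']
print(strict_hits)  # ['[12, 7]']
```

If the text is known to contain only well-formed citation lists, the permissive pattern is sufficient; otherwise the stricter form avoids false positives.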



2

You can match an opening bracket followed by 1-3 digits, then repeat 0+ times a comma, 0+ spaces and another 1-3 digits, and finally match the closing bracket.

\[\d{1,3}(?:, *\d{1,3})*]
  • \[ Match [
  • \d{1,3} Match 1-3 digits
  • (?: Non capture group
    • , *\d{1,3}
  • )* Close the group and repeat it 0+ times
  • ] Match ]
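The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE. This is only a sketch: the `sample` string is invented, and the space is written as `[ ]` because verbose mode ignores bare whitespace in the pattern.

```python
import re

pattern = re.compile(r"""
    \[              # literal opening bracket
    \d{1,3}         # first number: 1-3 digits
    (?:             # non-capturing group
        ,[ ]*       # a comma, then 0+ spaces
        \d{1,3}     # next number: 1-3 digits
    )*              # repeat the group 0+ times
    ]               # literal closing bracket
""", re.VERBOSE)

# Hypothetical input for illustration.
sample = "recognition [45, 31, 55, 11] and detection [7]"
found = pattern.findall(sample)
print(found)  # ['[45, 31, 55, 11]', '[7]']
```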


Example

import re

txt = '''Automatic face localisation is the prerequisite step of facial image analysis for many applications such as facial attribute (e.g. expression [64] and age [38]) and facial identity
recognition [45, 31, 55, 11]. A narrow definition of face localisation may refer to traditional face detection [53, 62],
'''

print(re.findall(r'\[\d{1,3}(?:, *\d{1,3})*]', txt))

Output

['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']

If there can be more digits and spaces on all sides, including continuing the sequence on a newline:

\[\s*\d+(?:\s*,\s*\d+)*\s*]
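A minimal sketch of this more tolerant pattern in use, which also converts each match into an actual list of integers. The `sample` text is hypothetical, and the second `re.findall(r"\d+", ...)` pass is just one possible way to extract the numbers.

```python
import re

# Hypothetical input: one list wraps across a line break, the other has uneven spacing.
sample = "recognition [45, 31,\n55 , 11] and detection [ 53,62 ]"

pattern = re.compile(r"\[\s*\d+(?:\s*,\s*\d+)*\s*]")

matches = pattern.findall(sample)

# Pull the digit runs out of each bracketed match and convert them to int.
int_lists = [[int(n) for n in re.findall(r"\d+", m)] for m in matches]

print(int_lists)  # [[45, 31, 55, 11], [53, 62]]
```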

