Capture a list of integers in a string using regex python

Question

I am trying to capture a List[int] (a list of integers, possibly separated by commas) from a string. However, I am not getting the expected result.

>>> txt = '''Automatic face localisation is the prerequisite step of 
facial image analysis for many applications such as facial attribute 
(e.g. expression [64] and age [38]) and facial identity
recognition [45, 31, 55, 11]. A narrow deﬁnition of face localisation 
may refer to traditional face detection [53, 62], '''

output

>>> re.findall(r'[(\b\d{1,3}\b,)+]',txt)
['(', '6', '4', '3', '8', ')', '4', '5', ',', '3', '1', ',', '5', '5', ',', '1', '1', '5', '3', ',', '6', '2', ',']

What expression should I use to capture the output below?

Expected output:

['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']

score 2 · Accepted Answer · 2020-07-08 18:01:55Z

You may try:

\[[\d, ]*?]

Explanation of the above regex:

\[ matches a literal [
[\d, ]*? lazily matches zero or more digits, commas or spaces
] matches a literal ]

Sample Implementation in python

import re

regex = r"\[[\d, ]*?]"

test_str = ("Automatic face localisation is the prerequisite step of facial image analysis for many applications such as facial attribute (e.g. expression [64] and age [38]) and facial identity\n"
    "recognition [45, 31, 55, 11]. A narrow deﬁnition of face localisation may refer to traditional face detection [53, 62]")

print(re.findall(regex, test_str))
# Outputs: ['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']
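
One caveat worth checking: because the character class may match zero characters, this pattern also accepts empty or comma-only brackets. The sketch below compares it against a stricter variant; the `strict` pattern and the `sample` string are illustrative assumptions, not part of the answer above.

```python
import re

loose = r"\[[\d, ]*?]"           # pattern from this answer
strict = r"\[\d+(?:, *\d+)*]"    # assumed stricter variant: at least one number required

# Hypothetical input containing an empty and a comma-only bracket pair
sample = "see [12, 7], an empty [] and a stray [, ]"

loose_matches = re.findall(loose, sample)
strict_matches = re.findall(strict, sample)

print(loose_matches)   # ['[12, 7]', '[]', '[, ]']
print(strict_matches)  # ['[12, 7]']
```

If the input never contains such brackets, the simpler pattern is probably fine.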

edited Jul 8, 2020 at 18:01

answered Jul 8, 2020 at 17:45

user7571182

The fourth bird · Accepted Answer · 2020-07-08 20:25:36Z

You can match 1-3 digits. Then repeat 0+ times matching a comma, 0+ spaces and again 1-3 digits.

\[\d{1,3}(?:, *\d{1,3})*]

\[ Match [
\d{1,3} Match 1-3 digits
(?: Non-capturing group
  , *\d{1,3} Match a comma, 0+ spaces and 1-3 digits
)* Close the group and repeat it 0+ times
] Match ]
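
The same breakdown can be kept next to the pattern itself with `re.VERBOSE`. This is only a sketch: in verbose mode bare spaces are ignored, so the space is written as `[ ]` here, and the `sample` string is a made-up input.

```python
import re

pattern = re.compile(r"""
    \[              # literal opening bracket
    \d{1,3}         # first number: 1-3 digits
    (?:             # non-capturing group for each further number
        ,[ ]*       #   a comma, then 0+ spaces (in a class, as VERBOSE skips bare spaces)
        \d{1,3}     #   1-3 digits
    )*              # close the group and repeat it 0+ times
    ]               # literal closing bracket
""", re.VERBOSE)

sample = "age [38] and recognition [45, 31, 55, 11]"  # hypothetical input
found = pattern.findall(sample)
print(found)  # ['[38]', '[45, 31, 55, 11]']
```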

Example

import re

txt = '''Automatic face localisation is the prerequisite step of facial image analysis for many applications such as facial attribute (e.g. expression [64] and age [38]) and facial identity
recognition [45, 31, 55, 11]. A narrow deﬁnition of face localisation may refer to traditional face detection [53, 62],
'''

print(re.findall(r'\[\d{1,3}(?:, *\d{1,3})*]', txt))

Output

['[64]', '[38]', '[45, 31, 55, 11]', '[53, 62]']
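
If actual lists of integers are needed rather than the bracketed strings, one possible follow-up is to pull the digits out of each match. This is a sketch, not part of the original pattern; the shortened `txt` and the name `int_lists` are illustrative.

```python
import re

pattern = re.compile(r'\[\d{1,3}(?:, *\d{1,3})*]')

# Shortened stand-in for the text in the question
txt = "expression [64] and age [38]; recognition [45, 31, 55, 11]; detection [53, 62]"

# For every bracketed match, extract its digit runs and convert them to int
int_lists = [[int(n) for n in re.findall(r'\d+', m)] for m in pattern.findall(txt)]
print(int_lists)  # [[64], [38], [45, 31, 55, 11], [53, 62]]
```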

If there can be more digits and spaces on all sides, including continuing the sequence on a newline:

\[\s*\d+(?:\s*,\s*\d+)*\s*]
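
A quick check of this looser pattern, using a made-up `sample` string with padded brackets, a list wrapped over a newline, and a non-numeric list that should be skipped:

```python
import re

pattern = re.compile(r'\[\s*\d+(?:\s*,\s*\d+)*\s*]')

# Hypothetical input: extra spaces, more than 3 digits, and a line break inside a list
sample = "ids [ 1024,7 ] and a wrapped list [45, 31,\n55, 11] but not [a, 1]"

matches = pattern.findall(sample)
print(matches)  # ['[ 1024,7 ]', '[45, 31,\n55, 11]']
```

Note that `\s` also matches tabs and other whitespace, so this variant is more permissive than the first pattern.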
