2

I have a string:

line = "[kossor],(blommor),{skor},kossor,blommor,skor"

I want to write a pattern that matches the characters ()[] and {} and the words inside, like this:

['[kossor]', '(blommor)', '{skor}']

I used this method:

import re

ligne = "[kossor],(blommor),{skor},kossor,blommor,skor"
pattern = r"\(([^\)]+)\)"
ANSWER = re.findall(pattern, ligne)

I got this:

["blommor"]

Any ideas? Thanks!

1
  • Does the input line always have matching () [] {}? Or can it contain something like (blommor]? Also, are all strings separated by commas? For the given sample input/output, even a simple re.findall(r'[\[({][^,]+', line) would do. Commented Mar 5, 2017 at 14:11

4 Answers

5

You may use this pattern:

pattern = r"([\[\(\{].*?[\]\)\}])"

Code

import re
pattern = r"([\[\(\{].*?[\]\)\}])"
ligne = "[kossor],(blommor),{skor},kossor,blommor,skor"
re.findall(pattern,ligne)

Output

 ['[kossor]', '(blommor)', '{skor}']
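
As a quick sanity check (a sketch, not part of the original answer), the same pattern can be run on the input suggested in the comments below, which has an unbracketed item in the middle. The variable names `sample` and `mismatched` are only illustrative. Note that this pattern does not check that the opening and closing brackets are of the same kind.

```python
import re

pattern = r"([\[\(\{].*?[\]\)\}])"

# Input with an unbracketed item between the bracketed ones.
sample = "[kossor],(blommor), chicken, {skor},"
matches = re.findall(pattern, sample)
print(matches)  # ['[kossor]', '(blommor)', '{skor}']

# The lazy .*? stops at the first closing bracket of any kind,
# so mismatched pairs are matched as well.
mismatched = re.findall(pattern, "[abc),(abc]")
print(mismatched)  # ['[abc)', '(abc]']
```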

4 Comments

That doesn't work. If you inject a non-matching string between any of the items you are trying to match, that regex breaks. Try [kossor],(blommor), chicken, {skor}, and I don't think it will work.
@idjaw I updated the code to work with non-matching strings in between. Thanks.
One last thing. Your output now should be: ['[kossor]', '(blommor)', '{skor}']
@idjaw That was a typo. The output is now copy-pasted from the console. Thanks for pointing it out.
3

Suppose we want to be strict: we want to match [abc] and (abc), but not ill-formed things such as [abc). We can use a regex like this:

pattern = r'\([^)]+\)|\[[^]]+\]|{[^}]+}'

In other words: match (...), or [...], or {...}, but do not match strings with mismatched bracket types.
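
To make the three alternatives easier to see, here is the same idea written with re.VERBOSE. This is only a sketch; the name `strict` is made up for illustration, and the braces are escaped here for readability even though Python accepts them unescaped in this position.

```python
import re

# Each alternative requires the same bracket type at both ends.
strict = re.compile(r"""
    \( [^)]+  \)    # round brackets with at least one character inside
  | \[ [^\]]+ \]    # square brackets
  | \{ [^}]+  \}    # curly brackets
""", re.VERBOSE)

ligne = "[kossor],(blommor),{skor},kossor,blommor,skor"
found = strict.findall(ligne)
print(found)  # ['[kossor]', '(blommor)', '{skor}']

# A lone mismatched pair is not matched.
print(strict.findall("[abc)"))  # []
```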

This may lead to unwanted results. For example:

ligne = "[kossor],(blommor),{skor},kossor,blommor,skor,[abc),(abc]"
print(re.findall(pattern, ligne))

Result:

['[kossor]', '(blommor)', '{skor}', '[abc),(abc]']

Whether you want to capture such results or not depends on your data and purpose. You could add a comma to each negated character class (for example [^),]) so that a match stops if it hits a comma inside the brackets:

pattern = r'\([^),]+\)|\[[^],]+\]|{[^},]+}'
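
Running the comma-excluding pattern on the extended input from the example above (a small sketch, assuming items never legitimately contain a comma inside the brackets):

```python
import re

pattern = r'\([^),]+\)|\[[^],]+\]|{[^},]+}'
ligne = "[kossor],(blommor),{skor},kossor,blommor,skor,[abc),(abc]"

# The comma between "[abc)" and "(abc]" now ends the attempt,
# so the mismatched tail is no longer captured.
result = re.findall(pattern, ligne)
print(result)  # ['[kossor]', '(blommor)', '{skor}']
```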


1

Use the following regex (character classes):

In [10]: re.findall(r'[\[({][^\]})]+[\]}\)]', line)
Out[10]: ['[kossor]', '(blommor)', '{skor}']
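
Note that this regex accepts any opening bracket followed by any closing bracket. If mismatched pairs such as [abc) can occur in your data, one possible approach is to filter the matches afterwards. This is a sketch; `PAIRS` and `balanced_matches` are hypothetical names, not part of the answer above.

```python
import re

# Maps each opening bracket to the closing bracket it must pair with.
PAIRS = {"[": "]", "(": ")", "{": "}"}

def balanced_matches(text):
    """Return bracketed items whose first and last characters pair up."""
    candidates = re.findall(r'[\[({][^\]})]+[\]}\)]', text)
    return [m for m in candidates if PAIRS[m[0]] == m[-1]]

line = "[kossor],(blommor),{skor},[abc),(abc]"
print(balanced_matches(line))  # ['[kossor]', '(blommor)', '{skor}']
```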


0

Use sub and split:

In [23]: re.sub(r'(?<=\})(.*$)', '', line).split(',')
Out[23]: ['[kossor]', '(blommor)', '{skor}']
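
Keep in mind that this relies on the layout of the sample: it drops everything after the first } and splits what is left on commas, so it only works when all bracketed items come first and the {...} item is the last of them. A short sketch showing both cases (the `reordered` string is a made-up example):

```python
import re

line = "[kossor],(blommor),{skor},kossor,blommor,skor"
kept = re.sub(r'(?<=\})(.*$)', '', line).split(',')
print(kept)  # ['[kossor]', '(blommor)', '{skor}']

# With an unbracketed item in front, that item survives the split.
reordered = "kossor,[kossor],(blommor),{skor},skor"
kept_reordered = re.sub(r'(?<=\})(.*$)', '', reordered).split(',')
print(kept_reordered)  # ['kossor', '[kossor]', '(blommor)', '{skor}']
```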

