19

Here is my problem: I have a string variable that contains commas, and I want to delete only the commas located between two delimiters (in fact [ and ]). For example, given the following string:

input =  "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
output = "The sun shines, that's fine [not for everyone] and if it rains, it Will Be better."

I know how to use .replace on the whole variable, but I cannot apply it to only a part of it. There are some related topics on this site, but I did not manage to adapt them to my own question.


4 Answers

33
import re
Variable = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
Variable1 = re.sub(r"\[[^]]*\]", lambda x: x.group(0).replace(',', ''), Variable)

First you need to find the parts of the string that need to be rewritten (you do this with re.sub). Then you rewrite those parts.

The call var1 = re.sub("re", fun, var) means: find all substrings in the variable var that match "re"; process each of them with the function fun; return the result, which is saved to the variable var1.

The regular expression "\[[^]]*\]" means: find substrings that start with [ (\[ in re), contain anything except ] ([^]]* in re) and end with ] (\] in re).

For every occurrence found, run a function that converts this occurrence to something new. The function is:

lambda x: x.group(0).replace(',', '')

That means: take the string that was found (x.group(0)), replace ',' with '' (in other words, remove the commas) and return the result.
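For readers who find the lambda hard to follow, here is the same idea with a named function, as a minimal sketch. The names strip_commas, text and result are my own and not part of the answer above.

```python
import re

def strip_commas(match):
    # match.group(0) is the whole bracketed substring,
    # e.g. "[not, for, everyone]"
    return match.group(0).replace(",", "")

text = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."

# re.sub calls strip_commas once per match and splices the
# returned string back into the result
result = re.sub(r"\[[^]]*\]", strip_commas, text)
print(result)
```

Only the text inside [ ... ] is passed to the function, so commas elsewhere in the sentence are left alone.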


3 Comments

@user1453786: with all due respect, @Qtax answer is far better. Functional sub is a useful technique to know, but in this case it's clearly overkill.
@thg435: No, it is not. Because it will not work, for example, for unbalanced brackets. Qtax checks with the lookahead assertion only finishing part of the expression and that is wrong. Please try "not, for, everyone] and if it rains, it Will [a,c]," and you will see it yourself. Of course one can add lookbehind assertion too, but that will be not so simple anymore
Great explanation for a great answer
4

You can use an expression like this to match them (if the brackets are balanced):

,(?=[^][]*\])

Use it something like this:

re.sub(r",(?=[^][]*\])", "", str)
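To see what the lookahead does and does not match, here is a small check, offered as a sketch. The variable names are mine; the second input is the unbalanced string suggested in the comments on the other answer.

```python
import re

# A comma qualifies if, reading forward, a "]" is reached
# before any other "[" or "]"
pattern = re.compile(r",(?=[^][]*\])")

balanced = "a,b [c,d] e,f [g,e] h,i"
balanced_out = pattern.sub("", balanced)
print(balanced_out)    # a,b [cd] e,f [ge] h,i

unbalanced = "not, for, everyone] and if it rains, it Will [a,c],"
unbalanced_out = pattern.sub("", unbalanced)
print(unbalanced_out)  # not for everyone] and if it rains, it Will [ac],
```

With balanced brackets only the commas inside [ ... ] are removed. With a stray ] the commas before it are removed as well, because the pattern only looks forward for a closing bracket and never checks for an opening one.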

1 Comment

I like your answer, it is very clean and effective but I think it has one disadvantage. I think that will not work for unbalanced brackets. You check only the finishing part of the expression. It would be really super to solve this task with lookahead/lookbehind assertions but I'm not sure that that will be so elegant as now.
0

Here is a non-regex method. You can replace your [] delimiters with say [/ and /], and then split on the / delimiter. Then every odd string in the split list needs to be processed for comma removal, which can be done while rebuilding the string in a list comprehension:

>>> Variable = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
>>> chunks = Variable.replace('[','[/').replace(']','/]').split('/')
>>> ''.join(sen.replace(',','') if i%2 else sen for i, sen in enumerate(chunks))
"The sun shines, that's fine [not for everyone] and if it rains, it Will Be better."

Comments

-1

If you don't fancy learning regular expressions (see other responses on this page), you can use the partition command.

sentence = "the quick, brown [fox, jumped , over] the lazy dog"
left, bracket, rest = sentence.partition("[")
block, bracket, right = rest.partition("]")

"block" is now the part of the string in between the brackets, "left" is what was to the left of the opening bracket and "right" is what was to the right of the closing bracket.

You can then recover the full sentence with:

new_sentence = left + "[" + block.replace(",","") + "]" + right
print(new_sentence)  # the quick, brown [fox jumped  over] the lazy dog

If you have more than one block, you can put this all in a for loop, applying the partition command to "right" at every step.
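A loop over several blocks could look roughly like the following. This is a sketch only: the function name remove_commas_in_brackets is made up for illustration, and leaving an unmatched "[" untouched is my assumption about the desired behaviour.

```python
def remove_commas_in_brackets(sentence):
    pieces = []
    rest = sentence
    while True:
        # Everything before the next "[" is kept as-is
        left, opening, rest = rest.partition("[")
        pieces.append(left)
        if not opening:
            break  # no more opening brackets
        block, closing, rest = rest.partition("]")
        if not closing:
            # Assumption: an unmatched "[" leaves the tail unchanged
            pieces.append(opening + block)
            break
        # Strip commas only inside the bracketed block
        pieces.append(opening + block.replace(",", "") + closing)
    return "".join(pieces)

print(remove_commas_in_brackets("a,b [c,d] e,f [g,e] h,i"))  # a,b [cd] e,f [ge] h,i
```

Each pass consumes one bracketed block and then continues on whatever is left to the right of it.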

Or you could learn regular expressions! It will be worth it in the long run.

4 Comments

This method will not work when you have more than one "special" section in the string (e.g.: "a,b [c,d] e,f [g,e] h,i")
Don't call variables string; it'll confuse the hell out of developers that expect it to be the string module from the python stdlib.
Also, why not use .split('[', 1) here, you are tossing the brackets anyway. And the second .partition call should be on rest, not on string, so this code won't work at all. string ends up as "the quick, brown [the quick brown [fox jumped over] the lazy dog".
Edited to remove typo and change variable names. Thanks for the feedback.
