Replace a string located between

Question

Here is my problem: I have a text variable that contains commas, and I want to delete only the commas located between two delimiters (in fact [ and ]). For example, given the following string:

input =  "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
output = "The sun shines, that's fine [not for everyone] and if it rains, it Will Be better."

I know how to use .replace on the whole variable, but I cannot apply it to only a part of it. There are some related topics on this site, but I did not manage to adapt them to my own question.

Matthew Moisen · Accepted Answer · 2018-06-16 23:24:15Z

33

import re
Variable = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
Variable1 = re.sub(r"\[[^]]*\]", lambda x: x.group(0).replace(',', ''), Variable)

First you need to find the parts of the string that need to be rewritten (you do this with re.sub). Then you rewrite those parts.

The call var1 = re.sub("re", fun, var) means: find all substrings in the variable var that match "re"; process each of them with the function fun; return the result, which is saved to the variable var1.

The regular expression "\[[^]]*\]" means: find substrings that start with [ (\[ in re), contain anything except ] ([^]]* in re) and end with ] (\] in re).

For every occurrence found, a function is run that converts this occurrence to something new. The function is:

lambda x: x.group(0).replace(',', '')

That means: take the string that was found (group(0)), replace ',' with '' (in other words, remove the commas) and return the result.
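For readability, the same idea can also be written with a named function instead of a lambda. This is only a minimal sketch, assuming Python 3; the names BRACKETED, strip_commas and remove_commas_in_brackets are made up for illustration and are not part of the re module:

```python
import re

# Matches "[", then any run of characters other than "]", then "]".
BRACKETED = re.compile(r"\[[^]]*\]")

def strip_commas(match):
    """Return the matched bracketed text with its commas removed."""
    return match.group(0).replace(",", "")

def remove_commas_in_brackets(text):
    """Remove commas only inside [...] sections of text."""
    # re.sub calls strip_commas once per match and splices in its return value.
    return BRACKETED.sub(strip_commas, text)

text = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
result = remove_commas_in_brackets(text)
print(result)
```

Commas outside the brackets are never passed to the callback, so they are left alone.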

edited Jun 16, 2018 at 23:24

Matthew Moisen


answered Jun 19, 2012 at 8:02

Igor Chubin



3 Comments

georg Over a year ago

@user1453786: with all due respect, @Qtax's answer is far better. Functional sub is a useful technique to know, but in this case it's clearly overkill.

Igor Chubin Over a year ago

@thg435: No, it is not, because it will not work for unbalanced brackets, for example. Qtax's lookahead assertion checks only the closing part of the expression, and that is wrong. Please try "not, for, everyone] and if it rains, it Will [a,c]," and you will see for yourself. Of course one can add a lookbehind assertion too, but then it will not be so simple anymore.

trinaldi Over a year ago

Great explanation for a great answer

Qtax · Accepted Answer · 2012-06-19 08:00:02Z

4

You can use an expression like this to match them (if the brackets are balanced):

,(?=[^][]*\])

Use it something like this (text is the input string):

re.sub(r",(?=[^][]*\])", "", text)
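To make the "if the brackets are balanced" condition concrete, here is a small sketch that runs the lookahead pattern next to a callback-based substitution on a balanced and an unbalanced input. It assumes Python 3; with_lookahead and with_callback are hypothetical helper names, and the unbalanced sample is the one quoted in the comments on this page:

```python
import re

def with_lookahead(text):
    # Remove a comma if only non-bracket characters separate it from the next "]".
    return re.sub(r",(?=[^][]*\])", "", text)

def with_callback(text):
    # Match whole "[...]" sections and strip commas inside each match.
    return re.sub(r"\[[^]]*\]", lambda m: m.group(0).replace(",", ""), text)

balanced = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
unbalanced = "not, for, everyone] and if it rains, it Will [a,c],"

print(with_lookahead(balanced))
print(with_callback(balanced))
print(with_lookahead(unbalanced))
print(with_callback(unbalanced))
```

On the balanced input both give the same result. On the unbalanced one, the lookahead version also removes the commas before the stray "]", because it never checks for an opening "[".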

answered Jun 19, 2012 at 8:00

Qtax


1 Comment

Igor Chubin Over a year ago

I like your answer, it is very clean and effective, but I think it has one disadvantage: it will not work for unbalanced brackets, since you check only the closing part of the expression. It would be really great to solve this task with lookahead/lookbehind assertions, but I'm not sure it would be as elegant as it is now.

fraxel · Accepted Answer · 2012-06-19 08:31:45Z

0

Here is a non-regex method. You can replace your [] delimiters with, say, [/ and /], and then split on the / delimiter. Every odd-indexed string in the split list then needs its commas removed, which can be done while rebuilding the string in a generator expression:

>>> Variable = "The sun shines, that's fine [not, for, everyone] and if it rains, it Will Be better."
>>> chunks = Variable.replace('[','[/').replace(']','/]').split('/')
>>> ''.join(sen.replace(',','') if i%2 else sen for i, sen in enumerate(chunks))
"The sun shines, that's fine [not for everyone] and if it rains, it Will Be better."
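One caveat with splitting on / is that the text itself must not contain that character. A possible variation, sketched here under the assumptions that the brackets are balanced and not nested, uses a sentinel character and refuses input that already contains it. The function name remove_commas_between and the choice of "\x00" as sentinel are illustrative only:

```python
def remove_commas_between(text, sentinel="\x00"):
    """Remove commas inside [...] sections using the split trick.

    Assumes balanced, non-nested brackets and that sentinel
    does not occur anywhere in text.
    """
    if sentinel in text:
        raise ValueError("sentinel character already present in text")
    # Mark the inside edge of every bracket, then split on the marks:
    # odd-indexed chunks are the parts that were inside brackets.
    marked = text.replace("[", "[" + sentinel).replace("]", sentinel + "]")
    chunks = marked.split(sentinel)
    return "".join(
        chunk.replace(",", "") if i % 2 else chunk
        for i, chunk in enumerate(chunks)
    )

print(remove_commas_between("a/b, c [d, e] f, g"))
```

Because the sentinel is dropped again by the join, the brackets and any / characters in the text come through unchanged.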

answered Jun 19, 2012 at 8:31

fraxel


Comments

Pascal Bugnion · Accepted Answer · 2012-06-20 15:57:00Z

-1

If you don't fancy learning regular expressions (see other responses on this page), you can use the string method partition.

sentence = "the quick, brown [fox, jumped , over] the lazy dog"
left, bracket, rest = sentence.partition("[")
block, bracket, right = rest.partition("]")

"block" is now the part of the string in between the brackets, "left" is what was to the left of the opening bracket and "right" is what was to the right of the closing bracket.

You can then recover the full sentence with:

new_sentence = left + "[" + block.replace(",","") + "]" + right
print(new_sentence)  # the quick, brown [fox jumped  over] the lazy dog

If you have more than one block, you can put all of this in a loop, applying partition to "right" at every step.
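The loop could look roughly like the sketch below. It assumes Python 3 and non-nested brackets; the function name remove_commas_in_blocks is made up for illustration, and leaving an unmatched "[" untouched is just one possible choice:

```python
def remove_commas_in_blocks(sentence):
    """Remove commas inside every [...] block using str.partition."""
    pieces = []
    rest = sentence
    while True:
        # Everything before the next "[" is copied unchanged.
        left, opening, rest = rest.partition("[")
        pieces.append(left)
        if not opening:
            break  # no more opening brackets
        block, closing, rest = rest.partition("]")
        if not closing:
            # Unmatched "[": keep the remainder as it is.
            pieces.append(opening + block)
            break
        pieces.append(opening + block.replace(",", "") + closing)
    return "".join(pieces)

print(remove_commas_in_blocks("a,b [c,d] e,f [g,e] h,i"))
```

Each pass through the loop consumes one bracketed block, so the string is scanned only once from left to right.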

Or you could learn regular expressions! It will be worth it in the long run.

edited Jun 20, 2012 at 15:57

answered Jun 19, 2012 at 8:16

Pascal Bugnion


4 Comments

Igor Chubin Over a year ago

This method will not work when you have more than one "special" section in the string (e.g.: "a,b [c,d] e,f [g,e] h,i")

Martijn Pieters Over a year ago

Don't call variables string; it'll confuse the hell out of developers that expect it to be the string module from the python stdlib.

Martijn Pieters Over a year ago

Also, why not use .split('[', 1) here, you are tossing the brackets anyway. And the second .partition call should be on rest, not on string, so this code won't work at all. string ends up as "the quick, brown [the quick brown [fox jumped over] the lazy dog".

Pascal Bugnion Over a year ago

Edited to remove typo and change variable names. Thanks for the feedback.
