Edit(Replace/Add) strings based on index

Question

Consider the following string,

a = """Dear Sir or Madam,

I am writting to you about the show. I was very disappointed after this show. I would like to have my money back. At first the show started at 10.15, and it should be at 19.30.

After your show I wanted to visit my friends, and because of it, I didn't do it.
"""

The string has to be edited based on the list of lists given below.

li = [[25, 33, 'writing'],
    [87, 91, 'the'],
    [134, 142, 'First'],
    [184, 186, 'have started'],
    [265, 271, "couldn't"]]

Here, each inner list corresponds to a single change in the string. The first element is the start index and the second element is the end index (exclusive, as in slicing) of the substring that has to be replaced by the third element.

For example, a[25:33] gives writting, which has to be replaced by writing; a[87:91] gives this, which has to be replaced by the; and similarly for all the other lists.
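
One way to sanity-check a list like this before applying it is to print the slice each entry points at. The snippet below is only a sketch: it repeats the string and list from above and assumes the end index is exclusive, as in Python slicing.

```python
a = """Dear Sir or Madam,

I am writting to you about the show. I was very disappointed after this show. I would like to have my money back. At first the show started at 10.15, and it should be at 19.30.

After your show I wanted to visit my friends, and because of it, I didn't do it.
"""

li = [[25, 33, 'writing'],
      [87, 91, 'the'],
      [134, 142, 'First'],
      [184, 186, 'have started'],
      [265, 271, "couldn't"]]

# Collect the text each [start, end) range currently covers.
targets = [a[start:end] for start, end, _ in li]

for (start, end, replacement), target in zip(li, targets):
    print(f"a[{start}:{end}] = {target!r} -> {replacement!r}")
```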

The expected output for the given example is:

"""Dear Sir or Madam,

I am writing to you about the show. I was very disappointed after the show. I would like to have my money back. First the show started at 10.15, and it should have started at 19.30.

After your show I wanted to visit my friends, and because of it, I couldn't do it.
"""

kaya3 · Accepted Answer · 2020-01-29 11:14:10Z

4

To avoid problems where one substitution shifts the indices of another, do the substitutions in reverse order: each substitution only changes the indices after it, so the ones still to be done are unaffected. To change only the required position, use slicing instead of str.replace.

for start, end, replacement in reversed(li):
    a = a[:start] + replacement + a[end:]

print(a)

Output:

Dear Sir or Madam,

I am writing to you about the show. I was very disappointed after the show. I would like to have my money back. First the show started at 10.15, and it should have started at 19.30.

After your show I wanted to visit my friends, and because of it, I couldn't do it.

This works as long as the entries in li are given in ascending order of index; if they are not, use sorted(li, reverse=True) instead of reversed(li).
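
The approach above can be wrapped in a small helper. This is a sketch rather than part of the original answer: the name apply_edits and the sample sentence are made up for illustration, and it assumes the ranges refer to the original text and do not overlap.

```python
def apply_edits(text, edits):
    """Replace text[start:end] with replacement for each (start, end, replacement).

    Hypothetical helper: assumes the ranges refer to the original text
    and do not overlap.
    """
    # Work from the highest start index down, so each edit leaves the
    # indices of the edits still to come untouched.
    for start, end, replacement in sorted(edits, reverse=True):
        text = text[:start] + replacement + text[end:]
    return text


# Made-up example: the same word appears twice, and the edits are unsorted.
text = "the cat saw the dog"
edits = [[12, 15, 'a'], [4, 7, 'bird']]
result = apply_edits(text, edits)
print(result)
```

Sorting inside the helper means the caller does not have to remember the ordering requirement, at the cost of one sort per call.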

edited Jan 29, 2020 at 11:14

answered Jan 29, 2020 at 11:08



3 Comments

Nischal Sanil Over a year ago

But doesn't this affect the index? This method combined with @Valentinm's offset approach should do it, I guess.

kaya3 Over a year ago

@NischalSanil Not sure what you mean by that; the result matches your expected result exactly, anyway. The reason for iterating in reverse is so you don't have to recalculate any indices using an offset.

Nischal Sanil Over a year ago

Oh sorry, I didn't check it properly.

Valentin M. · Accepted Answer · 2020-01-29 10:55:39Z

1

This should do the trick:

a = """Dear Sir or Madam,

I am writting to you about the show. I was very disappointed after this show. I would like to have my money back. At first the show started at 10.15, and it should be at 19.30.

After your show I wanted to visit my friends, and because of it, I didn't do it.
"""

li = [[25, 33, 'writing'],
    [87, 91, 'the'],
    [134, 142, 'First'],
    [184, 186, 'have started'],
    [265, 271, "couldn't"]]

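# offset counts how many characters shorter the string has become so far
offset = 0

for start, end, replacement in li:
    # Shift the original indices by the offset, then replace the first
    # occurrence of that substring
    a = a.replace(a[start - offset:end - offset], replacement, 1)
    offset += (end - start) - len(replacement)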

print(a)

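The change of index caused by each replacement is handled by the offset variable, and the 1 at the end of a.replace(..., 1) makes sure that only the first occurrence of the text you want to replace is affected. Note that this is the first occurrence in the whole string, so it is the intended one only if the same text does not appear earlier in the string.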
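
If the same text may appear more than once, one possible variant keeps the offset idea but swaps str.replace for slicing, so the edit always lands at the given position. This is a sketch, not the answer's original code: apply_edits_forward is a hypothetical name, shift has the opposite sign to offset above, and the edits are assumed to be sorted by start index and non-overlapping.

```python
def apply_edits_forward(text, edits):
    """Apply (start, end, replacement) edits left to right.

    Hypothetical helper: assumes edits are sorted by start index, do not
    overlap, and use indices into the original text.
    """
    shift = 0  # net change in length caused by the edits applied so far
    for start, end, replacement in edits:
        # Translate original indices into indices of the current text.
        text = text[:start + shift] + replacement + text[end + shift:]
        shift += len(replacement) - (end - start)
    return text


# Made-up example where the word to change is not its first occurrence.
result = apply_edits_forward("the cat saw the dog", [[4, 7, 'bird'], [12, 15, 'a']])
print(result)
```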

edited Jan 29, 2020 at 10:55

answered Jan 29, 2020 at 10:50


1 Comment

Nischal Sanil Over a year ago

That's a very good trick using an offset, thanks a lot for sharing. But, as you mentioned, this snippet only changes the first match/occurrence of the text, and I have a lot of cases that do not meet that criterion.

Ahmet · Accepted Answer · 2020-01-29 11:08:28Z

1

This should do it. Note that this will only work if the list of replacements is sorted in ascending order of start index.

a = """Dear Sir or Madam,

I am writting to you about the show. I was very disappointed after this show. I would like to have my money back. At first the show started at 10.15, and it should be at 19.30.

After your show I wanted to visit my friends, and because of it, I didn't do it.
"""

li = [[25, 33, 'writing'],
    [87, 91, 'the'],
    [134, 142, 'First'],
    [184, 186, 'have started'],
    [265, 271, "couldn't"]]


# Apply the last change first so the earlier indices stay valid
for start, end, replacement in reversed(li):
    a = a.replace(a[start:end], replacement, 1)

print(a)
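
A caveat worth checking before relying on str.replace here: with a count of 1 it replaces the first occurrence of the substring anywhere in the string, not necessarily the one at the given index. A small made-up example (not from the question) shows the difference from slicing:

```python
text = "the cat saw the dog"
start, end, replacement = 12, 15, 'a'   # points at the second "the"

# str.replace searches from the beginning, so the first "the" is changed.
with_replace = text.replace(text[start:end], replacement, 1)

# Slicing touches only the characters at the given position.
with_slicing = text[:start] + replacement + text[end:]

print(with_replace)
print(with_slicing)
```

In the question's sample text every target happens to be the first occurrence of its substring, which is why the code above still produces the expected output there.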

edited Jan 29, 2020 at 11:08

answered Jan 29, 2020 at 10:44


5 Comments

kaya3 Over a year ago

To fix the "indices change" issue, just do the changes in reverse order: for ele in reversed(li):. This works as long as the list of changes is in sorted order, because each substitution only messes up the indices after it - not before it.

Ahmet Over a year ago

@kaya3 Very good idea, but for some reason I can't work out, it doesn't work. It adds an extra "have started" somewhere, seemingly at random, while the rest works. I couldn't figure out why.

kaya3 Over a year ago

That's presumably because you're using replace which replaces all occurrences of that substring, instead of just the one you want to replace. You can use just slicing and +, without replace, to get the desired result.

Ahmet Over a year ago

@kaya3 that's it, nice one!

Nischal Sanil Over a year ago

@Ahmet Sorry, I didn't observe your answer before. Thanks a lot for helping out.

Kelly Bundy · Accepted Answer · 2020-01-30 02:07:12Z

1

A linear-time alternative that joins the parts to keep with the replacements. A part to keep starts where the previous replacement stops and stops where the current replacement starts.

none = [[None, None, '']]
a = ''.join(a[prev[1]:curr[0]] + curr[2]
            for prev, curr in zip(none + li, li + none))

Same thing, different names:

none = [[None, None, '']]
a = ''.join(a[start:stop] + replacement
            for (_, start, _), (stop, _, replacement) in zip(none + li, li + none))
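
For readers who find the zip pairing dense, the same idea can be written as an explicit loop. This is a sketch under the same assumptions (edits sorted by start index and non-overlapping); the function name apply_edits_linear and the sample sentence are made up for illustration.

```python
def apply_edits_linear(text, edits):
    """Build the result from kept pieces and replacements in one pass.

    Hypothetical helper: assumes edits are sorted by start index and
    do not overlap.
    """
    parts = []
    prev_end = 0  # where the previous replaced range stopped
    for start, end, replacement in edits:
        parts.append(text[prev_end:start])  # untouched text before this edit
        parts.append(replacement)
        prev_end = end
    parts.append(text[prev_end:])  # untouched tail after the last edit
    # A single join avoids rebuilding the whole string for every edit.
    return ''.join(parts)


result = apply_edits_linear("the cat saw the dog", [[4, 7, 'bird'], [12, 15, 'a']])
print(result)
```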

edited Jan 30, 2020 at 2:07

answered Jan 30, 2020 at 1:59

