I'm looking at a statement that looks like this:

def fn(somelongstring):
    shorterstring = somelongstring.replace('very, ','').replace('long ', '')
    return shorterstring

fn('some very, very, very, long string')

What's the most efficient method for performing this kind of operation in Python?


Some notes:

  • The list of replace calls is quite long, but fixed and known in advance
  • The long string is an argument to the function, and can get massive; it includes repetitions of the substrings
  • My intuition is that pure deletion could use different, faster algorithms than general replacement
  • The chained replace calls are probably each iterating over the string. There has to be a way to do this without all those repeated iterations.
1 Answer

Use a regular expression (the re module):

import re
shorterstring = re.sub('very, |long ', '', 'some very, very, very, long string')

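Make sure the substrings to delete appear in the pattern in descending order of length, so that a longer substring is matched before any shorter one it starts with; regex alternation tries the alternatives left to right and takes the first that matches.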
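
Since the list is fixed and known in advance, the pattern can be built and compiled once rather than typed by hand. This is only a sketch: the names SUBSTRINGS, PATTERN, and strip_substrings are made up for illustration, and re.escape is added on the assumption that some substrings might contain regex metacharacters.

```python
import re

# Hypothetical fixed list of substrings to delete (taken from the question's example).
SUBSTRINGS = ['long ', 'very, ']

# Longest first, so a longer substring wins over any shorter one that is its prefix.
# re.escape makes metacharacters in the substrings match literally.
PATTERN = re.compile(
    '|'.join(re.escape(s) for s in sorted(SUBSTRINGS, key=len, reverse=True))
)

def strip_substrings(text):
    """Delete every occurrence of the listed substrings in a single pass."""
    return PATTERN.sub('', text)

print(strip_substrings('some very, very, very, long string'))  # some string
```

Note that a single regex pass is not always equivalent to chained replace calls: a later replace can match text that only became adjacent after an earlier deletion, whereas the regex scans the original string once.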

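Or, you could avoid writing out the chained calls and fold over the list instead (s is the input string; on Python 3, reduce must be imported from functools). This still makes one pass over the string per substring:

from functools import reduce
shorterstring = reduce(lambda a, b: a.replace(b, ''), ['very, ', 'long '], s)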

2 Comments

Compared to OP's method this is slower (454 µs vs 2.74 ms).
+1 Indeed, regex will save a lot of memory when the string is huge.
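
The timing claim above is easy to check on your own data. Here is a rough sketch using timeit; the test input TEXT and the function names are assumptions made for illustration, and the numbers will vary with the input, the substring list, and the Python version.

```python
import re
import timeit
from functools import reduce

SUBSTRINGS = ['very, ', 'long ']
PATTERN = re.compile('|'.join(map(re.escape, SUBSTRINGS)))

# Hypothetical test input: the question's string repeated to make it large.
TEXT = 'some very, very, very, long string ' * 10000

def chained(text):
    # The question's approach: one str.replace call per substring.
    return text.replace('very, ', '').replace('long ', '')

def with_regex(text):
    # Single pass with a precompiled alternation.
    return PATTERN.sub('', text)

def with_reduce(text):
    # Same work as chained(), driven by the list.
    return reduce(lambda a, b: a.replace(b, ''), SUBSTRINGS, text)

for fn in (chained, with_regex, with_reduce):
    seconds = timeit.timeit(lambda: fn(TEXT), number=20)
    print(f'{fn.__name__}: {seconds / 20 * 1e3:.3f} ms per call')
```

Measure with the real substring list before choosing: with only two substrings the chained calls have little overhead, and the balance may shift as the list grows.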
