2

I'm trying to write a function to split a string on given separators. I've seen answers to similar questions that use regular expressions to ignore all special characters, but I want to be able to pass in a variable containing the separators.

So far I've got:

def split_string(source, separators): 
    source_list = source
    for separator in separators:
        if separator in source_list:
                source_list.replace(separator, ' ') 
    return source_list.split()

But it's not removing the separators.

3
  • .split takes a regular expression in python; why can't you use source.split(separators)? What is separators exactly? (like an example) Commented Feb 6, 2013 at 3:21
  • @ExplosionPills str.split() doesn't take a regex, it just takes a string - (if you want a regex, that's re.split()). Commented Feb 6, 2013 at 3:21
  • @ExplosionPills -- I hope you mean .replace takes a selection of characters... Commented Feb 6, 2013 at 3:23

4 Answers

5

The regex solution (to me) seems like it would be pretty easy:

import re
def split_string(source,separators):
    return re.split('[{0}]'.format(re.escape(separators)),source)

example:

>>> import re
>>> def split_string(source,separators):
...     return re.split('[{0}]'.format(re.escape(separators)),source)
... 
>>> split_string("the;foo: went to the store",':;')
['the', 'foo', ' went to the store']

The reason for using a regex here is that it still works when you don't want ' ' (a space) among your separators, whereas replacing each separator with a space and calling split() always splits on whitespace as well.
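To make that difference concrete, here is a small side-by-side sketch. The names split_with_replace and split_with_regex are made up for this comparison; the first is the question's approach with the assignment fixed, the second is the character-class version above.

```python
import re

def split_with_replace(source, separators):
    # The question's approach (with the assignment fixed): every separator
    # becomes a space, then str.split() splits on any run of whitespace.
    for separator in separators:
        source = source.replace(separator, ' ')
    return source.split()

def split_with_regex(source, separators):
    # Character-class approach: only the listed characters act as separators.
    return re.split('[{0}]'.format(re.escape(separators)), source)

text = "the;foo: went to the store"
print(split_with_replace(text, ':;'))  # ['the', 'foo', 'went', 'to', 'the', 'store']
print(split_with_regex(text, ':;'))    # ['the', 'foo', ' went to the store']
```

The replace-based version cannot keep " went to the store" together, because the space it substitutes in is indistinguishable from the spaces already in the text.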


An alternative (which I think I prefer), where you could have multi-character separators is:

def split_string(source,separators):
    return re.split('|'.join(re.escape(x) for x in separators),source)

In this case, multi-character separators get passed in as some sort of non-string iterable (e.g. a tuple or a list), but single-character separators can still be passed in as a single string.

>>> def split_string(source,separators):
...     return re.split('|'.join(re.escape(x) for x in separators),source)
... 
>>> split_string("the;foo: went to the store",':;')
['the', 'foo', ' went to the store']
>>> split_string("the;foo: went to the store",['foo','st'])
['the;', ': went to the ', 'ore']
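One caveat with the alternation version: regex alternation tries the alternatives left to right, so if one separator is a prefix of another (say 'a' and 'ab'), the shorter one can win and leave part of the longer one behind. A possible tweak, offered as a sketch rather than as part of the original answer, is to sort the separators longest first:

```python
import re

def split_string(source, separators):
    # Try longer separators first so 'ab' is not shadowed by its prefix 'a'.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = '|'.join(re.escape(x) for x in ordered)
    return re.split(pattern, source)

print(split_string("1ab2a3", ['a', 'ab']))  # ['1', '2', '3']
```

Without the sort, the pattern would be 'a|ab' and the same call would return ['1', 'b2', '3'].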

Or, finally, if you want to split on consecutive runs of separators as well:

def split_string(source,separators):
    return re.split('(?:'+'|'.join(re.escape(x) for x in separators)+')+',source)

which gives:

>>> split_string("Before the rain ... there was lightning and thunder.", " .")
['Before', 'the', 'rain', 'there', 'was', 'lightning', 'and', 'thunder', '']
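The trailing '' appears because the input ends with a separator. If empty pieces are never wanted, one option (a sketch on top of the function above, not the only way to do it) is to filter them out after splitting:

```python
import re

def split_string(source, separators):
    pattern = '(?:' + '|'.join(re.escape(x) for x in separators) + ')+'
    # Drop the empty strings produced by leading or trailing separators.
    return [piece for piece in re.split(pattern, source) if piece]

print(split_string("Before the rain ... there was lightning and thunder.", " ."))
# ['Before', 'the', 'rain', 'there', 'was', 'lightning', 'and', 'thunder']
```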

4 Comments

I tried this: out = split_string("Before the rain ... there was lightning and thunder.", " .") print out and got this back: ['Before', '', 'the', 'rain', '', '', '', '', '', '', '', 'there', 'was', 'lightning', 'and', 'thunder', '']
@BasilSiddiqui -- That looks like what I would expect. What did you expect?
I was expecting ['Before', 'the', 'rain', 'there', 'was', 'lightning', 'and', 'thunder']
@BasilSiddiqui -- See my final edit. It gets you really close using the same basic formalism.
2

The problem is that source_list.replace(separator, ' ') does not modify source_list in place; it just returns a modified string value. But you don't do anything with this modified value, so it is lost.

You can do this:

source_list = source_list.replace(separator, ' ')

Then source_list will now have the modified version. I made this one change to your function and then it worked perfectly when I tested it.
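This follows from Python strings being immutable: str.replace always returns a new string and leaves the original untouched. A tiny illustration (the variable names are just for the demo):

```python
source = "a;b"
result = source.replace(';', ' ')

print(source)  # a;b  (the original string is unchanged)
print(result)  # a b  (the replacement lives only in the returned value)
```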

2

You forgot to assign the result of source_list.replace(separator, ' ') back to source_list.

Look at this modified snippet:

def split_string(source, separators): 
    source_list = source
    for separator in separators:
        if separator in source_list:
            source_list = source_list.replace(separator, ' ')
    return source_list.split()

0

You should be able to solve this with split. It doesn't take a regex, but you can still make it do what you need.

In your example code, you don't reassign the result of replace back to source_list.
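One way this could look with str.split alone, as a rough sketch (this is an interpretation of the suggestion above, not code from the original answer): split on the first separator, then split every resulting piece on the next one, and so on.

```python
def split_string(source, separators):
    pieces = [source]
    for separator in separators:
        # Split every piece collected so far on the current separator.
        pieces = [part for piece in pieces for part in piece.split(separator)]
    # Discard empty strings left by adjacent or trailing separators.
    return [piece for piece in pieces if piece]

print(split_string("the;foo: went to the store", ':;'))
# ['the', 'foo', ' went to the store']
```

Unlike the replace-then-split approach, this only splits on whitespace if a space is explicitly included in separators.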
