7

I have a string with multiple successive instances of , (comma+space) that I want to replace with a single instance. Is there a clean way to do so? I suppose RegEx can be helpful.

A naive example:

s = 'a, b, , c, , , d, , , e, , , , , , , f'

The desired output:

'a, b, c, d, e, f'

Naturally, the text can change, so the search should be for successive instances of ,.

2
  • How did you ever get such even displacement? Commented Nov 28, 2015 at 20:14
  • Not sure I understand the question. This is a simplified example of something else :) Commented Nov 28, 2015 at 20:20

5 Answers

11

The regular expression matches two or more consecutive instances of , (comma + space), and re.sub replaces each such run with a single , (comma + space).

import re

# Two or more consecutive "comma + whitespace" pairs.
pattern = re.compile(r'(,\s){2,}')

test_string = 'a, b, , c, , , d, , , e, , , , , , , f'
print(re.sub(pattern, ', ', test_string))
# a, b, c, d, e, f

And without a regular expression (as @Casimir et Hippolyte suggested in a comment):

test_string = 'a, b, , c, , , d, , , e, , , , , , , f'
test_string_parts = test_string.split(',')
# Drop the pieces that are only a single space, and trim the rest.
test_string_parts = [part.strip() for part in test_string_parts if part != ' ']
print(', '.join(test_string_parts))
# a, b, c, d, e, f
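
Since the question notes that the text can change, the regex approach can be wrapped in a small helper that takes the separator as a parameter. This is only a sketch: the name collapse_repeats and its sep parameter are hypothetical, not part of the answer above. It uses re.escape so that separators containing regex metacharacters (such as '.') are matched literally.

```python
import re

def collapse_repeats(text, sep=', '):
    """Collapse runs of two or more consecutive `sep` into a single `sep`."""
    # re.escape makes the separator match literally, even if it contains
    # characters that are special in a regex (for example '.').
    pattern = re.compile('(?:' + re.escape(sep) + '){2,}')
    # A function replacement avoids backslash processing of `sep`.
    return pattern.sub(lambda match: sep, text)

print(collapse_repeats('a, b, , c, , , d, , , e, , , , , , , f'))
# a, b, c, d, e, f
```

Unlike r'(,\s){2,}', this version matches a literal space after the comma rather than any whitespace character.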

8 Comments

Thanks. Wouldn't this work the same: re.sub(r'(,\s){2,}', ', ', test_string)? Or is there some notable difference or corner case?
It will work as well. More info about compile here: stackoverflow.com/questions/452104/…
If instead of ', ' I have '\n' how would I adapt this find & replace? (\\n){2,} seems to work in regex101.com, but not in Python. Any suggestions?
Why not only r'\n{2,}'?
I tried r'\n{2,}'. It fails to catch '\n ' in both Python and regex101.com.
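
One possible adaptation for the newline case, assuming the blank lines may contain stray spaces or tabs (which would explain why r'\n{2,}' misses '\n '): allow optional horizontal whitespace after each newline. This is a sketch with made-up sample text, not something confirmed in the thread.

```python
import re

text = 'first\n \n\t\nsecond\n\nthird'

# A newline followed by any spaces/tabs, repeated two or more times,
# is replaced by a single newline.
collapsed = re.sub(r'(?:\n[ \t]*){2,}', '\n', text)

print(repr(collapsed))
# 'first\nsecond\nthird'
```

Note that this pattern also swallows indentation at the start of the line that follows a collapsed run, which may or may not be wanted.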
2

You can use reduce:

>>> from functools import reduce
>>> reduce(lambda x, y: x + ', ' + y if y else x, s.split(', '))
'a, b, c, d, e, f'

(where x is the accumulated result and y is the current item)
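
To make the fold easier to follow, here is a sketch that names the intermediate list and compares reduce with a plain filter-and-join. The variable names pieces, folded and joined are illustrative only.

```python
from functools import reduce

s = 'a, b, , c, , , d, , , e, , , , , , , f'
pieces = s.split(', ')  # empty strings mark the repeated separators

# Fold left to right, appending only non-empty items to the carry.
folded = reduce(lambda x, y: x + ', ' + y if y else x, pieces)

# The same result by filtering out the empty pieces and joining once.
joined = ', '.join(p for p in pieces if p)

print(folded)  # a, b, c, d, e, f
print(joined)  # a, b, c, d, e, f
```

The two differ when the string starts with the separator: reduce uses the first (empty) piece as the initial carry and so keeps a leading ', ', while the filter-and-join version drops it.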

Comments

1

The simplest way to solve your problem would be:

>>> s = 'a, b, , c, , , d, , , e, , , , , , , f'
>>> s = [x for x in s if x.isalpha()]
>>> print(s)
['a', 'b', 'c', 'd', 'e', 'f']

Then use join():

>>> ', '.join(s)
'a, b, c, d, e, f'

Or do it in one line:

>>> s = ', '.join([x for x in s if x.isalpha()])
>>> s
'a, b, c, d, e, f'

Another way:

>>> s = 'a, b, , c, , , d, , , e, , , , , , , f'
>>> s = s.split()  # split on whitespace
>>> s
['a,', 'b,', ',', 'c,', ',', ',', 'd,', ',', ',', 'e,', ',', ',', ',', ',', ',', ',', 'f']
>>> while ',' in s:
...     s.remove(',')
...
>>> s
['a,', 'b,', 'c,', 'd,', 'e,', 'f']
>>> ' '.join(s)
'a, b, c, d, e, f'

2 Comments

However... this only works for single-letter variables... Please read the question :)
I just added a new method.
1
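s = ", , a, b, , c, , , d, , , e, , , ,  , , , f,,,,"
# Remove all spaces, split on commas, and keep only the non-empty pieces.
# Note: this also removes spaces inside the items themselves.
s = [o for o in s.replace(' ', '').split(',') if len(o)]
print(s)
# ['a', 'b', 'c', 'd', 'e', 'f']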

1 Comment

This answer was flagged as a low-quality post. Add some explanation, even if the code is self-explanatory.
0

One more solution: go through the combination of the list and the same list shifted by one (in other words, by the pairs of consecutive items) and select the second item from each pair where the first (previous) item differs from the second (next) item:

s = 'a. b. . c. . . d. . . e. . . . . . . f'

# Keep every character except spaces.
test = [ch for ch in s if ch != ' ']

# Keep the first item, then each item that differs from its predecessor.
res = [test[0]] + [y for x, y in zip(test, test[1:]) if x != y]

print(''.join(res))

yields

a.b.c.d.e.f
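
The same "drop an item if it equals the previous one" idea can also be sketched with itertools.groupby from the standard library, which groups runs of equal adjacent items. This is an alternative to the zip-based code above, not the answer's own method.

```python
from itertools import groupby

s = 'a. b. . c. . . d. . . e. . . . . . . f'

# Keep every character except spaces.
chars = [ch for ch in s if ch != ' ']

# groupby yields one (key, group) pair per run of equal adjacent items,
# so taking only the keys collapses each run to a single item.
res = ''.join(key for key, _group in groupby(chars))

print(res)  # a.b.c.d.e.f
```

Both versions collapse any repeated adjacent character, so a doubled letter inside an item (for example 'aa') would also be reduced to one.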

Comments
