Find and replace multiple comma/space instances in a string, Python

Question

I have a string with multiple successive instances of ', ' (comma + space) that I want to replace with a single instance. Is there a clean way to do so? I suppose a regular expression could help.

A naive example:

s = 'a, b, , c, , , d, , , e, , , , , , , f'

The desired output:

'a, b, c, d, e, f'

Naturally, the text can change, so the search should be for successive instances of ', '.

Not sure I understand the question. This is a simplified example of something else :)
– Oleg Melnikov, Commented Nov 28, 2015 at 20:20

Dušan Maďar · Accepted Answer · 2015-11-28 20:07:42Z

11

The regular expression matches two or more consecutive instances of ', ' (a comma followed by a whitespace character), and re.sub replaces each such run with a single ', '.

import re

# Two or more consecutive "comma + whitespace" pairs
pattern = re.compile(r'(,\s){2,}')

test_string = 'a, b, , c, , , d, , , e, , , , , , , f'
print(pattern.sub(', ', test_string))
# a, b, c, d, e, f
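
One detail about the pattern above: `\s` matches any whitespace character, including tabs and newlines, and the parentheses form a capturing group whose capture is never used. Below is a minimal sketch, assuming that only a literal comma followed by a literal space should count as a separator. The name `collapse_separators` is made up for illustration.

```python
import re

# Two or more consecutive ", " pairs; (?:...) groups without capturing.
_REPEATED_SEP = re.compile(r'(?:, ){2,}')


def collapse_separators(text):
    """Replace each run of repeated ', ' with a single ', '."""
    return _REPEATED_SEP.sub(', ', text)


print(collapse_separators('a, b, , c, , , d, , , e, , , , , , , f'))
print(collapse_separators('foo, , bar baz, , , qux'))
```

Note that a run at the very start or end of the string is collapsed to one separator rather than removed, so `', , a'` becomes `', a'`.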

And without a regular expression (as @Casimir et Hippolyte suggested in a comment):

test_string = 'a, b, , c, , , d, , , e, , , , , , , f'
test_string_parts = test_string.split(',')
# Strip surrounding whitespace and drop fields that are empty after stripping
test_string_parts = [part.strip() for part in test_string_parts if part.strip()]
print(', '.join(test_string_parts))
# a, b, c, d, e, f
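
The split-and-filter idea above can be wrapped in a small helper. This is only a sketch, under the assumption that empty or whitespace-only fields carry no meaning and that items may be longer than one character. `drop_empty_fields` is a hypothetical name, not part of any library.

```python
def drop_empty_fields(text, sep=','):
    """Split on sep, discard blank fields, and rejoin with sep + space."""
    fields = (field.strip() for field in text.split(sep))
    return (sep + ' ').join(field for field in fields if field)


print(drop_empty_fields('a, b, , c, , , d, , , e, , , , , , , f'))
print(drop_empty_fields(', , foo,bar baz, ,  , qux,,'))
```

Unlike a regex substitution, this variant also removes leading and trailing separators and normalizes the spacing after each comma, which may or may not be what you want.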

edited Nov 28, 2015 at 20:07

answered Nov 28, 2015 at 19:51

Dušan Maďar


8 Comments

Oleg Melnikov Over a year ago

Thanks. Would not this work the same: re.sub(r'(,\s){2,}', ', ', test_string) ? Or, is there some notable difference or corner case?

Dušan Maďar Over a year ago

It will work as well. More info about compile here: stackoverflow.com/questions/452104/…

Oleg Melnikov Over a year ago

If instead of ', ' I have '\n' how would I adapt this find & replace? (\\n){2,} seems to work in regex101.com, but not in Python. Any suggestions?

Dušan Maďar Over a year ago

Why not only r'\n{2,}'?

Oleg Melnikov Over a year ago

I tried r'\n{2,}'. It fails to catch '\n ' in Python and on regex101.com.
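
A possible explanation for what is described in this thread: inside a raw string, r'(\\n){2,}' looks for a literal backslash followed by the letter n, while r'\n{2,}' only matches newlines that are directly adjacent, so a space between two newlines breaks the run. Here is a hedged sketch, assuming the goal is to collapse newlines that may be separated by spaces or tabs (the sample text is invented):

```python
import re

text = 'first\n\n\nsecond\n \n\t\nthird\nfourth'

# A newline, then one or more further newlines, each optionally
# preceded by spaces or tabs.
collapsed = re.sub(r'\n(?:[ \t]*\n)+', '\n', text)

print(repr(collapsed))
```

A single newline between two lines is left untouched, because the pattern requires at least one further newline after the first.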


Casimir et Hippolyte · Accepted Answer · 2015-11-28 20:15:39Z

2

You can use reduce:

>>> from functools import reduce
>>> reduce(lambda x, y: x + ', ' + y if y else x, s.split(', '))

(Where x is the carry and y the item)
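
To see how the carry evolves, `itertools.accumulate` can run the same function and keep every intermediate value. This is just an illustrative sketch; the short input string and the name `keep_non_empty` are made up.

```python
from functools import reduce
from itertools import accumulate


def keep_non_empty(carry, item):
    # Append the item to the carry only when the item is not empty.
    return carry + ', ' + item if item else carry


parts = 'a, , b, , , c'.split(', ')
steps = list(accumulate(parts, keep_non_empty))
result = reduce(keep_non_empty, parts)

print(parts)   # the fields, including the empty ones
print(steps)   # the carry after each field
print(result)  # the final carry, same as the last step
```

One caveat: the first field becomes the initial carry as-is, so a string that starts with ', ' keeps its leading separator.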

edited Nov 28, 2015 at 20:15

answered Nov 28, 2015 at 20:09

Casimir et Hippolyte


AkaKaras · Accepted Answer · 2015-11-29 19:15:19Z

1

The simplest way to solve your problem would be:

>>> s = 'a, b, , c, , , d, , , e, , , , , , , f'
>>> s = [x for x in s if x.isalpha()]
>>> print(s)
['a', 'b', 'c', 'd', 'e', 'f']

Then use join():

>>> ', '.join(s)
'a, b, c, d, e, f'

Or do it in one line:

>>> s = ', '.join([x for x in s if x.isalpha()])
>>> s
'a, b, c, d, e, f'

Another way:

>>> s = 'a, b, , c, , , d, , , e, , , , , , , f'
>>> s = s.split()  # split on whitespace
>>> s
['a,', 'b,', ',', 'c,', ',', ',', 'd,', ',', ',', 'e,', ',', ',', ',', ',', ',', ',', 'f']
>>> while ',' in s:
...     s.remove(',')
...
>>> s
['a,', 'b,', 'c,', 'd,', 'e,', 'f']
>>> ' '.join(s)
'a, b, c, d, e, f'

edited Nov 29, 2015 at 19:15

answered Nov 29, 2015 at 5:02

AkaKaras


2 Comments

Oleg Melnikov Over a year ago

However... this only works for single-letter variables... Please read the question :)

AkaKaras Over a year ago

I just added another method.

Tobias · Accepted Answer · 2018-12-19 08:31:36Z

1

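s = ", , a, b, , c, , , d, , , e, , , ,  , , , f,,,,"
# Remove every space, split on commas, and keep only the non-empty fields
s = [o for o in s.replace(' ', '').split(',') if o]
print(s)
# ['a', 'b', 'c', 'd', 'e', 'f']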

edited Dec 19, 2018 at 8:31

Tobias


answered Dec 19, 2018 at 7:57

Gold


1 Comment

Harsha Biyani Over a year ago

This answer came up in the low-quality posts review. Please add some explanation, even though the code is self-explanatory.

Subham · Accepted Answer · 2021-05-04 14:49:49Z

0

One more solution: go through the combination of the list and the same list shifted by one (in other words, by the pairs of consecutive items) and select the second item from each pair where the first (previous) item differs from the second (next) item:

s = 'a. b. . c. . . d. . . e. . . . . . . f'

# Keep every character except spaces
test = [ch for ch in s if ch != ' ']

# Keep the first character, then each character that differs from its predecessor
res = [test[0]] + [y for x, y in zip(test, test[1:]) if x != y]

print(''.join(res))

yields

a.b.c.d.e.f
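
The pairwise comparison described above amounts to collapsing runs of equal neighbours, which `itertools.groupby` does as well. Here is a minimal sketch under the same assumption as the answer: the text is treated as a sequence of single characters, so a doubled letter such as `ee` would be collapsed too.

```python
from itertools import groupby

s = 'a. b. . c. . . d. . . e. . . . . . . f'

# Drop the spaces, then keep one character per run of identical neighbours.
chars = [ch for ch in s if ch != ' ']
collapsed = ''.join(key for key, _ in groupby(chars))

print(collapsed)
```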

answered May 4, 2021 at 14:49

Subham

