0

I have a string '1a1b1c1d3e3e3e1f1g2h2h1i1j1k1l1m1n4o4o4o4o1p1q2r2r1s2t2t2u2u1v1w1x1y1z' and I want to remove all of the duplicates of these characters: 3e, 4o, 2r etc. How can I do that in Python?

4
  • Is there a specific string length to remove? Commented May 5, 2020 at 7:24
  • All of those are 2 characters. It's always a number with a letter next to it. Commented May 5, 2020 at 7:25
  • Not sure if there's a builtin that can handle this scenario. But a crude way to do it would be x = x[:x.find(y)+len(y)] + x[x.find(y)+len(y):].replace(y, '') where x is your original string and y is the duplicate you want removed. Add error handling to catch -1 positions etc. Commented May 5, 2020 at 7:31
  • There are also very few constraints here, which is worrying. For instance, what if you actually want to keep the second occurrence? Or the last? Commented May 5, 2020 at 7:32

3 Answers

6
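str_ = '1a1b1c1d3e3e3e1f1g2h2h1i1j1k1l1m1n4o4o4o4o1p1q2r2r1s2t2t2u2u1v1w1x1y1z'
seen = set()
result = []
n = 2  # each item is a digit followed by a letter
for i in range(0, len(str_), n):
    item = str_[i:i+n]
    if item not in seen:  # keep only the first occurrence of each item
        seen.add(item)
        result.append(item)
print(''.join(result))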

1

This is a pretty crude way of doing it.
But it seems to do the job without being too complicated.

This also assumes that you know in advance which character pairs need removing. You didn't mention that you need to remove all duplicates, only a set of known ones.

x = '1a1b1c1d3e3e3e1f1g2h2h1i1j1k1l1m1n4o4o4o4o1p1q2r2r1s2t2t2u2u1v1w1x1y1z'
for y in ['3e', '4o', '2r']:
    end = x.find(y) + len(y)  # index just past the first occurrence of y
    x = x[:end] + x[end:].replace(y, '')
print(x)

This finds the first occurrence of your desired object (3e for instance) and builds a new version of the string up to and including that object, then appends the rest of the original string with every remaining copy of the object replaced by an empty string.

This is a bit slow, but again, it gets the job done. There's no error handling here though, so be wary of -1 positions etc.
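As a rough sketch of the same idea with the edge cases handled, the string can be walked in fixed-width chunks instead of using find and replace. The function name remove_repeats and its width parameter are made up for illustration, and it assumes every item is exactly two characters, as stated in the question's comments:

```python
def remove_repeats(s, targets, width=2):
    """Keep the first occurrence of each target chunk and drop later ones."""
    targets = set(targets)
    seen = set()
    kept = []
    for i in range(0, len(s), width):
        chunk = s[i:i + width]
        if chunk in targets:
            if chunk in seen:
                continue  # a repeat of a target chunk: skip it
            seen.add(chunk)
        kept.append(chunk)  # non-target chunks are always kept
    return ''.join(kept)


x = '1a1b1c1d3e3e3e1f1g2h2h1i1j1k1l1m1n4o4o4o4o1p1q2r2r1s2t2t2u2u1v1w1x1y1z'
print(remove_repeats(x, ['3e', '4o', '2r']))
```

Because it compares whole chunks, a target that never occurs simply leaves the string unchanged, and a match can never straddle two neighbouring items.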


0

You can use list comprehension and set to do this in the following way:

s = '1a1b1c1d3e3e3e1f1g2h2h1i1j1k1l1m1n4o4o4o4o1p1q2r2r1s2t2t2u2u1v1w1x1y1z'
s = [s[i:i+2] for i in range(0, len(s) - 1, 2)]
s = set(s)
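One caveat: a set is unordered, so the result above is a collection of unique items rather than a string in the original order. If order matters, a possible variation (a sketch, assuming Python 3.7+ where dicts keep insertion order) is to deduplicate with dict.fromkeys and join the keys back together. The names chunks and deduped are just for illustration:

```python
s = '1a1b1c1d3e3e3e1f1g2h2h1i1j1k1l1m1n4o4o4o4o1p1q2r2r1s2t2t2u2u1v1w1x1y1z'

# Split into two-character items, as in the list comprehension above.
chunks = [s[i:i + 2] for i in range(0, len(s), 2)]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
deduped = ''.join(dict.fromkeys(chunks))
print(deduped)
```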

Hope it helps

