
String input: Tem1 = 'Hhelloo ookkee'

I want the output to be Tem1 = 'helo oke'

I have tried the approach from this Stack Overflow question (Python: Best Way to remove duplicate character from string).

I've also tried using itertools, but when I save to CSV, the stored data is unchanged and still contains many duplicate characters.

import itertools
tem1 = sum(val*(2**idx) for idx, val in enumerate(reversed(tem)))
if bit[0:8]==[1,0,0,1,1,0,0,1]:
    cv2.putText(frame, "Text Print: " + chr(tem1) +".....", (50, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    print(chr(tem1))
    cv2.imshow('frame',frame)
if str(tem1)!='0':
    row = ''.join(ch for ch, _ in itertools.groupby(f'{chr(tem1)}'))
    # create csv file to save the data.
    f.write(row)

best way to remove duplicate

NOTE: Order is important and this question is not similar to this one.

  • What about the result of join? Is that string correct? Why are you storing the result in row but in your csv you are writing newrow? Commented Mar 23, 2023 at 7:53
  • I don't understand your call to groupby. It should receive the string whose duplicate characters have to be removed. Instead you are passing a string with a single character (which will never have a double character). Commented Mar 23, 2023 at 7:57
  • @JorgeLuis Sorry, I have changed "newrow" to "row". I tried this with f.write(row), but the result still contains duplicates when I save it to the CSV file. Commented Mar 24, 2023 at 6:38
  • Without a reproducible example it is really hard to tell from the code you posted what you are trying to achieve, because you are doing some weird stuff. Commented Mar 24, 2023 at 7:48
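
The point raised in the second comment can be illustrated with a small sketch. It assumes the characters are decoded one at a time; the list `decoded` is a hypothetical stand-in for the successive `chr(tem1)` values, not something taken from the original code. Calling groupby on each single character has nothing to collapse, while calling it once on the accumulated string does:

```python
import itertools

# Hypothetical stand-in for the characters decoded one frame at a time.
decoded = list("Hhelloo ookkee")

# Per-character call: a one-character string has no repeats to collapse.
per_char = ''.join(
    ''.join(ch for ch, _ in itertools.groupby(c)) for c in decoded
)

# Whole-string call: runs of consecutive identical characters collapse to one.
whole = ''.join(ch for ch, _ in itertools.groupby(''.join(decoded)))

print(per_char)  # Hhelloo ookkee
print(whole)     # Hhelo oke
```

Under that assumption, the characters would have to be collected first and deduplicated once before being written to the file.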

1 Answer
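def remove_duplicates(s):
    """Collapse runs of consecutive identical characters, preserving order."""
    if not s:
        return s  # nothing to collapse; also avoids an IndexError on s[0]
    acc = [s[0]]
    for c in s[1:]:
        # Keep c only if it differs from the last character kept.
        if acc[-1] != c:
            acc.append(c)
    return ''.join(acc)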

s = "Hhelloo ookkee"
print(remove_duplicates(s))

There's also a package called more-itertools (install it with pip install more-itertools) that provides a unique_justseen function, which does the same thing:

from more_itertools import unique_justseen

s = "Hhelloo ookkee"
print(''.join(unique_justseen(s)))

The output is 'Hhelo oke', because 'H' and 'h' are, strictly speaking, different characters. If you want the comparison to be case-insensitive, normalize the case of the characters before comparing. For simple strings limited to the Latin alphabet, calling str.lower() is enough, but it doesn't work for some Unicode characters, so for real-world text str.casefold() should be used instead; read this about even more real stuff. E.g., for the first of the code samples above:

if acc[-1].casefold() != c.casefold(): acc.append(c)

And for the second, using the optional key argument:

unique_justseen(s, str.casefold)

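In both cases it would probably be more efficient to casefold the entire string once up front rather than character by character during the comparison. Note that the result then consists of the casefolded characters ('helo oke' rather than 'Helo oke').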
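
As a rough sketch of the difference between the two approaches, the following uses itertools.groupby from the standard library in place of unique_justseen (a swap made here only to avoid the third-party dependency). The function names dedupe_keep_case and dedupe_folded are made up for illustration:

```python
from itertools import groupby

def dedupe_keep_case(s):
    # Group by the casefolded character, keep the first character of each run.
    return ''.join(next(group) for _, group in groupby(s, key=str.casefold))

def dedupe_folded(s):
    # Casefold once up front, then collapse runs; the result is all lowercase.
    return ''.join(ch for ch, _ in groupby(s.casefold()))

s = "Hhelloo ookkee"
print(dedupe_keep_case(s))  # Helo oke
print(dedupe_folded(s))     # helo oke
```

The second variant produces the 'helo oke' asked for in the question; the first keeps whichever case appeared first in each run.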


5 Comments

Technically, str.lower() and str.upper() cannot give you case-insensitive comparison. They are asymmetric casing operations, not matching or comparison operations, and they do not remove all case distinctions.
@Andj Huh? "Hello, World!".lower() == "hello, world!".lower() -> True
The fact that "Hello, World!".lower() == "hello, world!".lower() is true is irrelevant. Not all lowercase characters have uppercase equivalents, not all uppercase characters have lowercase equivalents. Some uppercase characters map to two characters when lowercased. Also with casing, two types of casing are defined in Unicode: simple casing and full casing. @headcrab Unicode defines four types of caseless matching, the simplest is case-folding. str.lower() and str.upper() were only ever caseless matching in Python 2. I.e. when using encodings other than Unicode.
@Andj I see no point in overloading such a simple question with endless Unicode intricacies, but I've added str.casefold() and a link for further reading into the answer. OK now, or do you still see some room for exercises in pedantry?
Your answer works perfectly well for the question; I didn't imply otherwise. Rather, I was pointing out that your characterisation of the str.lower() operation was technically incorrect and a hangover from Python 2.x. For the question being answered, it doesn't matter, true. But Stack Overflow questions and answers are often searched after the fact; indeed, people asking questions are encouraged to search first and ask only if they can't find an existing answer. So it is better to be exact for future readers than to leave partial answers where the distinctions matter.
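
The distinction discussed in these comments can be seen with a short example. The German words below are chosen purely for illustration; 'ß' is one of the characters whose uppercase form is two characters long:

```python
a, b = "Straße", "STRASSE"

# lower() leaves 'ß' unchanged, so the two strings still differ.
print(a.lower() == b.lower())        # False

# casefold() maps 'ß' to 'ss', so the strings match.
print(a.casefold() == b.casefold())  # True

# Uppercasing 'ß' yields two characters.
print("ß".upper())                   # SS
```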
