I want to remove adjacent duplicates of specific strings from a list. Suppose I have the following list:

list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

Here is what I have tried so far:

new_list_ex = []
for item in list_ex:
    if item.startswith('<word>'):
        if item in new_list_ex and (item == list_ex[list_ex.index(item)+1]):
            continue
    new_list_ex.append(item)

My output of new_list_ex:

['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', 'again', '.']

Desired output:

['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

I suspect that list_ex[list_ex.index(item)+1], which I use to detect the adjacent element, does not work properly. How can I adjust it to get the desired output?

Please note that the order of the list is important.

4
  • What you want to do is skip the word if it's flagged (startswith(<word>)) and is equal to the LAST element of new_list_ex; try if item.startswith('<word>') and (item == new_list_ex[-1]): Commented Dec 6, 2022 at 21:27
  • @Mohammedalmalki of course order is important. Commented Dec 6, 2022 at 21:32
  • Now I understand. You want to delete the tags and delete the duplicate words, is that right? Commented Dec 6, 2022 at 21:34
  • @Mohammedalmalki yes of course. Commented Dec 6, 2022 at 21:36

2 Answers

1

Test whether a word flagged with <word> is equal to the last item already in new_list_ex (new_list_ex[-1]); if so, skip it with continue. Otherwise, append the word to new_list_ex.

list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

new_list_ex = []
for item in list_ex:
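    # Skip a flagged word that repeats the previously kept item; the
    # `new_list_ex and` guard avoids an IndexError while the list is empty.
    if item.startswith('<word>') and new_list_ex and item == new_list_ex[-1]: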
        continue
    new_list_ex.append(item)
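
The attempt in the question fails because list.index(item) always returns the position of the first occurrence, so the last '<word>conference</word>' is compared against the neighbour of the first one and gets dropped. A minimal sketch of the same idea as a reusable function follows; the name drop_adjacent_flagged and the shortened sample list are hypothetical and only serve as illustration.

```python
def drop_adjacent_flagged(tokens, prefix="<word>"):
    """Collapse runs of identical adjacent tokens that start with `prefix`."""
    result = []
    for token in tokens:
        # `result and ...` guards against indexing into an empty list
        if token.startswith(prefix) and result and token == result[-1]:
            continue
        result.append(token)
    return result


# Shortened, hypothetical sample in the same shape as list_ex
sample = ['the', '<word>conference</word>', '<word>conference</word>',
          ',', 'to', '<word>conference</word>', 'again']

# list.index returns the FIRST match, even when asked about a later duplicate
first_match = sample.index('<word>conference</word>')

cleaned = drop_adjacent_flagged(sample)
print(first_match)
print(cleaned)
```

Comparing against the last kept item, instead of looking ahead in the source list, keeps the order intact and leaves non-adjacent repeats alone.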

0

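I think you mean something like this. Note that it strips the <word> tags and skips any token that already appears anywhere in new_list, not only adjacent duplicates: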


list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

new_list = []

for i in list_ex:
    # Strip the tags from flagged tokens, e.g. '<word>hotel</word>' -> 'hotel'
    word = i.split(">")[1].split("<")[0] if "<" in i and ">" in i else i
    # Keep only the first occurrence of each token
    if word not in new_list:
        new_list.append(word)


print(new_list)
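
If the goal is to strip the tags but still collapse only adjacent flagged duplicates, as the desired output in the question suggests, a variant along these lines might fit better. This is a sketch under assumptions: the tag format '<word>...</word>' is taken from the question's data, and the names TAG and collapse_and_strip are hypothetical.

```python
import re

# Assumed tag format, taken from the sample data in the question
TAG = re.compile(r"<word>(.*)</word>")


def collapse_and_strip(tokens):
    """Drop adjacent duplicate flagged tokens, then strip their tags."""
    result = []
    previous = None  # previous raw token from the input
    for token in tokens:
        match = TAG.fullmatch(token)
        # Skip only when a flagged token repeats its immediate predecessor
        if match and token == previous:
            continue
        result.append(match.group(1) if match else token)
        previous = token
    return result


# Shortened, hypothetical sample in the same shape as list_ex
sample = ['to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.',
          'I', 'will', 'go', 'to', '<word>conference</word>']
stripped = collapse_and_strip(sample)
print(stripped)
```

Unlike the membership test against new_list, this keeps ordinary repeated tokens such as 'to' and only removes a flagged token when it directly follows an identical one.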
