Remove adjacent duplicates of a specific string from a list

Question

I want to remove adjacent duplicates of specific strings from a list. Suppose I have the list below:

list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

Here is what I have tried so far:

new_list_ex = []
for item in list_ex:
    if item.startswith('<word>'):
        if item in new_list_ex and (item == list_ex[list_ex.index(item)+1]):
            continue
    new_list_ex.append(item)

My output of new_list_ex:

['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', 'again', '.']

Desired output:

['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

I suspect that list_ex[list_ex.index(item)+1] does not detect the adjacent element correctly. How can I adjust my code to get the desired output?

Please note that the order of the items in this list is important.

What you want is to skip a word if it is flagged (starts with '<word>') and equals the LAST element of new_list_ex; try if item.startswith('<word>') and (item == new_list_ex[-1]):
– Swifty, Commented Dec 6, 2022 at 21:27
Now I understand: you want to remove the tags and also remove the duplicate words, is that right?
– Mohammed almalki, Commented Dec 6, 2022 at 21:34

Swifty · Accepted Answer · 2022-12-06 21:31:05Z

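Test whether a word flagged with <word> equals the last item already in the new list (new_list_ex[-1]); if so, continue (skip it). If not, just append the word to the new list. The original check fails because list_ex.index(item) always returns the index of the first occurrence, so the later, standalone '<word>conference</word>' is also treated as an adjacent duplicate.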

list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

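new_list_ex = []
for item in list_ex:
    # Skip a flagged word only when it repeats the item appended just before it.
    # The `new_list_ex and` guard avoids an IndexError if the first item is flagged.
    if item.startswith('<word>') and new_list_ex and item == new_list_ex[-1]:
        continue
    new_list_ex.append(item)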
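
As an alternative to comparing against new_list_ex[-1], the same idea can be sketched with itertools.groupby, which groups runs of consecutive equal items. This is only a sketch: the function name drop_adjacent_flagged and its prefix parameter are made up for illustration and are not part of the question.

```python
from itertools import groupby


def drop_adjacent_flagged(items, prefix='<word>'):
    """Collapse runs of identical adjacent items, but only for flagged items.

    `prefix` is an illustrative parameter: items starting with it count as flagged.
    """
    result = []
    # groupby() with no key yields one (value, run) pair per run of equal neighbours.
    for value, run in groupby(items):
        if value.startswith(prefix):
            result.append(value)   # keep a single copy of the flagged run
        else:
            result.extend(run)     # keep unflagged items untouched, repeats included
    return result


sample = ['to', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>',
          '.', 'the', '<word>hotel</word>']
print(drop_adjacent_flagged(sample))
# ['to', 'to', 'the', '<word>hotel</word>', '.', 'the', '<word>hotel</word>']
```

Unflagged repeats such as 'to', 'to' are left alone, and a flagged word that reappears later in the list is kept because it starts a new run.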


Mohammed almalki · Accepted Answer · 2022-12-06 21:40:58Z

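I think this is what you mean. The code below strips the <word> tags and skips any word that is already in new_list, so it removes every repeated word, not only adjacent ones:
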
list_ex = ['I', 'went', 'to', 'the', 'big', 'conference', ',', 'I', 'presented', 'myself', 'there', '.', 'After', 'the', '<word>conference</word>', '<word>conference</word>', ',', 'I', 'took', 'a', 'taxi', 'to', 'go', 'to', 'the', '<word>hotel</word>', '<word>hotel</word>', '.', 'Tomorrow', 'I', 'will', 'go', 'to', '<word>conference</word>', 'again', '.']

new_list = []

for i in list_ex:
    # Strip the surrounding tags, e.g. '<word>hotel</word>' -> 'hotel'
    word = i.split(">")[1].split("<")[0] if "<" in i and ">" in i else i
    # Append only words that have not been seen before
    if word not in new_list:
        new_list.append(word)

print(new_list)
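
The loop above drops every word that has already appeared, adjacent or not. If the goal is to strip the tags but collapse only adjacent flagged repeats, a variant might look like the sketch below. It assumes flagged items have exactly the form '<word>...</word>'; the names TAG and strip_and_collapse are hypothetical.

```python
import re

# Assumed tag format: the whole item is '<word>' + text + '</word>'.
TAG = re.compile(r'<word>(.*)</word>')


def strip_and_collapse(items):
    """Strip <word> tags and drop a flagged item that repeats its left neighbour."""
    result = []
    previous = None  # the previous raw (still tagged) item
    for item in items:
        match = TAG.fullmatch(item)
        is_repeat = match is not None and item == previous
        previous = item
        if is_repeat:
            continue
        # Append the inner text for flagged items, the item itself otherwise.
        result.append(match.group(1) if match else item)
    return result


sample = ['the', '<word>hotel</word>', '<word>hotel</word>', '.',
          'to', '<word>hotel</word>', 'hotel']
print(strip_and_collapse(sample))
# ['the', 'hotel', '.', 'to', 'hotel', 'hotel']
```

Comparing against the raw previous item, rather than the stripped text, keeps a plain 'hotel' next to a tagged one from being mistaken for a flagged duplicate.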
