2

I have been working on this problem for more hours than I'd like to admit now and it's driving me nuts! Given a simple linked list, say one that stores an integer (data) only in addition to a pointer to the next node (next), I would like an algorithm that removes duplicates without sorting or relying on helper functions.

Previous questions have been asked about unsorted linked lists in Java which take advantage of helper functions that Java offers. This is strictly pertinent to C without the use of helper functions.

I have tinkered around with code and have got this to work for some cases, but not all. Here is a complete, verifiable example -- I've include a push() function to create a linked list and a main() with test cases, but the logic that my question pertains to is in removeDuplicates() alone:

#include <stdlib.h>
#include <stdio.h>

struct node {
    int data;
    struct node *next; 
};

void push(struct node **headRef, int data) {
    struct node *newNode = malloc(sizeof(struct node));
    newNode->data = data;
    newNode->next = *headRef;
    *headRef = newNode;
}

void removeDuplicates(struct node **head) {
        struct node *currentNode = *head; //The node we compare other nodes against
    struct node *runningNode = (*head)->next; //The node we are comparing to
    struct node *runningNodePrev = *head; //The node before the node we are comparing to
    struct node *runningNodeNext = (*head)->next->next; // The node after the node we are comparing to
    int count = -1;
    while (currentNode->next != NULL) { //Ensure we are not looking at the last node -- cannot have comparisons past this node
        count++;
        if (count) {
            //'Base Position'
            currentNode = currentNode->next;
            runningNodePrev = currentNode;
            runningNode = currentNode->next;
            runningNodeNext = runningNode->next;
        }
        printf("\nChecking for a match with  %d ... \n", currentNode->data);
        while (runningNode != NULL) { //Compare each node in the list until we reach the end of the list
            printf("Inspecting next adjacent node ... \n");
            if (runningNode->data == currentNode->data) {//If a duplicate is found
                printf("Found match ... ");

                //Ensure link is maintained before removing duplicate node
                if (currentNode == runningNodePrev)
                    currentNode->next = runningNodeNext;
                runningNodePrev->next = runningNodeNext; 

                free(runningNode);//Delete duplicate node
                printf("Deleted duplicate.\n");
            }
            runningNode = runningNodeNext;//Continue searching for duplicates at the first unchecked node
            runningNodeNext = runningNodeNext->next;//Move running node leader up the list.     
        }
    }
}

int main(void) {
    struct node *t1 = NULL;
    struct node *t2 = NULL;
    struct node *t4 = NULL;
    struct node *t5 = NULL;
    push(&t1, 1);
    push(&t1, 1);
    push(&t1, 1);
    push(&t1, 1);
    push(&t2, 6);
    push(&t2, 1);
    push(&t2, 2);
    push(&t2, 3);
    push(&t2, 4);
    push(&t2, 5);
    push(&t2, 6);
    push(&t4, 4);
    push(&t4, 2);
    push(&t4, 3);
    push(&t4, 2);
    push(&t4, 1);
    push(&t5, 0);
    push(&t5, 0);
    push(&t5, 1);
    printf("Testing removeDuplicates()...\n");
    printf("Case 1: Calling removeDuplicates({1,0,0}).\n");
    printf("Expected result: { 1 0 }\n");
    printf("Actual result:   { ");
    removeDuplicates(&t5);
    ptr = t5;
    while (ptr != NULL) { 
        printf("%d ", ptr->data);
        ptr = ptr->next;
    }
    printf("}\n");
    printf("Case 2: Calling removeDuplicates({1,2,3,2,4}).\n");
    printf("Expected result: { 1 2 3 4 }\n");
    printf("Actual result:   { ");
    removeDuplicates(&t4);
    ptr = t4;
    while (ptr != NULL) {
        printf("%d ", ptr->data);
        ptr = ptr->next;
    }
    printf("}\n");
    printf("Case 3: Calling removeDuplicates({6,5,4,3,2,1,6}).\n");
    printf("Expected result: { 6 5 4 3 2 1 }\n");
    printf("Actual result:   { ");
    removeDuplicates(&t2);
    ptr = t2;
    while (ptr != NULL) {
        printf("%d ", ptr->data);
        ptr = ptr->next;
    }
    printf("}\n");
    printf("Case 4: Calling removeDuplicates({1,1,1,1}).\n");
    printf("Expected result: { 1 }\n");
    printf("Actual result:   { ");
    removeDuplicates(&t1);
    ptr = t1;
    while (ptr != NULL) {
        printf("%d ", ptr->data);
        ptr = ptr->next;
    }
    printf("}\n");
}

I've also included a picture that describes the logic if it is unclear what I'm doing: http://imgur.com/DbnBOq2

11
  • 1
    What's wrong with helper functions? Commented Mar 22, 2017 at 13:28
  • 1
    Why not modify push to not insert a duplicate? Commented Mar 22, 2017 at 13:32
  • you can use hashtables for checking of duplicates... then you don't need to check the whole list Commented Mar 22, 2017 at 13:34
  • 2
    You have too many variables. You only need to compare two nodes, and (maybe) delete one of them (and update the pointer that points to the deleted node) Wihout sorting (or hashtables) this will have quadratic behavior. Commented Mar 22, 2017 at 13:57
  • @StoryTeller because I do not want to assume that a list is created without duplicates. I only included push() so that a Stack member could create a list easily to verify the code. Commented Mar 22, 2017 at 13:57

5 Answers 5

2
/* Program to remove duplicates in an unsorted linked list */

#include <bits/stdc++.h>
using namespace std;

/* A linked list node */
struct Node
{
    int data;
    struct Node *next;
};

// Utility function to create a new Node
struct Node *newNode(int data)
{
   Node *temp = new Node;
   temp->data = data;
   temp->next = NULL;
   return temp;
}

/* Function to remove duplicates from a
   unsorted linked list */
void removeDuplicates(struct Node *start)
{
    struct Node *ptr1, *ptr2, *dup;
    ptr1 = start;

    /* Pick elements one by one */
    while (ptr1 != NULL && ptr1->next != NULL)
    {
        ptr2 = ptr1;

        /* Compare the picked element with rest
           of the elements */
        while (ptr2->next != NULL)
        {
            /* If duplicate then delete it */
            if (ptr1->data == ptr2->next->data)
            {
                /* sequence of steps is important here */
                dup = ptr2->next;
                ptr2->next = ptr2->next->next;
                delete(dup);
            }
            else /* This is tricky */
                ptr2 = ptr2->next;
        }
        ptr1 = ptr1->next;
    }
}

/* Function to print nodes in a given linked list */
void printList(struct Node *node)
{
    while (node != NULL)
    {
        printf("%d ", node->data);
        node = node->next;
    }
}

/* Druver program to test above function */
int main()
{
    /* The constructed linked list is:
     10->12->11->11->12->11->10*/
    struct Node *start = newNode(10);
    start->next = newNode(12);
    start->next->next = newNode(11);
    start->next->next->next = newNode(11);
    start->next->next->next->next = newNode(12);
    start->next->next->next->next->next =
                                    newNode(11);
    start->next->next->next->next->next->next =
                                    newNode(10);

    printf("Linked list before removing duplicates ");
    printList(start);

    removeDuplicates(start);

    printf("\nLinked list after removing duplicates ");
    printList(start);

    return 0;
}

Reference: geeksforgeeks

Sign up to request clarification or add additional context in comments.

Comments

0
void removeDuplicates(struct node **head){
struct node *tmp;

while (*head) {
        /* Look below *head, to see if it has any duplicates */
        for (tmp= (*head)->next; tmp; tmp = tmp->next) {
                if (tmp->data == (*head)->data) break;
                }
                /* Found no duplicate: advance head */
        if(!tmp) { head = &(*head)->next; continue;}
                /* Duplicate found :: delete *head */
        tmp = (*head)->next;
        free(*head);
        *head = tmp;
        }
return;
}

Now, check the corner cases:

  • if *head is NULL, the outer loop is never executed: nothing happens
  • if (*head)->next is NULL, the inner loop is never executed, so tmp is still NULL after the inner loop
  • if a duplicate is found *head is replaced by its ->next pointer (which could be NULL)
  • if no duplicate is found, *head is just advanced to its ->next pointer(which could be NULL)

Comments

0

This sort of question is asked a lot in various variations. Usually when one is implementing a linked-list, one eventually comes to a point where one needs to keep way too many pointers to various elements. On the face of it, it may seem that a single redirection is easier to work with, when in fact it doesn't convey nearly enough information about the list to preform the operation locally.

Here's the function re-written (but not fully tested) to make use of the "link" abstraction (which is essentially a struct node**):

void removeDuplicates(struct node** head) {
  if(!head)
    return;

  struct node **link = head;

  // We will iterate over the links. I.e. the `next` pointers in the list.
  while(*link) {
    struct node **rest = &((*link)->next);
    while(*rest) {
      if ((*link)->data != (*rest)->data) {
        rest = &((*rest)->next); // move to the next link
      } else {
        // modify the current link of rest to look one past the next
        struct node *to_remove = *rest;
        *rest = to_remove->next;
        free(to_remove);
      }
    }
    link = &((*link)->next); // again, move the the next link
  }
}

By using another level of indirection, it's possible to guarantee that the iterator we use to traverse the list isn't invalidated at any point. There is no way (barring mistakes in writing it) that the loop above can invalidate *link, and therefore there's no need to check before the assignment link = &((*link)->next);

Comments

0
#include <stdio.h>
#include <stdlib.h>
struct node
{
int data;
struct node *next;
};
struct node *createNode(int data)
 {
struct node *n;
n = (struct node *)malloc(sizeof(struct node));
n->data = data;
n->next = NULL;
return n;
}
void traverseList(struct node *head)
{
while (head != NULL)
{
    printf("%d ", head->data);
    head = head->next;
}
}
 void removeDuplicates(struct node *head)
 {
struct node *ptr = head;
while (ptr != NULL)
{
    struct node *runner = ptr;
    while (runner->next != NULL)
    {
        if (ptr->data == runner->next->data)
        {
            runner->next = runner->next->next;
        }
        else
        {
            runner = runner->next;
        }
    }
    free(runner->next);
     ptr = ptr->next;
 }
 }
 int main()
 {
struct node *node1, *node2, *node3, *node4;
node1 = createNode(2);
node2 = createNode(5);
node3 = createNode(5);
node4 = createNode(1);
node1->next = node2;
node2->next = node3;
node3->next = node4;
node4->next = NULL;
traverseList(node1);
removeDuplicates(node1);
printf("\n");
traverseList(node1);

return 0;
}

1 Comment

While your answer may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply. - From Review
-3

Special thanks to @StoryTeller -- I have not verified your solution but your comment about having too many pointers was definitely key. I have re-written my function and it now works for 4 different and special cases (duplicates at the end of list, at beginning of list, randomly throughout list, a list consisting purely of duplicates). Here is the correct code:

void removeDuplicates(struct node** head){
    struct node* currentNode = *head;
    struct node* runningNode = (*head)->next;
    struct node* runningNodePrev = *head;
    int count = -1;
    while(currentNode->next != NULL){
        count++;
        if(count){
            currentNode = currentNode->next;
            runningNodePrev = currentNode;
            runningNode = currentNode->next;
        }
        while(runningNode != NULL){
            if(runningNode->data == currentNode->data){
                runningNodePrev->next = runningNode->next;
                free(runningNode);
                runningNode = runningNodePrev->next;
            }else{
                runningNodePrev = runningNodePrev->next;
                runningNode = runningNode->next;
            }
        }
    }
}

Cheers and thank you to all who commented.

3 Comments

I was curious and ran my function through your test cases. You should know that in the code you posted you do not push 1 into t2 despite expecting it to be there.
Thanks, @StoryTeller. I noticed this to upon testing my code. I have fixed the push() calls with t2 to accurately depict the relevant test case.
struct node* runningNode = (*head)->next; :: *head could be NULL here; dereferencing would step onto a NULL pointer. BTW: if you define things correctly , there are no special cases

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.