I've recently started Cracking the Code Interview and I just got to LinkedLists. I did problem 2.1 to remove duplicate entries from the book's implementation of LinkedList.
When I time the removal of dupes, I see the book's implementation is faster than Java LinkedList.
I've implemented four dupe removal methods, each with a different parameter. The book's LinkedList implementation is referred to as Node.
static void removeDupes(LinkedList<Integer> list) {
HashSet<Integer> integerHashSet = new HashSet<>();
for (int i = 0; i < list.size(); i++) {
if (integerHashSet.contains(list.get(i))) {
list.remove(list.get(i));
} else {
integerHashSet.add(list.get(i));
}
}
}
static void removeDupes(ArrayList<Integer> list) {
HashSet<Integer> integerHashSet = new HashSet<>();
for (int i = 0; i < list.size(); i++) {
if (integerHashSet.contains(list.get(i))) {
list.remove(list.get(i));
} else {
integerHashSet.add(list.get(i));
}
}
}
static void removeDupes(Node currentNode) {
HashSet<Integer> integerHashSet = new HashSet<>();
while(currentNode != null) {
if (integerHashSet.contains(currentNode.data)) {
currentNode.prev.next = currentNode.next;
if (currentNode.next != null) {
currentNode.next.prev = currentNode.prev;
}
} else {
integerHashSet.add(currentNode.data);
}
currentNode = currentNode.next;
}
}
static void removeDupesNoBuffer(Node currentNode) {
while (currentNode != null) {
Node runnerNode = currentNode;
while(runnerNode.next != null) {
if (runnerNode.next.data == currentNode.data) {
runnerNode.next = runnerNode.next.next;
if (runnerNode.next != null) {
runnerNode.next.prev = runnerNode;
}
} else {
runnerNode = runnerNode.next;
}
}
currentNode = currentNode.next;
}
}
Book's LinkedList implementation:
public class Node {
Node prev;
Node next;
int data;
Node(int d) {
data = d;
}
void add(int d) {
Node newNode = new Node(d);
Node currentNode = this;
while (currentNode.next != null) {
currentNode = currentNode.next;
}
currentNode.next = newNode;
newNode.prev = currentNode;
}
}
Each List or Node, I populated with a length of 100000, where each odd index was 0 and everything else was unique.
My results were:
- Node: 18ms
- Node No Buffer: 5154ms
- LinkedList: 2734ms
- ArrayList: 1392ms
What am I not understanding?
EDIT:
When I swapped the remove dupes with the LinkedList parameter to an enhanced for loop as pointed out by the comments and solution, it became just as fast.
static void removeDupes(LinkedList<Integer> list) {
HashSet<Integer> integerHashSet = new HashSet<>();
for(Integer i : list) {
if (integerHashSet.contains(i)) {
list.remove(i);
} else {
integerHashSet.add(i);
}
}
}
list.get(i)is O(N). Always use node.next instead or the linked list iterator