Java - Find the difference between two arrays with duplicates

Question

I've implemented a method for finding the difference between two unsorted arrays. Currently I have achieved getting the diff without duplicates. But how to make it take duplicates into consideration as well?
For example for the below input arrays I expect the output [4 5 3]:

int[] arr1 = {1, 2, 3, 4, 5, 5};
int[] arr2 = {1, 2, 3, 5, 3};

For these input arrays I expect [7 7 9]

int[] arr3 = {7, 7, 4, 9, 6};
int[] arr4 = {4, 6};

//

static ArrayList<Integer> findDifference(int[] a, int[] b) {
    ArrayList<Integer> arr1 = new ArrayList<Integer>() {
        { for (int i : a) add(i); }
    };
    ArrayList<Integer> arr2 = new ArrayList<Integer>() {
        { for (int i : b) add(i); }
    };

    if (arr1.size() > arr2.size()) {
        arr1.removeAll(arr2);
        return arr1;
    } else {
        arr2.removeAll(arr1);
        return arr2;
    }
}

What do you mean by difference ? Can you provide more examples? — BlackPearl
– BlackPearl, Commented Jan 20, 2020 at 14:29
@BlackPearl from the description it looks like they're talking about a concept similar to relative complement, but as if a set can have duplicates which are not considered actual duplicates. — Federico klez Culloca
– Federico klez Culloca, Commented Jan 20, 2020 at 14:31
If your output returns 4 (in arr1 and not in arr2) and 3 (in arr2 and not in arr1) - then you need a combination of both and removeAll() and a simple return of one array wont work -- as you are missing the data from the other — Alan Kavanagh
– Alan Kavanagh, Commented Jan 20, 2020 at 14:32
@BlackPearl in this case the difference are the numbers, without which the two arrays will contain the same elements. — samba
– samba, Commented Jan 20, 2020 at 14:33
I thought about using a HashMap for storing an integer value as a key and its count as a value. But no idea how I should use it afterwards — samba
– samba, Commented Jan 20, 2020 at 14:35

Konstantin Yovkov · Accepted Answer · 2020-01-20 14:59:47Z

You can keep a count for each value in the first array. You can use a HashMap to hold how many occurences you have for a specific value.

Then, for each value in the second array, you can decrease the already calculated count for this value. In the end, if the count for a specific value is 0, this would mean that there were the same number of appearances in the both arrays. Otherwise, one of the arrays contained more occurences of value. The number of differences for a specific value would be abs(count[value]) (as it can get negative, in the case when the second arrays contains more occurences of value than the first array).

This Java code illustrates the approach:

public List<Integer> findDiff(int[] first, int[] second) {
  Map<Integer, Integer> count = new HashMap<>();
  for (int value : first) {
    int current = count.getOrDefault(value, 0);
    count.put(value, current + 1);
  }
  for (int value : second) {
    int current = count.getOrDefault(value, 0);
    count.put(value, current - 1);
  }
  List<Integer> result = new ArrayList<>();
  for (Map.Entry<Integer, Integer> entry : count.getEntrySet()) {
    int diff = entry.getValue();
    int times = Math.abs(diff);
    for (int i = 0; i < times; i++) {
      result.add(entry.getKey());
    }
  }
  return result;
}

Clearly, we have a linear complexity for both time and memory.

BeUndead · Accepted Answer · 2020-01-20 15:18:21Z

Almost certainly not an optimal solution, but as something you can hopefully work with:

private static <X> Collection<X> findDiff(final Collection<X> a, final Collection<X> b) {
    // Copy the Collections so you don't modify inputs
    // and so you can safely 'remove' from them.
    final List<X> aCopy = new ArrayList<>(a);
    final List<X> bCopy = new ArrayList<>(b);

    // Remove all common elements from the copies
    // Using 'removeAll' will pull out duplicates,
    // so do this one-by-one.
    for (final X bElement : b) {
        aCopy.remove(bElement);
    }
    // Note it's important to iterate over 'a' here, not
    // aCopy since the elements of aCopy (may) have had some
    // entries 'remove'd.
    for (final X aElement : a) {
        bCopy.remove(aElement);
    }

    // Combine the two cleared out lists to find
    // the cumulative difference.
    final List<X> diff = new ArrayList<>(aCopy);
    diff.addAll(bCopy);

    return Collections.unmodifiableCollection(diff);
}

Note, you can convert your int[] to a Collection<Integer> using something simple like:

IntStream.of(arr).boxed().collect(Collectors.toList());

Note also: you can do this with fewer intermediate Collections. You only need to copy one of the input ones if you don't mind modifying the inputs. And you don't need to combine the two into a new diff one. This was just something to work with (and more explanatory).

vkourt · Accepted Answer · 2020-01-20 14:40:01Z

0

If you want an absolute difference between the two arrays (in this case the only different element being 4) you could calculate the union and the intersection of the two sets.

To exclude duplicates you could also use a Set instead of a List to guarantee uniqueness. A very simple sample could be the following:

    public static void main(String... args) {
        Integer[] arr1 = {1, 2, 3, 4, 5, 5};
        Integer[] arr2 = {1, 2, 3, 5, 3};

        Set<Integer> diffs = findDiff(arr1, arr2);
        diffs.forEach(System.out::println);
    }

    public static Set<Integer> findDiff(Integer[] array1, Integer[] array2) {
        List<Integer> list1 = Arrays.asList(array1);
        List<Integer> list2 = Arrays.asList(array2);
        Set<Integer> union = new HashSet<>(list1);
        union.addAll(list2);
        Set<Integer> intersection = new HashSet<>(list1);
        intersection.retainAll(list2);
        union.removeAll(intersection);
        return union;
    }

answered Jan 20, 2020 at 14:40

vkourt

5,6332 gold badges36 silver badges71 bronze badges

1 Comment

samba Over a year ago

Thank you. But the expected output here should be [4 5 3]

kevin ternet · Accepted Answer · 2020-01-20 16:32:06Z

Here is a solution that works with both exemple :

public static void main(String[] args) {
    int[] arr1 = {1, 2, 3, 4, 5, 5};
    int[] arr2 = {1, 2, 3, 5, 3};
    System.out.println(findDifference(arr1, arr2));
    int[] arr3 = {7, 7, 4, 9, 6};
    int[] arr4 = {4, 6};
    System.out.println(findDifference(arr3, arr4));
}
static ArrayList<Integer> findDifference(int[] a, int[] b) {
    ArrayList<Integer> list1 = new ArrayList<Integer>();
    Arrays.stream(a).forEach(e -> list1.add(e));
    ArrayList<Integer> list2 = new ArrayList<Integer>();
    Arrays.stream(b).forEach(e -> list2.add(e));

    ArrayList<Integer> list1Copy = new ArrayList<Integer>();
    ArrayList<Integer> list2Copy = new ArrayList<Integer>();
    list1Copy.addAll(list1);
    list2Copy.addAll(list2);

    list1.forEach(e -> list2Copy.remove(e));
    list2.forEach(e -> list1Copy.remove(e));
    list1Copy.addAll(list2Copy);
    return list1Copy;
}

output :

[4, 5, 3] [7, 7, 9]

The principle is to process removing operation on copy to be abble to iterate again on initial list

Collectives™ on Stack Overflow

Java - Find the difference between two arrays with duplicates

4 Answers 4

Comments

Comments

1 Comment

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Related