3

I had 2 integer arrays, one original and one modified w.r.t the original array. Elements can be added or removed from the original to convert it to the modified. My problem is, in the modified, I need to find out which elements are new, which elements are same, and which elements are not there w.r.t to the original array.

Given data:

arr1 = 3,2,1      //Original array
arr2 = 1,4,5      //Modified array by adding and/or removing elements

I need something like:

same = 1
removed = 2,3
added = 4,5

Obviously, I can write several nested for loops and find it out, but this would be too inefficient. I was wondering if there was be a better or efficient way to do it.. I am using Java. This page kind of address a similar problem, but not sure if I can use it to solve my problem.

Any help would be appreciated.

3
  • Are the integers guaranteed to be in order? Can there be duplicates within an array? Is there an upper bound on the integers? Commented Aug 19, 2011 at 17:53
  • sorry, i should have mentioned that. No the order is not guaranteed.. I will edit the post to make it clear. Commented Aug 19, 2011 at 17:56
  • is 1,2,3 the same as 3,2,1 ? Commented Aug 19, 2011 at 17:58

4 Answers 4

6

If memory is not a constraint, I'd recommend using Set for these kind of operations. Finding the things you require would be just a matter of calling the appropriate methods on the two Set objects. Of course, this assumes you'd have unique elements in your elements and if not, you are interested in only unique elements when it comes to reporting stuff. For e.g.

public static void testSet() {
    final Set<Integer> first = new HashSet<Integer>(Arrays.asList(1, 2, 3));
    final Set<Integer> second = new HashSet<Integer>(Arrays.asList(1, 4, 5));

    Set<Integer> result = new HashSet<Integer>(first);
    result.retainAll(second);
    System.out.println("Similar: " + result);

    result = new HashSet<Integer>(first);
    result.removeAll(second);
    System.out.println("Removed: " + result);

    result = new HashSet<Integer>(second);
    result.removeAll(first);
    System.out.println("Newly added: " + result);
}
/*
OUTPUT:

Similar: [1]
Removed: [2, 3]
Newly added: [4, 5]
*/
Sign up to request clarification or add additional context in comments.

3 Comments

What methods are you talking about? I've never heard of a Java library method that does this kind of comparison.
removeAll and retainAll are the methods doing this king of job.
Updated post with sample snippet.
2

You could go through each array once and obviously use a removal save iteration like a loop from the back of the array.

int[] same
for (int i = arr1.length; i >= 0; i--)
{
    if(arr2.contains(i))
        same.add(i)
        arr1.remove(i)
}
for (int i = arr2.length; i >= 0; i--)
{
    if(same.contains(i))
        arr2.remove(i)
}

Then arr1 will be the list of removed, arr2 will be added and same will be the same.

4 Comments

This is a simple solution. Thanks, I am also taking a look at the dynamic programming solution below.
Yes, this is the simple solution in terms of "space complexity". The contains check here is O(n) just FYI which is exactly the Set solution tries to deal with.
@Sanjay Agreed, its not the best but its simple.
Sorry if this seems like nitpick, but how exactly is this simple? This code has around 12 lines just for finding out same and other stuff. In around 16 lines I have all the 3 things with sample code! Also, it's really OK to go with this solution, IMHO the Set solution is something the OP must keep in mind when dealing with "large" collections when memory is not a constraint. Of course, YMMV :)
2

If there are no duplicates, and the maximum integer is constrained, and the members are moderately dense (say, 1% density or better), make them into BitSets. The "and" of the two sets are "same", A.andNot(B) are the ones in A only, and B.andNot(A) are the ones in B only. If the integers are moderately dense, this is very fast.

If the integers are sparse, sort each array and walk up them in tandem.

Comments

1

You are trying to calculate the Levenshtein distance between two arrays.

There is a simple dynamic programming solution to calculate this (taken from Wikipedia):

int LevenshteinDistance(char s[1..m], char t[1..n])
{
  // for all i and j, d[i,j] will hold the Levenshtein distance between
  // the first i characters of s and the first j characters of t;
  // note that d has (m+1)x(n+1) values
  declare int d[0..m, 0..n]

  for i from 0 to m
    d[i, 0] := i // the distance of any first string to an empty second string
  for j from 0 to n
    d[0, j] := j // the distance of any second string to an empty first string

  for j from 1 to n
  {
    for i from 1 to m
    {
      if s[i] = t[j] then  
        d[i, j] := d[i-1, j-1]       // no operation required
      else
        d[i, j] := minimum
                   (
                     d[i-1, j] + 1,  // a deletion
                     d[i, j-1] + 1,  // an insertion
                     d[i-1, j-1] + 1 // a substitution
                   )
    }
  }

  return d[m,n]
}

You can easily change this code to tell you what the individual operations are.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.