Multidimensional array manipulation - Java

Question

I have a series of arrays that i draw data from and ultimately lead to a final array that holds the information i want. The final array is 2-dimensional, being comprised of a large number of single dimensional arrays, each holding up to 3 entries.

int[][] realOcc = new int[result.length][3];

The way the array holds data is as follows: The first "cell" holds a name, the second a region ID and the third an occurence number - an int telling me how many times this name came up in this particular region ID.

After sorting the array according to name using a Bubble Sort algorithm i naturally see many entries that i wouldn't want to be there. As an example, imagine a name coming up 3 times in a specific region ID. The way the array entries for this name would look like would be as follows:

Name1 regionID17 1
Name1 regionID17 2
Name1 regionID17 3
...
Name156 regionID1 1
Name168 regionID99 1
...

What i want to do is get rid of all the excess entries as in, the entries that correspond to the same name and regID, and only keep the maximum occurence number for each name in a specific region. So, taking the above example, what i would like to see after manipulating the array would be:

Name1 regionID17 3
...
Name156 regionID1 1
Name168 regionID99 1
...

Any ideas would be greatly appreciated since i'm pretty much stumped.. Keep in mind that since the data i'm pulling is pretty big in number, i also need to keep my code efficient.

dimo414 · Accepted Answer · 2015-05-12 03:54:06Z

2

I agree with Mario, you shouldn't be using an array structure here. The fact that you're using Bubble Sort suggests that you're in an intro programming course of some kind, so you may not know about ArrayLists, HashSets, the .equals() method, or the like, but that's what you really want to do. Create a custom object with a custom .equals() method - something like:

public class Record{
  String name;
  String region;

  public boolean equals(Object o){
    Record r = (Record)o;
    return name.equals(r.name) && region.equals(r.region);
  }

  public int hashCode(){
    return name.hashCode()+region.hashCode();
  }
}

Then you can use a HashMap<Record, Integer> to check if a record already exists in the set - if it does, increment count (the map's value) by 1, else add it.

If you want everything sorted in a certain order, you can either define a custom .compareTo() method and use a TreeMap or, if you want everything in insertion order, use a LinkedHashSet<Record> to preserve that order.

edited May 12, 2015 at 3:54

answered Jul 28, 2009 at 18:27

dimo414

49.3k19 gold badges167 silver badges275 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

jprete Over a year ago

Don't forget to implement hashCode or else this method will not work since the HashSet won't correctly find the already-inserted object. Also, for something like this I would use a HashMap<Record, Integer>; the Record has the name and region only (and is otherwise as in your answer) while the Integer is the count.

sp. Over a year ago

Thanks a lot for the answer, it pushed me to the right direction.

jjnguy · Accepted Answer · 2009-07-28 18:21:16Z

2

What you really should look into is using an ArrayList class to hold these 'items.'

You should also create a specific class to hold this data.

Your custom class should look something like this:

class Entry implements Comparable<Entry> {
    private String name, region;
    private int occuranceCount;

    public Entry(String nameP, regionP, occurCountP){
        name = nameP;
        region = regionP;
        occuranceCount = occurCountP;
    }

    // Getters

    public int compareTo(Entry other){
        return name.compareTo(other.name);
    }

    // Equals and hashcode
}

Then you can put these objects into an ArrayList<Entry> and use Collections.sort() which will be much faster than a bubble sort.

After they are sorted, you can loop through and remove duplicate entries using ArrayList.remove().

edited Jul 28, 2009 at 18:21

answered Jul 28, 2009 at 18:17

jjnguy

139k54 gold badges298 silver badges328 bronze badges

5 Comments

dimo414 Over a year ago

Why use an ArrayList and have a second iteration when you can use a hash data structure of some kind?

jjnguy Over a year ago

Well, perhaps he/she wants to have some sort of order to the data.

jjnguy Over a year ago

I know that a set could solve many of the problems, but it would also have no sense of order to the elements.

dimo414 Over a year ago

A TreeMap then would allow you to define a custom ordering. By the sound of the problem though, the sorting is only being done to identify duplicates and remove them.

jjnguy Over a year ago

You are probably right. I guess I was making the assumption that he would like to keep his code as similar to what it is now as possible.

Mario Ortegón · Accepted Answer · 2009-07-28 18:24:17Z

2

A question comes to mind: Why are you using an array? I would think that it is better to use a Set object to hold your results, and then create a Result object that has three fields: one for name, one for region and one for the count. If the equals and hash methods are overriden to only take into account the region and the name, then you will have no duplicates on your set and you can use it to keep track of these result objects.

Another way to achieve the same is to have a Map, where the key is the name + region and the value is the count. That would also make it simple to implement and ensure that you have no duplicates.

edited Jul 28, 2009 at 18:24

answered Jul 28, 2009 at 18:18

Mario Ortegón

18.9k18 gold badges74 silver badges81 bronze badges

Comments

rtenhove · Accepted Answer · 2009-07-28 18:20:59Z

0

Sounds like a Hashtable or Map might be useful. You could make a single pass through the raw data, and use the Map to lookup the name, add it if not seen yet, or check to see if it has been exceeded in value by the new entry if it does. This wouldn't require sorting ahead of time; you could sort afterwards, and save a lot of time :-)

answered Jul 28, 2009 at 18:20

rtenhove

1851 silver badge6 bronze badges

1 Comment

dimo414 Over a year ago

Though not technically deprecated, you don't want to use Hashtable if you can help it, HashMap is correct usage.

Collectives™ on Stack Overflow

Multidimensional array manipulation - Java

4 Answers 4

2 Comments

5 Comments

Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

2 Comments

5 Comments

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related