4

I'm currently looking through two very large lists of Peak Objects, by overriding the equals method and looping through the two lists, comparing every peak to every other peak. Is there a more efficient way of doing this? My lists can be ~10,000 elements, which means up to 10000 * 10000 comparisons.

The code for my peak object:

public class Peak extends Object{

private final SimpleIntegerProperty peakStart;
private final SimpleIntegerProperty peakEnd;
private final SimpleIntegerProperty peakMaxima;
private final SimpleIntegerProperty peakHeight;
private final SimpleIntegerProperty peakWidth;
private final SimpleStringProperty rname;

public Peak(int peakStart, int peakEnd, int peakMaxima, int peakHeight, String rname) {
    this.peakStart = new SimpleIntegerProperty(peakStart);
    this.peakEnd = new SimpleIntegerProperty(peakEnd);
    this.peakMaxima = new SimpleIntegerProperty(peakMaxima);
    this.peakHeight = new SimpleIntegerProperty(peakHeight);
    this.peakWidth = new SimpleIntegerProperty(peakEnd - peakStart);
    this.rname = new SimpleStringProperty(rname);
}

public String getRname() {
    return rname.get();
}

public SimpleStringProperty rnameProperty() {
    return rname;
}

public int getPeakWidth() {
    return peakWidth.get();
}

public int getPeakHeight() {
    return peakHeight.get();
}

public int getPeakStart() {
    return peakStart.get();
}

public int getPeakEnd() {
    return peakEnd.get();
}

public int getPeakMaxima() {
    return peakMaxima.get();
}

@Override
public String toString() {
    return "Peak{" +
            "peakStart= " + peakStart.get() +
            ", peakEnd= " + peakEnd.get() +
            ", peakHeight= " + peakHeight.get() +
            ", rname= " + rname.get() +
            '}';
}

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;

    Peak peak = (Peak) o;

    if (!peakMaxima.equals(peak.peakMaxima)) return false;
    return rname.equals(peak.rname);
}

@Override
public int hashCode() {
    int result = peakMaxima.hashCode();
    result = 31 * result + rname.hashCode();
    return result;
}
}

And my loop for comparing the objects is here.

 List<Peak> interestingPeaks = new ArrayList<>();

            if(peakListOne != null && peakListTwo != null){
                for(Peak peak : peakListOne){
                    for(Peak peak2 : peakListTwo){
                        if(peak.equals(peak2)){ //number one, check the rnames match
                            if((peak2.getPeakHeight() / peak.getPeakHeight() >= 9) || (peak.getPeakHeight() / peak2.getPeakHeight() >= 9)){
                                    interestingPeaks.add(peak);
                            }
                        }
                    }
                }
            }

            return interestingPeaks;

The code is basically matching the position of the maxima, and the rname , which is just a String. Then appending the peak to the interestingPeaks list if the height of one is a factor of 9x larger than the other.

4
  • Did you try using hashes? Commented Apr 16, 2018 at 11:24
  • If you sort your lists, you can compare a lot smarter. Commented Apr 16, 2018 at 11:26
  • There are conditions that we don't know about that could make your algorithm simpler or more difficult. Can the lists be sorted or does the order matter? Are peak repetitions allowed within a single list (maybe it could be a HashSet)? There are many algorithms that could help you there but you have to see if the possible approaches match your particular case. Commented Apr 16, 2018 at 11:32
  • The lists are not sorted on input, but there's no reason why they couldn't be sorted as part of the method. Commented Apr 16, 2018 at 11:35

1 Answer 1

4

Appreciate that if the two lists were sorted by maxima and name, you could simply make a single linear pass down both lists, and compare items side by side. If the two lists were in fact completely equal, then you would never find a pair from the two lists which were not equal.

List<Peak> p1;
List<Peak> p2;

p1.sort((p1, p2) -> {
    int comp = Integer.compare(p1.getPeakMaxima(), p2.getPeakMaxima());
    return comp != 0 ? comp : p1.getRname().compareTo(p2.getRname());
});

// and also sort the second list

Now we can just walk down both lists and check for a comparison failure:

for (int i=0; i < p1.size(); ++i) {
    if (!p1.get(i).equals(p2.get(i))) {
        System.out.println("peaks are not equal");
        break;
    }
}

This reduces an O(N^2) operation to one which is O(N*lgN), which is the penalty for doing both sorts (the final walk down the list is O(N), and would be negligible with either approach).

Sign up to request clarification or add additional context in comments.

6 Comments

O(n^2) reduced to O(n). That's quite an optimisation - if and only if the lists are allowed to be reordered. The OP never mentioned that though, so while a good answer, we can only hope it'll be of use ;)
@mingos I answered incompletely. I am updating now.
I can absolutely sort the lists. This will work perfectly. thanks.
@Sam Did you want to flag every non matching peak, or did you just want to assert that all the peaks either are, or are not, the same? If the latter, you are done and my answer should work. It the former, my answer is a good start, but we need to do more work.
@TimBiegeleisen I only need to know where the peakMaximas and the rnames match, so your example will definitely work. Thanks again.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.