{Java - PriorityQueue} time complexity of this code

Question

Given an array containing N points, find the K closest points to the origin (0, 0) in the 2D plane. You can assume K is much smaller than N and N is very large.

E.g:
    given array: (1,0), (3,0), (2,0), K = 2 
        Result = (1,0), (2,0)  
(result should be in ascending order by distance)

Code:

import java.util.*;

class CPoint {
    double x;
    double y;
    public CPoint(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

public class KClosest {
    /**
     * @param myList the input points
     * @param k the number of closest points to return
     * @return the k closest points, in ascending order of distance to the origin
     */
    public static CPoint[] getKNearestPoints(CPoint[] myList, int k) {

        if (myList == null || myList.length == 0)  return new CPoint[]{};  // null check must come first
        if (k <= 0 || k > myList.length)  return new CPoint[]{};

        final CPoint o = new CPoint(0, 0); // origin point

        // use a Max-Heap of size k for maintaining K closest points
        PriorityQueue<CPoint> pq = new PriorityQueue<CPoint> (k, new Comparator<CPoint> () {
            @Override
            public int compare(CPoint a, CPoint b) {
                return Double.compare(distance(b, o), distance(a, o));  
            }
        });

        for (CPoint p : myList) {   // Line 33
            // Keep adding the distance value until heap is full. // Line 34
            pq.offer(p);            // Line 35
            // If it is full        // Line 36
            if (pq.size() > k) {    // Line 37
                // Then remove the first element having the largest distance in PQ.// Line 38
                pq.poll();          // Line 39  
            }  // Line 40
        }       
        CPoint[] res = new CPoint[k];
        // Then make a second pass to get k closest points into result. 
        while (!pq.isEmpty()) {     // Line 44
            res[--k] = pq.poll();   // Line 45                   
        }                           // Line 46

        return res;
    }

    private static double distance(CPoint a, CPoint b) {        
        return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y);
    }

}

Question:

What is the time complexity of line 35 and of line 39, each considered on its own?

What is the time complexity of lines 35 - 40 (as a whole)?

What is the time complexity of lines 44 - 46 (as a whole)?

What is the overall time complexity of the entire method getKNearestPoints(), in the best, worst and average case? What if n >> k? And what if we don't have n >> k?

These questions came up during a technical interview of mine, and I'm still confused about them. Any help is appreciated.

This reads a lot like homework. What exactly are you confused by? What do you think the answer is and why?
– Krease, Commented Oct 8, 2017 at 5:00
I answered the following during the interview: Q1: O(log K) and O(log K); Q2: O(log K) or O(log(K)^2); Q3: O(K log K); Q4: O(N log K + K log K), but is the overall result O(N log K)? I'm not sure.
– Peter, Commented Oct 8, 2017 at 5:03
I know that for a PQ, operations like add/offer and remove/poll take O(log K), while peek is O(1). But for these specific questions I'm really lost.
– Peter, Commented Oct 8, 2017 at 5:14

harmands · Accepted Answer · 2017-10-08 09:34:22Z

4

From the looks of it, whoever wrote this code should already know the answers to these questions.

Anyway, the PriorityQueue here behaves as a max-heap, because the comparator reverses the natural ordering of the distances.

So, complexities are as follows:

Line 35 - O(log k): the time to insert an element into the heap. Insertion is bottom-up: the new element is placed at the end of the heap and sifted up toward the root.

Line 37 - O(1): the time to check the size of the heap, which is generally maintained alongside the heap.

Line 39 - O(log k): the time to remove the head of the heap. The root is removed and the heap property is restored by heapifying (sifting down) from the root.

Lines 35-40: From the complexities above, one iteration costs O(log k). The loop runs once for each of the n elements, so the overall complexity is O(n log k).
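To make the O(n log k) claim for the loop more concrete, here is a minimal sketch that counts how many times the comparator is called while running the same offer/poll loop. The class `HeapCostDemo`, its methods and the constant 3 in `bound` are illustrative assumptions, not part of the original code; the bound assumes a binary heap, where an insert costs at most one comparison per level and a removal at most two.

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical demo class (not from the original post): counts comparator
// calls made by the offer/poll loop so they can be compared with n * log2(k).
class HeapCostDemo {
    static long comparisons = 0;

    // Runs the same "offer, then poll if size > k" loop on plain distances
    // and returns how many comparisons the heap performed.
    static long countComparisons(double[] dist, int k) {
        comparisons = 0;
        Comparator<Double> maxFirst = (a, b) -> {
            comparisons++;
            return Double.compare(b, a); // reversed: largest distance at the head
        };
        PriorityQueue<Double> pq = new PriorityQueue<>(k, maxFirst);
        for (double d : dist) {
            pq.offer(d);
            if (pq.size() > k) {
                pq.poll();
            }
        }
        return comparisons;
    }

    // Generous upper bound, assuming a binary heap of at most k + 1 elements:
    // at most 3 comparisons per level per iteration, plus one level of slack.
    static long bound(int n, int k) {
        int height = 31 - Integer.numberOfLeadingZeros(k + 1); // floor(log2(k + 1))
        return 3L * n * (height + 1);
    }

    public static void main(String[] args) {
        int n = 100_000, k = 10;
        double[] dist = new Random(42).doubles(n).toArray();
        System.out.println("comparisons = " + countComparisons(dist, k));
        System.out.println("bound       = " + bound(n, k));
    }
}
```

Doubling n should roughly double the count, while doubling k should add only about one more level of comparisons per iteration.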

Lines 44-46: Checking whether the heap is empty is again O(1), and each poll is O(log k). We poll k times, so the overall complexity of this loop is O(k log k).

The overall complexity is O(n log k + k log k), which simplifies to O(n log k) because the method only does any work when k <= n.
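Written out as a sum, the total cost can be sketched as follows. The symbol T(n, k) is introduced here only for illustration; the first term is the main loop (offer, size check, poll) and the second is the final drain of the heap.

```latex
\begin{aligned}
T(n, k) &= n \cdot \bigl( O(\log k) + O(1) + O(\log k) \bigr) + k \cdot O(\log k) \\
        &= O(2\, n \log k) + O(k \log k) \\
        &= O(n \log k) + O(k \log k) \\
        &= O(n \log k) \qquad \text{since } 1 \le k \le n \implies k \log k \le n \log k .
\end{aligned}
```

The last step does not need n >> k; it only needs k <= n, which the guard at the top of the method enforces.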

This is an awesome place to study this topic.

edited Oct 8, 2017 at 9:34

answered Oct 8, 2017 at 5:13

harmands


5 Comments

Peter Over a year ago

Hi! This answer is really helpful, but I'm still confused about lines 35-40. Once the PQ is full, pq.offer(p) and pq.poll() are executed together in each of the remaining (n - k) iterations. That should be O(log k) + O(log k), right? So why do we still consider it O(log k)?

harmands Over a year ago

OK, to put it mathematically: O(log k) + O(log k) = O(2 log k) = O(log(k^2)) = O(log k). They can be written in any of those ways, because constant factors are dropped in big-O notation.

Peter Over a year ago

That makes perfect sense! Thanks! Just one more question: why can we "drop" the time of the second pass that copies the k closest points into the result, O(k log k)? The overall complexity is given as O(n log k), not O(n log k) + O(k log k).

Peter Over a year ago

Oh! Is that because N >> K, so it can be dropped? I got it. Thanks so much!

harmands Over a year ago

Yes, kind of, that is the reason.
