Java vs. Python specific code-fragment performance improvement

Question

I have a question regarding performance of a specific piece of code in Java and Python.

Algorithm:
I am generating random N-dimensional points and then for all points under a certain distance threshold of each other I do some processing. The processing itself is not important here as it does not impact total execution time. Generating the points also takes a fraction of a second in both cases, so I am only interested in the part that does the comparing.

Execution times:
For a fixed input of 3000 points and 2 dimensions, Java does this in 2 to 4 seconds while Python takes anywhere between 15 and 200 seconds.

I am a little skeptical about the Python execution times. Is there anything I'm missing in this Python code? Are there any algorithmic improvement suggestions (e.g. pre-allocating/reusing memory, a way of lowering Big-Oh complexity, etc.)?

Java

double random_points[][] = new double[number_of_points][dimensions];
for(i = 0; i < number_of_points; i++)
  for(d = 0; d < dimensions; d++)
    random_points[i][d] = Math.random();

double p1[], p2[];
for(i = 0; i < number_of_points; i++)
{
  p1 = random_points[i];
  for(j = i + 1; j < number_of_points; j++)
  {
    p2 = random_points[j];

    double sum_of_squares = 0;
    for(d = 0; d < DIM_; d++)
      sum_of_squares += (p2[d] - p1[d]) * (p2[d] - p1[d]);

    double distance = Math.sqrt(ss);
    if(distance > SOME_THRESHOLD) continue;

    //...else do something with p1 and p2

  }
}

Python 3.2

random_points = [[random.random() for _d in range(0,dimensions)] for _n in range(0,number_of_points)]

for i, p1 in enumerate(random_points):
  for j, p2 in enumerate(random_points[i+1:]):
    distance = math.sqrt(sum([(p1[d]-p2[d])**2 for d in range(0,dimensions)]))
    if distance > SOME_THRESHOLD: continue

    #...else do something with p1 and p2

I am surprised it takes 2-4 second on Java. I would expect much lower once the JVM warms up. Running this with 3000 points and 2 dimensions it take about 40 micro-second on my machine. — Peter Lawrey
– Peter Lawrey, Commented Apr 18, 2011 at 15:45
You're wasting quite some memory by using range over xrange and slicing (which creates shallow copies) instead of using itertools.islice. — user395760
– user395760, Commented Apr 18, 2011 at 16:02
Thanks for the input guys. The question is mostly about Python and whether the execution time could be improved to be in Java's range. I am aware of the range vs xrange, but I forgot to mention I'm running Python 3.2, so that makes it a non-issue and I'll update my post with that in mind. @Mike - definitely a nice general practice tip. — Vuk
– Vuk, Commented Apr 18, 2011 at 17:52

Glorfindel · Accepted Answer · 2023-02-06 19:00:29Z

5

You might want to consider using numpy.

I've just tried the following:

import numpy
from scipy.spatial.distance import pdist
D=2
N=3000
p=numpy.random.uniform(size=(N,D))
dist=pdist(p, 'euclidean')

The last line computes the distance matrix (this is equivalent to computing distance in your code for every pair of points). On my computer it takes about 0.07s.

The main disadvantage of this method is that it requires O(n^2) memory for the distance matrix. If that's a problem, the following may be a better alternative:

for i in xrange(1, N):
  v = p[:N-i] - p[i:]
  dist = numpy.sqrt(numpy.sum(numpy.square(v), axis=1))
  for j in numpy.nonzero(dist > 1.4)[0]:
    print j, i+j

For N=3000, this takes ~0.33s on my computer.

edited Feb 6, 2023 at 19:00

Glorfindel

22.8k13 gold badges97 silver badges124 bronze badges

answered Apr 18, 2011 at 15:42

NPE

503k114 gold badges970 silver badges1k bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Vuk Over a year ago

That definitely reduced running time. Cheers.

Peter Lawrey · Accepted Answer · 2011-04-18 16:21:09Z

If I take 30K points and 5 dimensions, which is 100x more work.

int number_of_points = 30000;
int dimensions = 5;
double SOME_THRESHOLD = 0.1;

long start = System.nanoTime();
double random_points[][] = new double[number_of_points][dimensions];
for (int i = 0; i < number_of_points; i++)
    for (int d = 0; d < dimensions; d++)
        random_points[i][d] = Math.random();

double p1[], p2[];
Comparator<double[]> compareX = new Comparator<double[]>() {
    @Override
    public int compare(double[] o1, double[] o2) {
        return Double.compare(o1[0], o2[0]);
    }
};
Arrays.sort(random_points, compareX);

double[] key = new double[dimensions];
int count = 0;
for (int i = 0; i < number_of_points; i++) {
    p1 = random_points[i];
    key[0] = p1[0] + SOME_THRESHOLD;
    int index = Arrays.binarySearch(random_points, key, compareX);
    if (index < 0) index = ~index;
    NEXT: for (int j = i + 1; j < index; j++) {
        p2 = random_points[j];

        double sum_of_squares = 0;
        for (int d = 0; d < dimensions; d++) {
            sum_of_squares += (p2[d] - p1[d]) * (p2[d] - p1[d]);
            if (sum_of_squares > SOME_THRESHOLD * SOME_THRESHOLD) 
                continue NEXT;
        }

        //...else do something with p1 and p2
        count++;
    }
}
long time = System.nanoTime() - start;
System.out.println("Took " + time / 1000 / 1000 + " ms. count= " + count);

Prints

Took 1549 ms. count= 20197

rlibby · Accepted Answer · 2011-04-18 16:35:31Z

1

Does speed really matter? Here are a couple obvious speed-ups:

Don't compute square root. Just square your threshold and compare against the squared threshold.
Sort your points along one dimension (the outer one in your loop). When two points i and j are farther than than your threshold on just this dimension, then further incrementing j will only produce points farther than this threshold, and you can continue the outer loop.

There may be other algorithmic speed ups, even the above is still O(n^d).

answered Apr 18, 2011 at 16:35

rlibby

6,05122 silver badges25 bronze badges

Collectives™ on Stack Overflow

Java vs. Python specific code-fragment performance improvement

3 Answers 3

1 Comment

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

1 Comment

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related