I am trying to compare the performance of Mathematica vs Python for vectorized operations involving polynomials. The data is floatMatrix, which has dimensions (750000, 4). The function testFunction[x,y,w,z] is a polynomial function of 4 variables that returns a 4D vector and is meant to be applied to all of the 750000 vectors. Both codes below contain the same polynomial functions written in explicit form (I have not included the full polynomials because they are long; they are exactly the same in both).
For Mathematica, I am using a listable Compile with parallelization.
testFunction = Compile[{{f, _Real, 1}},
  {
   {0.011904761904761973` f[[2]] f[[1]]^3 +
     0.002976190476190474` f[[1]] f[[2]]^3 - 0.020833333333333325` f[[3]] +
     0.002976190476190474` f[[3]]^3 +
     f[[2]]^2 (0.0029761904761904778` f[[3]] + ...
   {0.002976190476190483` f[[1]]^3 + 0.011904761904761906` f[[2]]^3 -
     0.0875` f[[3]] + 0.0029761904761904765` f[[3]]^3 +
     f[[1]]^2 (0.005952380952380952` f[[2]] + ...
  }, CompilationTarget -> "C", RuntimeAttributes -> {Listable},
  Parallelization -> True];
time = RepeatedTiming[testFunction[floatMatrix]];
Print["In Mathematica-C it takes an average of ", time[[1]], " secs."]
For Python, I am using NumPy:
import numpy as np

def testFunction(data):
    f1, f2, f3, f4 = data.T
    results = np.zeros((data.shape[0], 4))  # Preallocate the results array
    results[:, 0] = (0.011904761904761973*f2*f1**3 + 0.002976190476190474*f1*f2**3 -
                     0.020833333333333325*f3 + 0.002976190476190474*f3**3 + f2**2*
                     (0.0029761904761904778*f3 + ...
    results[:, 1] = (0.002976190476190483*f1**3 + 0.011904761904761906*f2**3 - 0.0875*f3
                     + 0.0029761904761904765*f3**3 + f1**2*(0.005952380952380952*f2 +
                     0.002976190476190469*f3 + 0.0029761904761904726*f4) + ...
    return results
import time

duration = 0
for i in range(10):
    start_time = time.time()
    testFunction(floatMatrix)
    end_time = time.time()
    duration += end_time - start_time
duration = duration * 0.1
print(f"With NumPy it takes an average of {duration} seconds")
As you can see, it is a simple, straightforward comparison. On my machine, Mathematica gets 0.1 secs and Python 0.4 secs (also on Google Colab). People often talk about NumPy as being incredibly fast, so this makes me think that either I am doing something wrong or those people don't know how to exploit Mathematica's parallelization and packed arrays.
Which one is it? Am I using the tools incorrectly?
EDIT: After a few suggestions by azerbajdzan, I tried running the codes on a multi-core CPU, this time a 10-core one. These are the results:
- Mathematica with Parallelization -> True: 0.0319974 seconds
- Mathematica with Parallelization -> False: 0.119938 seconds
- NumPy: 0.206754 seconds
So, the question still stands.
UPDATE: I have applied @azerbajdzan's suggestions and removed the transposing from inside of the function. I have created a continuation question here. The Python code without transposing is still slower than Mathematica's. Both are run on a single core and produce the same output.
With Parallelization -> True you tell the compiler that you want the code run on all processors. If you have a 4-core processor, then the Mathematica code runs on all 4 of them. Your Python code runs on only one core. But there are methods to parallelize in Python as well; just search for "python parallelize". Remove the parameter Parallelization -> True and then compare the times, or parallelize the Python code too.