Most elegant/efficient/Pythonic way to calculate multiple parallel arrays?

Question

For some econometric work, I often need to derive multiple parallel arrays of calculated variables from a (potentially) large number of parallel data arrays.

In the following example, I have two input arrays and two output arrays, but in the real world there could be anywhere from 5 to 10 input and output arrays.

w, x are inputs
y, z are outputs

Method A:

w = [1, -2, 5]
x = [0, 3, 2]
N = len(w)
I = range(N)
y = list(map(lambda i: w[i] + x[i], I))  # list() is needed on Python 3, where map is lazy
z = list(map(lambda i: w[i] - x[i], I))

Method B:

w = [1, -2, 5]
x = [0, 3, 2]
N = len(w)
I = range(N)
y, z = [], []
for i in I:
  y.append(w[i] + x[i])
  z.append(w[i] - x[i])

Method C:

w = [1, -2, 5]
x = [0, 3, 2]
y, z = [], []
for w_i, x_i in zip(w, x):
  y.append(w_i + x_i)
  z.append(w_i - x_i)

Method D:

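from numpy import transpose  # assumption: the undefined transpose is numpy's

w = [1, -2, 5]
x = [0, 3, 2]
N = len(w)
I = range(N)
# list() is needed on Python 3, where map returns a lazy iterator
(y, z) = transpose(list(map(lambda i: [w[i] + x[i], w[i] - x[i]], I)))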

D seems to be the most concise, extensible, and efficient. But it's also the most difficult to read, especially with many variables and complicated formulae.

A is my favorite despite a little duplication, but is it less efficient to construct one loop per variable? Will this fail to scale with large data?

B vs. C: I know C is more Pythonic, but B seems more convenient and concise, and scales better with more variables. In both cases, I hate the extra line where I have to declare the output variables up front.

Overall, I am not perfectly satisfied with any of the above approaches. Is there something missing from my reasoning or is there a better method out there?
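
One pure-Python variant not listed above, offered only as a sketch: build one tuple of outputs per index in a single comprehension over zip(w, x), then unzip with zip(*rows). It avoids both the index lookups of A/B/D and the up-front declarations of B/C. The name rows is introduced here for illustration, and the unpacking assumes the inputs are non-empty.

```python
w = [1, -2, 5]
x = [0, 3, 2]

# One pass over the paired inputs; each row holds every output for one index.
rows = [(w_i + x_i, w_i - x_i) for w_i, x_i in zip(w, x)]

# zip(*rows) transposes rows into columns; convert each column to a list.
# Note: this unpacking raises ValueError if w and x are empty.
y, z = (list(col) for col in zip(*rows))

print(y)  # [1, 1, 7]
print(z)  # [1, -5, 3]
```

Adding another output means adding one more expression to the tuple and one more name on the left-hand side.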

Have you considered using numpy? Most scientific computing in Python is done in numpy. In numpy, this would just be y = w + x; z = w - x.
– senshin, Commented Jan 30, 2015 at 1:03
@PadraicCunningham I mean, does it matter? They both do the same thing.
– senshin, Commented Jan 30, 2015 at 1:09
@PadraicCunningham Come on, you know I used a semicolon because you can't have newlines in comments.
– senshin, Commented Jan 30, 2015 at 1:11

Joran Beasley · Accepted Answer · 2015-01-30 01:02:57Z

2

Use numpy. It performs the operations in compiled C code, so it's much faster, especially if we assume your arrays are much bigger than 3 items.

import numpy

w = numpy.array([1, -2, 5])
x = numpy.array([0, 3, 2])

y = w + x  # element-wise sum
z = w - x  # element-wise difference
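
A slightly fuller sketch of the same idea, assuming the data starts out as plain Python lists. The third output s is a made-up example (not from the question) to show that a conditional formula also stays a single line per output.

```python
import numpy as np

# Convert the input lists to arrays once.
w = np.asarray([1, -2, 5])
x = np.asarray([0, 3, 2])

# Each output is one vectorized expression: no index loop, no pre-declared lists.
y = w + x
z = w - x

# Hypothetical extra output: the larger of w and x at each position.
s = np.where(w > x, w, x)

print(y.tolist())  # [1, 1, 7]
print(z.tolist())  # [1, -5, 3]
print(s.tolist())  # [1, 3, 5]
```

Calling .tolist() converts a result back to a plain list if later code expects one.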


1 Comment

cas5nq Over a year ago

Thanks. This is a winner. So simple and elegant.

adrianX · Answer · 2015-01-30 01:12:03Z

0

I think @Beasley's suggestion works well, and I suggest using multiprocessing on top of it so that the outputs are generated in parallel. Your computation seems perfectly parallelizable!

What I can offer can't beat the tips discussed here: Does python support multiprocessor/multicore programming?
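
A minimal sketch of what that could look like with the standard library's multiprocessing.Pool, using plain lists for simplicity. The helper names compute_chunk, split, and parallel_outputs, and the chunk size and worker count, are all assumptions made for illustration.

```python
from multiprocessing import Pool


def compute_chunk(chunk):
    """Compute (y, z) for one chunk; chunk is a (w_part, x_part) pair of lists."""
    w_part, x_part = chunk
    y_part = [a + b for a, b in zip(w_part, x_part)]
    z_part = [a - b for a, b in zip(w_part, x_part)]
    return y_part, z_part


def split(w, x, size):
    """Split the parallel inputs into aligned chunks of at most `size` items."""
    return [(w[i:i + size], x[i:i + size]) for i in range(0, len(w), size)]


def parallel_outputs(w, x, size=2, workers=2):
    """Hypothetical helper: run compute_chunk on each chunk in a worker process."""
    with Pool(workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = pool.map(compute_chunk, split(w, x, size))
    y = [v for y_part, _ in results for v in y_part]
    z = [v for _, z_part in results for v in z_part]
    return y, z


if __name__ == "__main__":
    y, z = parallel_outputs([1, -2, 5], [0, 3, 2])
    print(y, z)  # [1, 1, 7] [1, -5, 3]
```

Each chunk has to be pickled and sent to a worker process, so for arithmetic this cheap the overhead may well outweigh the gain. It is worth timing against the plain numpy version before adopting it.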
