Improving speed when dealing with big numbers and big shape of arrays in Python

Question

My task is to count the pairs (i, j) for which:

array_1[i] + array_1[j] > array_2[i] + array_2[j]

This is my code:

import numpy as np
import pandas as pd

n = 200000

series_1 = np.random.randint(low=1, high=1000, size=n)
series_1_T = series_1.reshape(n, 1)
series_2 = np.random.randint(low=1, high=1000, size=n)
series_2_T = series_2.reshape(n, 1)

def differ(x):
    # Compare a batch of 2000 rows (starting at x) against all n columns
    count = 0
    table_1 = series_1 + series_1_T[x:x+2000]
    table_2 = series_2 + series_2_T[x:x+2000]
    diff = table_1[table_1 > table_2].shape[0]
    count += diff
    return count

arr = pd.DataFrame(data=np.arange(0, n, 2000), columns=["numbers"])

count_each_run = arr["numbers"].apply(differ)  # this takes about 8 min 40 s

print(count_each_run.sum())

Is there any way to speed this up?

FBruzzesi · Accepted Answer · 2020-03-21 09:40:05Z


If the full n x n arrays fit in memory (unlikely for n = 200_000, where each one holds 4 x 10^10 elements), you can do:

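import numpy as np

n = 200_000

s1 = np.random.randint(low=1, high=1000, size=(n, 1))
s2 = np.random.randint(low=1, high=1000, size=(n, 1))

# Broadcasting a column against a row builds the full n x n table of pair sums
t1 = s1 + s1.T
t2 = s2 + s2.T

tot = np.sum(t1 > t2)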
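To judge whether the full tables would fit, you can estimate their size before building them. This is only a rough sketch: `table_bytes` is a hypothetical helper, and it assumes 8-byte integers, which is NumPy's default integer size on most 64-bit platforms but not all of them.

```python
def table_bytes(rows, cols, itemsize=8):
    """Bytes needed for one dense rows x cols table (itemsize=8 assumes int64)."""
    return rows * cols * itemsize

n = 200_000
full = table_bytes(n, n)             # one full n x n table
block = table_bytes(10_000, 10_000)  # one 10_000 x 10_000 piece, for comparison

print(f"full table: {full / 1e9:.0f} GB")
print(f"one block:  {block / 1e6:.0f} MB")
```

Under that assumption, one full table comes to about 320 GB, while a 10_000 x 10_000 piece is about 800 MB. Keep in mind that two such tables plus the boolean comparison result are alive at the same time.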

Otherwise, process the comparison in batches. Depending on what fits in memory, you can use one for loop (a batch of rows against all columns) or two for loops (a block of rows against a block of columns):

import numpy as np

n = 200_000

s1 = np.random.randint(low=1, high=1000, size=(n, 1))
s2 = np.random.randint(low=1, high=1000, size=(n, 1))

bs = 10_000  # batch size
tot = 0
for i in range(0, n, bs):
    for j in range(0, n, bs):
        # Each block is at most bs x bs, so memory use stays bounded
        t1 = s1[i:i+bs] + s1[j:j+bs].T
        t2 = s2[i:i+bs] + s2[j:j+bs].T

        tot += np.sum(t1 > t2)
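The one-loop variant mentioned above could look roughly like this. It is a sketch, not a tuned implementation: the function name `count_pairs_one_loop` and the default `bs=1_000` are illustrative choices, and each iteration still builds two tables of shape (bs, n), so `bs` has to be small enough for those to fit.

```python
import numpy as np

def count_pairs_one_loop(a1, a2, bs=1_000):
    """Count ordered pairs (i, j), including i == j, with
    a1[i] + a1[j] > a2[i] + a2[j], one batch of rows at a time."""
    a1 = np.asarray(a1).ravel()
    a2 = np.asarray(a2).ravel()
    n = a1.size
    tot = 0
    for i in range(0, n, bs):
        # A batch of at most bs rows broadcast against all n columns
        t1 = a1[i:i+bs, None] + a1[None, :]
        t2 = a2[i:i+bs, None] + a2[None, :]
        tot += int(np.count_nonzero(t1 > t2))
    return tot

# Small demo; sizes are arbitrary
rng = np.random.default_rng(0)
x = rng.integers(1, 1000, size=5_000)
y = rng.integers(1, 1000, size=5_000)
print(count_pairs_one_loop(x, y))
```

Compared with the two-loop version, this does fewer Python-level iterations but needs more memory per step, so which one is faster depends on the machine.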

If you need more speed, you can try a compiled approach such as Numba or Cython.
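A different route, which swaps the pairwise comparison for sorting: the condition array_1[i] + array_1[j] > array_2[i] + array_2[j] is equivalent to d[i] + d[j] > 0 with d = array_1 - array_2. After sorting d, a binary search tells you, for each i, how many j satisfy d[j] > -d[i]. The sketch below assumes the same counting convention as the code above (ordered pairs, with i == j included); `count_pairs_sorted` is an illustrative name.

```python
import numpy as np

def count_pairs_sorted(a1, a2):
    """Count ordered pairs (i, j), including i == j, with
    a1[i] + a1[j] > a2[i] + a2[j], in O(n log n) time."""
    # Rewrite the condition as d[i] + d[j] > 0 with d = a1 - a2
    d = np.asarray(a1, dtype=np.int64).ravel() - np.asarray(a2, dtype=np.int64).ravel()
    d_sorted = np.sort(d)
    # For each i: number of sorted entries <= -d[i]; the remainder are > -d[i]
    not_greater = np.searchsorted(d_sorted, -d, side="right")
    return int((d.size - not_greater).sum())

# Same setup as in the question; names here are only for the demo
n = 200_000
series_1 = np.random.randint(low=1, high=1000, size=n)
series_2 = np.random.randint(low=1, high=1000, size=n)
print(count_pairs_sorted(series_1, series_2))
```

This never builds an n x n table, so memory use stays linear in n. If you only want pairs with i < j, subtract the count of indices with d[i] > 0 and halve the result.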
