
Hello, I'd like to know the big-O complexity of this code. df1 has N rows and df2 has M rows, with M << N. Every x in var_ref will be searched in set(df2.var0). Does this equal N*N == O(n^2)?

df1['var1'] = df1['var_ref'].apply(lambda x: True if x in df2.var0.unique() else False) * 1
  • Searching in a set is constant time, so this should be O(MN). Since you mentioned M << N, it's effectively still O(N). If you save your unique set somewhere first and then use it directly, it will be O(N) for sure, since you won't need to recompute var.unique() every time. Commented Mar 9, 2020 at 14:31
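The comment's point about recomputing can be made concrete: in the question's one-liner, `df2.var0.unique()` sits inside the lambda, so it is re-evaluated once per row of df1. A small sketch with hypothetical toy frames and a counting wrapper (none of these names are from the original) shows this:

```python
import pandas as pd

# Hypothetical toy data standing in for the question's df1 / df2
df1 = pd.DataFrame({'var_ref': range(1000)})
df2 = pd.DataFrame({'var0': range(0, 100, 2)})

calls = 0

def uniques():
    # Stand-in for df2.var0.unique() that counts how often it runs
    global calls
    calls += 1
    return df2.var0.unique()

# Same shape as the question's expression: unique() inside the lambda
df1['var1'] = df1['var_ref'].apply(lambda x: x in uniques()) * 1
print(calls)  # one call per row of df1
```

So the uniques are extracted N times, once per row, which is what hoisting the computation out of the lambda avoids.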

1 Answer


Should be O(N * M), where M is the number of unique values in df2.

And you should save the unique list somewhere so it isn't recalculated each time.

u = df2.var0.unique()
df1['var1'] = df1['var_ref'].apply(lambda x: x in u) * 1

This took me from 159 ms down to 5 ms (600 rows).
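A further step, suggested by the comments below, is to turn the precomputed uniques into a set, which makes each membership test O(1) on average instead of O(M). Here is a runnable sketch with hypothetical toy data (the `isin` variant is a standard pandas alternative, not part of the original answer):

```python
import pandas as pd

# Hypothetical toy data standing in for df1 / df2
df1 = pd.DataFrame({'var_ref': range(10)})
df2 = pd.DataFrame({'var0': [1, 3, 5, 99]})

# Build the lookup structure once. A set gives O(1) average-case
# membership tests, so the whole column costs roughly O(N), not O(N * M).
u = set(df2.var0.unique())
df1['var1'] = df1['var_ref'].apply(lambda x: x in u) * 1

# pandas also has a vectorised equivalent that skips the Python lambda:
df1['var1_isin'] = df1['var_ref'].isin(df2.var0) * 1
print(df1['var1'].tolist())  # [0, 1, 0, 1, 0, 1, 0, 0, 0, 0]
```

In practice `Series.isin` is usually the idiomatic choice, since it does the hashing internally and avoids per-row Python calls.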


7 Comments

Why, when I save it as u = df2.var.unique(), does it take more time than when I save it as set(df2.var)?
Iteration over a set might be faster than over a numpy.ndarray... On what size of dataset did you make this comparison?
My N is about 1 million rows; M, as a set(), is 40K. It was very fast, about 5 s!!
When I use set() it's very fast, but with an ndarray it takes minutes. Thanks
pd.col.unique() is called on each apply, for each row, so it recomputes the unique values. If I recall correctly, a set uses a hashmap for lookup, so it's O(1); a numpy array, on the other hand, is iterated to find the value.
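The set-versus-ndarray gap described in these comments can be sketched with a minimal benchmark (the sizes and names here are illustrative assumptions, not the commenter's actual data): `x in arr` on a numpy array compares against every element, O(M), while `x in s` on a set is a hash lookup, O(1) on average.

```python
import timeit
import numpy as np

m = 40_000                  # roughly the commenter's M
arr = np.arange(m)          # ndarray: `x in arr` scans all M elements
s = set(arr.tolist())       # set: hash-based lookup

x = m - 1                   # worst case for the linear scan
t_arr = timeit.timeit(lambda: x in arr, number=1000)
t_set = timeit.timeit(lambda: x in s, number=1000)
print(f"ndarray: {t_arr:.4f}s  set: {t_set:.4f}s")
```

Multiplied by N = 1 million rows, that per-lookup difference is exactly what turns minutes into seconds.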
