apply function versus vectorised operation in pandas dataframe

I am working with a DataFrame of almost 1M rows and want to compute a new column as a function of two others. My first idea was to use .apply(axis=1) with a lambda, but it was extremely slow compared to the equivalent vectorized operation.

An example of the task:

import pandas as pd
import numpy as np
import time

df = pd.DataFrame({
    "a": np.random.randint(0, 100, 100000),
    "b": np.random.randint(0, 100, 100000)})

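# Row-wise apply: the lambda is called once per row
start1 = time.perf_counter()
df["sum1"] = df.apply(lambda row: row["a"] + row["b"], axis=1)
print("apply:", time.perf_counter() - start1)

# Vectorized: one operation over whole columns
start2 = time.perf_counter()
df["sum2"] = df["a"] + df["b"]
print("vectorized:", time.perf_counter() - start2)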

Is this always the case, or are there circumstances in which apply() is more efficient than a vectorized operation? And if I need custom row-wise logic that cannot be expressed as vectorized operations, what is the recommended alternative?

edited Sep 12 at 16:41

wjandrea

asked Sep 12 at 16:10

amiref

We have a bunch of existing questions on this topic. I would start with cs95's answer on "How can I iterate over rows in a Pandas DataFrame?" and go from there. If you don't find a satisfactory/understandable answer, you can edit to say what you found. BTW, check out How to Ask, which has tips like starting with your own research and how to write a good title.

– wjandrea, commented 2025-09-12 16:45

To find more questions, try googling site:stackoverflow.com is apply always slower than vectorized in pandas

– wjandrea, commented 2025-09-12 16:46
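
For the second part of the question, here is a minimal sketch of two commonly suggested alternatives to .apply(axis=1); it is not a definitive recommendation. The function rule and the branch condition are hypothetical, made up purely for illustration. When the logic is a simple branch, np.where can usually express it over whole columns. When the logic has no vectorized form, a list comprehension over zip of the columns avoids building a Series object for every row, which is often where much of the apply overhead goes.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.integers(0, 100, 1_000),
    "b": rng.integers(0, 100, 1_000),
})

# Hypothetical row rule, invented for this example only.
def rule(a, b):
    return a - b if a > b else a + b

# 1) Baseline: row-wise apply (each row is passed to the lambda as a Series).
by_apply = df.apply(lambda row: rule(row["a"], row["b"]), axis=1)

# 2) Vectorized branch: both outcomes are computed for all rows,
#    then np.where picks one per row based on the condition.
by_where = pd.Series(
    np.where(df["a"] > df["b"], df["a"] - df["b"], df["a"] + df["b"]),
    index=df.index,
)

# 3) Fallback for logic with no vectorized form: a plain Python loop
#    over the column values, without constructing a Series per row.
by_zip = pd.Series(
    [rule(a, b) for a, b in zip(df["a"], df["b"])],
    index=df.index,
)
```

All three produce the same values here; how much faster options 2 and 3 are than apply depends on the data and the function, so it is worth timing them on your own DataFrame.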
