I have the following dataframe and I am trying to add a new column df['count'] that holds, for each row, the number of values in columns A to D that are equal to that row's value in df['E'].

import pandas as pd

df = pd.DataFrame({"A": [14, 4, 5, 4, 1],
                   "B": [5, 2, 54, 3, 2],
                   "C": [20, 20, 7, 3, 8],
                   "D": [14, 3, 6, 2, 6],
                   "E": [14, 2, 5, 3, 1]})

The expected output is:

    A   B   C   D   E  count
0  14   5  20  14  14      2
1   4   2  20   3   2      1
2   5  54   7   6   5      1
3   4   3   3   2   3      2
4   1   2   8   6   1      1

I think the function should look something like df['count'] = df[df == df['E']].count(axis=1). However, that returns only 0s, and it would also include df['E'] itself in the count.

2 Answers

You could use:

df['count'] = df.drop('E', axis=1).eq(df['E'], axis=0).sum(axis=1)

or:

# count matches across all columns, including E itself, then subtract 1
df['count'] = df.eq(df['E'], axis=0).sum(axis=1).sub(1)

output:

    A   B   C   D   E  count
0  14   5  20  14  14      2
1   4   2  20   3   2      1
2   5  54   7   6   5      1
3   4   3   3   2   3      2
4   1   2   8   6   1      1
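
To see why the attempt in the question returns only 0s, here is a minimal sketch on the sample data. It assumes that a comparison without axis=0 lines up the index of df['E'] (0 to 4) with the column labels (A to E), so nothing matches. The flex method df.eq(df['E']) is used to show this because, to my understanding, the plain == operator raises an error on misaligned labels in recent pandas releases. The names by_columns, row_wise and counts are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [14, 4, 5, 4, 1],
                   "B": [5, 2, 54, 3, 2],
                   "C": [20, 20, 7, 3, 8],
                   "D": [14, 3, 6, 2, 6],
                   "E": [14, 2, 5, 3, 1]})

# Default axis ("columns"): the Series index 0..4 is matched against the
# column labels A..E. No label is shared, so every cell compares as False.
by_columns = df.eq(df["E"])
print(by_columns.to_numpy().any())  # False

# axis=0: the Series index is matched against the row index, so each row
# is compared with its own value of E.
row_wise = df.drop(columns="E").eq(df["E"], axis=0)
counts = row_wise.sum(axis=1)
print(counts.tolist())  # [2, 1, 1, 2, 1]
```

Summing the boolean frame along axis=1 counts the True cells per row, which is why sum replaces the count call from the question.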

Note on speed

Timing on 500k rows:

%%timeit
df.drop('E', axis=1).eq(df['E'], axis=0).sum(axis=1)
# 7.27 ms ± 324 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
df.eq(df['E'], axis=0).sum(axis=1).sub(1)
# 5.88 ms ± 335 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
df.apply(lambda x:sum(x["A":"D"]==x["E"]), axis=1)
# still running after 5 minutes… (takes ~10s on only 50k rows)
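
For anyone who wants to repeat the comparison outside IPython, the following is a rough sketch using the standard timeit module. The 500k-row frame of random integers is an assumption, since the data behind the timings above is not shown, and the function names are hypothetical. Absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed benchmark data: 500k rows of small random integers.
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 10, size=(500_000, 5)),
                   columns=list("ABCDE"))


def drop_then_compare(frame):
    # Drop E first, then count the cells equal to E in each row.
    return frame.drop(columns="E").eq(frame["E"], axis=0).sum(axis=1)


def compare_then_subtract(frame):
    # Compare every column (E always equals itself), then subtract that match.
    return frame.eq(frame["E"], axis=0).sum(axis=1).sub(1)


# Both variants must agree before their timings are worth comparing.
assert drop_then_compare(big).equals(compare_then_subtract(big))

for func in (drop_then_compare, compare_then_subtract):
    seconds = timeit.timeit(lambda: func(big), number=10) / 10
    print(f"{func.__name__}: {seconds * 1e3:.2f} ms per call")
```

The apply version is left out of the loop on purpose, since the timings above suggest it would take minutes at this size.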
3 Comments

The last one you proposed is by far the most elegant and the fastest.
Thanks! Yes, I assumed it would be faster to just subtract 1 than to regenerate a dataframe in memory ;)
@nikeros I added some timings
An alternative could be:

df.loc[:, "count"] = df.apply(lambda x: sum(x["A":"D"] == x["E"]), axis=1)
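
As a quick check, the sketch below runs this row-wise version on the sample data and compares it with a vectorized equivalent. It relies on label slices such as x["A":"D"] including both endpoints, so column D is counted. The names row_by_row and vectorized are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [14, 4, 5, 4, 1],
                   "B": [5, 2, 54, 3, 2],
                   "C": [20, 20, 7, 3, 8],
                   "D": [14, 3, 6, 2, 6],
                   "E": [14, 2, 5, 3, 1]})

# Row-wise: each row arrives as a Series x; the label slice "A":"D" is
# inclusive, so it covers A, B, C and D.
row_by_row = df.apply(lambda x: sum(x["A":"D"] == x["E"]), axis=1)

# Vectorized equivalent over the same four columns.
vectorized = df[["A", "B", "C", "D"]].eq(df["E"], axis=0).sum(axis=1)

print(row_by_row.tolist())  # [2, 1, 1, 2, 1]
print(vectorized.tolist())  # [2, 1, 1, 2, 1]
```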

1 Comment

Using apply in such a simple case, when the code can be vectorized, is not efficient.