I have a pandas DataFrame like the one below:

Id Date Aid
0 2022-01-01 one
1 2022-01-01 two
2 2022-01-05 one
3 2022-01-06 three
4 2022-01-02 one
5 2022-01-01 one
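
For reproducibility, the sample above could be built as follows. This is only a sketch: it assumes Id is an ordinary column (not the index) and that Date is stored as strings, neither of which is stated in the post.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: "Id" is a regular column and "Date" holds ISO-formatted strings.
df = pd.DataFrame(
    {
        "Id": [0, 1, 2, 3, 4, 5],
        "Date": [
            "2022-01-01",
            "2022-01-01",
            "2022-01-05",
            "2022-01-06",
            "2022-01-02",
            "2022-01-01",
        ],
        "Aid": ["one", "two", "one", "three", "one", "one"],
    }
)

print(df)
```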

I would like to add a column "Counter" which, for each row, contains the number of rows with the same "Aid" and a "Date" earlier than or equal to that row's "Date".

So, for row 4, 'Counter' = 3 (rows 0, 4, 5).

The result should be:

Id Date Aid Counter
0 2022-01-01 one 2
1 2022-01-01 two 1
2 2022-01-05 one 4
3 2022-01-06 three 1
4 2022-01-02 one 3
5 2022-01-01 one 2

Is there a way to do this count in pandas?

The original DataFrame contains millions of rows, so efficiency is very important.

1 Answer

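Use rank with method='max' within each Aid group. Because tied dates all receive the highest rank of the tie, each row's rank equals the number of rows in its group whose Date is earlier than or equal to its own: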

pd.to_datetime(df['Date']).groupby(df['Aid']).rank(method='max').astype('int')

output:

0    2
1    1
2    4
3    1
4    3
5    2
Name: Date, dtype: int32
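
As a sanity check on why the max-method rank matches the requested count, the sketch below compares it against a direct row-by-row count on the sample data. The brute-force loop is quadratic and is there only for verification, not as something to run on millions of rows; the variable names are illustrative.

```python
import pandas as pd

# Sample data from the question (assumed column layout).
df = pd.DataFrame(
    {
        "Id": [0, 1, 2, 3, 4, 5],
        "Date": [
            "2022-01-01",
            "2022-01-01",
            "2022-01-05",
            "2022-01-06",
            "2022-01-02",
            "2022-01-01",
        ],
        "Aid": ["one", "two", "one", "three", "one", "one"],
    }
)

dates = pd.to_datetime(df["Date"])

# Vectorized approach: rank dates within each Aid group, ties get the max rank.
ranked = dates.groupby(df["Aid"]).rank(method="max").astype("int")

# Brute-force definition: for each row, count rows with the same Aid
# and a Date less than or equal to this row's Date.
brute = [
    int(((df["Aid"] == aid) & (dates <= date)).sum())
    for aid, date in zip(df["Aid"], dates)
]

print(ranked.tolist())
print(brute)
```

Both lists should agree on this sample. One caveat worth checking on real data: rows with a missing Date get a NaN rank, so the cast to an integer type would fail unless those rows are handled first.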

Assign the output to a new Counter column:

df.assign(Counter=pd.to_datetime(df['Date']).groupby(df['Aid']).rank(method='max').astype('int'))

result:

Id  Date        Aid     Counter
0   2022-01-01  one     2
1   2022-01-01  two     1
2   2022-01-05  one     4
3   2022-01-06  three   1
4   2022-01-02  one     3
5   2022-01-01  one     2
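
For convenience, the two steps can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name add_counter is hypothetical, and the column names Date and Aid are taken from the question.

```python
import pandas as pd


def add_counter(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of frame with a Counter column (hypothetical helper).

    Counter is the number of rows sharing the row's Aid whose Date is
    earlier than or equal to the row's Date.
    """
    dates = pd.to_datetime(frame["Date"])
    counter = dates.groupby(frame["Aid"]).rank(method="max").astype("int64")
    # assign returns a new DataFrame, so the input is left unchanged.
    return frame.assign(Counter=counter)


# Sample data from the question (assumed column layout).
df = pd.DataFrame(
    {
        "Id": [0, 1, 2, 3, 4, 5],
        "Date": [
            "2022-01-01",
            "2022-01-01",
            "2022-01-05",
            "2022-01-06",
            "2022-01-02",
            "2022-01-01",
        ],
        "Aid": ["one", "two", "one", "three", "one", "one"],
    }
)

result = add_counter(df)
print(result)
```

Casting to int64 explicitly avoids the platform-dependent width of a plain 'int' cast, which is why the output above shows int32 on some systems.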