Conditional count of rows in a dataframe based on two columns in another dataframe

Question

I would like to add a column to df2 that counts the rows in df (the DataFrame built from the df1 list below) that have matching Herd and Ddat values.

import pandas as pd

df1 = [[52, '1', '1/1/2020'], [54, '1', '1/1/2020'],
       [55, '2', '1/1/2020'], [56, '3', '1/1/1999']]

df = pd.DataFrame(df1, columns=['Cow', 'Herd', 'Ddat'])

df2 = [['1', '1/1/2020'], ['1', '1/5/2020'],
       ['2', '1/1/2020'], ['3', '1/1/1999']]

df2 = pd.DataFrame(df2, columns=['Herd', 'Ddat'])

The output I am looking for is:

Herd  Ddat      Count
1     1/1/2020  2
1     1/5/2020  0
2     1/1/2020  1
3     1/1/1999  1

user17242583 · Accepted Answer · 2021-12-12 02:06:50Z

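You can take advantage of index alignment: set Herd and Ddat as the index of df2, and the per-group counts from df line up with it automatically. Pairs in df2 with no match in df come back as NaN, which fillna(0) turns into 0: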

cols = ['Herd', 'Ddat']
new_df = (
    df2.set_index(cols)
       .assign(Count=df.groupby(cols).size())  # rows per (Herd, Ddat) pair
       .fillna(0)                              # unmatched pairs get 0
       .astype({'Count': int})
       .reset_index()
)

Output:

>>> new_df
  Herd      Ddat  Count
0    1  1/1/2020      2
1    1  1/5/2020      0
2    2  1/1/2020      1
3    3  1/1/1999      1
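
If the index-based chain is hard to follow, the same result can probably be reached with a left merge. This is only a sketch of an alternative, not part of the original answer; the names counts and merged are made up for illustration, and the data is copied from the question.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame(
    [[52, '1', '1/1/2020'], [54, '1', '1/1/2020'],
     [55, '2', '1/1/2020'], [56, '3', '1/1/1999']],
    columns=['Cow', 'Herd', 'Ddat'])
df2 = pd.DataFrame(
    [['1', '1/1/2020'], ['1', '1/5/2020'],
     ['2', '1/1/2020'], ['3', '1/1/1999']],
    columns=['Herd', 'Ddat'])

cols = ['Herd', 'Ddat']

# One row per (Herd, Ddat) pair found in df, with the number of rows in that group.
counts = df.groupby(cols).size().reset_index(name='Count')

# A left merge keeps every row of df2; pairs absent from df get NaN in Count.
merged = df2.merge(counts, on=cols, how='left')

# Replace the NaN for unmatched pairs with 0 and restore an integer dtype.
merged['Count'] = merged['Count'].fillna(0).astype(int)

print(merged)
```

The merge version keeps Herd and Ddat as ordinary columns throughout, which some people find easier to debug than moving them in and out of the index.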
