1

I have dataframe like:

column1    column2    column3
 ram        tall        good
 rohan      short       fine
 ajay       tall        best
 alia       tall        good
 aman       medium      fine
 john       short       good
 jack       short       fine

now i need output like:

unique count of good in tall, short, medium on basis of column1->

tall=2 , short=1 , medium=0

unique count of fine in tall, short, medium on basis of column1->

tall=0 , short=2 , medium=1

unique count of best in tall, short, medium on basis of column1->

tall=1 , short=0 , medium=0

I am beginner in pandas. Thanks in advance

2 Answers 2

5

Let's try pd.crosstab:

pd.crosstab(df['column3'], df['column2'])

column2  medium  short  tall
column3                     
best          0      0     1
fine          1      2     0
good          0      1     2
Sign up to request clarification or add additional context in comments.

Comments

1

Use value_counts + unstack

res = df[['column3', 'column2']].value_counts().unstack('column2', fill_value=0)
print(res)

Output

column2  medium  short  tall
column3                     
best          0      0     1
fine          1      2     0
good          0      1     2

As an alternative groupby + unstack:

res = df.groupby(['column3', 'column2']).count().unstack('column2', fill_value=0)
print(res)

Output (groupby)

        column1           
column2  medium short tall
column3                   
best          0     0    1
fine          1     2    0
good          0     1    2

The idea behind both approaches is to create an index and then unstack it. If you want to match the same order as specify in your question, convert to Categorical first:

df['column2'] = pd.Categorical(df['column2'], categories=['tall', 'short', 'medium'], ordered=True)
res = df[['column3', 'column2']].value_counts().unstack('column2', fill_value=0)
print(res) 

Output

column2  tall  short  medium
column3                     
best        1      0       0
fine        0      2       1
good        2      1       0

3 Comments

Thanks dani, one small doubt if i have same row multiple times in dataframe, then to find count can i replace .value_counts() with nunique() ?
If you want to count the duplicates I would say use .value_counts, notice that you have the [fine, short] column 2 times
yes correct, but for [fine, short] the column1 name is different. So it will be count as different rows. Okay got it thanks

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.