I have a dataframe (a heavily simplified version is shown below) containing transaction data: the product bought and the device used.

CUST_ID PRODUCT DEVICE
----------------------
1       A       MOBILE
1       B       TABLET
2       B       LAPTOP
2       A       MOBILE
3       C       TABLET
3       C       TABLET

I would like to transform it so that each row is a single cust_id, with the purchase frequency of each product and the usage frequency of each device, i.e. a dataframe of shape 3x7:

CUST_ID  PRODUCT_A  PRODUCT_B  PRODUCT_C  DEVICE_MOBILE  DEVICE_LAPTOP  DEVICE_TABLET
1        1          1          0          1              0              1
2        1          1          0          1              1              0
3        0          0          2          0              0              2

I tried the .pivot_table() function, but it adds extra index levels and duplicate columns. This is a simplified version; I need to do this for many products and devices, so maybe a function or a loop would be more efficient?

2 Answers

You can use pd.get_dummies to one-hot encode both columns, then df.groupby with sum to count per customer:

pd.get_dummies(df, columns=['PRODUCT','DEVICE']).groupby(['CUST_ID'], as_index=False).sum()

Output:

CUST_ID  PRODUCT_A  PRODUCT_B  PRODUCT_C  DEVICE_LAPTOP  DEVICE_MOBILE  \
0       1          1          1          0              0              1   
1       2          1          1          0              1              1   
2       3          0          0          2              0              0   

   DEVICE_TABLET  
0              1  
1              0  
2              2 
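
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, and the variable names dummies and res are only illustrative. Passing dtype=int is an addition of mine, not part of the one-liner above: it keeps the indicator columns as 0/1 integers on pandas versions where get_dummies returns booleans by default.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CUST_ID": [1, 1, 2, 2, 3, 3],
    "PRODUCT": ["A", "B", "B", "A", "C", "C"],
    "DEVICE": ["MOBILE", "TABLET", "LAPTOP", "MOBILE", "TABLET", "TABLET"],
})

# One indicator column per distinct PRODUCT and DEVICE value.
# dtype=int is an assumption added here so the indicators are integers.
dummies = pd.get_dummies(df, columns=["PRODUCT", "DEVICE"], dtype=int)

# Summing the indicators per customer gives the frequencies.
res = dummies.groupby("CUST_ID", as_index=False).sum()
print(res)
```

Because columns= accepts a list, the same call should extend to more categorical columns without a loop.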

You can use pd.crosstab twice and join the results:

cross1 = pd.crosstab(index=df['CUST_ID'], columns=df['PRODUCT'])
cross2 = pd.crosstab(index=df['CUST_ID'], columns=df['DEVICE'])

res = cross1.join(cross2)

print(res)

         A  B  C  LAPTOP  MOBILE  TABLET
CUST_ID                                 
1        1  1  0       0       1       1
2        1  1  0       1       1       0
3        0  0  2       0       0       2
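
If you want the PRODUCT_ and DEVICE_ column names from the question and have many columns to count, the same idea can be wrapped in a loop. This is only a sketch: count_by_customer is a hypothetical helper name, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CUST_ID": [1, 1, 2, 2, 3, 3],
    "PRODUCT": ["A", "B", "B", "A", "C", "C"],
    "DEVICE": ["MOBILE", "TABLET", "LAPTOP", "MOBILE", "TABLET", "TABLET"],
})

def count_by_customer(frame, key, columns):
    """Hypothetical helper: one crosstab per column, prefixed and concatenated."""
    tables = [
        # add_prefix turns 'A' into 'PRODUCT_A', 'MOBILE' into 'DEVICE_MOBILE', ...
        pd.crosstab(frame[key], frame[col]).add_prefix(f"{col}_")
        for col in columns
    ]
    # All tables share the same customer index, so they align column-wise.
    return pd.concat(tables, axis=1).reset_index()

res = count_by_customer(df, "CUST_ID", ["PRODUCT", "DEVICE"])
print(res)
```

reset_index turns CUST_ID back into an ordinary column, which matches the layout asked for in the question.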

3 Comments

Thanks a lot! I'm getting this error when I try to join the cross tables:
TypeError: categories must match existing categories when appending
@JamesPietroZanzarelli, it looks like you are using categorical data, which isn't part of your question. Convert your series to regular object dtype or (possibly better) post a new question with a minimal reproducible example if you're stuck.
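
A minimal sketch of the "convert to object dtype" suggestion, assuming PRODUCT and DEVICE are stored as pandas category dtype (the commenter's real data isn't shown, so this setup is a guess, and whether the error appears may depend on the pandas version):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CUST_ID": [1, 1, 2, 2, 3, 3],
    "PRODUCT": ["A", "B", "B", "A", "C", "C"],
    "DEVICE": ["MOBILE", "TABLET", "LAPTOP", "MOBILE", "TABLET", "TABLET"],
})

# Assumption: the real columns are categorical, as the comment above suggests.
df["PRODUCT"] = df["PRODUCT"].astype("category")
df["DEVICE"] = df["DEVICE"].astype("category")

# Convert the categorical columns back to plain object dtype before crosstab,
# so the resulting column indexes are ordinary (non-categorical) indexes.
plain = df.astype({"PRODUCT": object, "DEVICE": object})

cross1 = pd.crosstab(index=plain["CUST_ID"], columns=plain["PRODUCT"])
cross2 = pd.crosstab(index=plain["CUST_ID"], columns=plain["DEVICE"])

res = cross1.join(cross2)
print(res)
```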
