I am trying to pivot a DataFrame that has no numerical values and contains duplicates in the index column. My data looks like this:

sale_id, product, sale_date
101, ABC, 2021-01-01
101, DEF, 2021-02-01
101, XYZ, 2021-03-01
101, KLM, 2021-01-04

Expected output:

    ABC, DEF, XYZ, KLM
101 2021-01-01, 2021-02-01, 2021-03-01, 2021-01-04

I tried the following:

df.pivot(index='sale_id', columns='product', values='sale_date')

It raised the following error:

ValueError: Index contains duplicate entries, cannot reshape

1 Answer

I am trying to create a pivot of my Dataframe that has no numerical values and duplicates exist in the index column.

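To check for duplicates, use DataFrame.duplicated. Passing keep=False returns every row whose sale_id and product combination occurs more than once: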

df1 = df[df.duplicated(['sale_id','product'], keep=False)]
print(df1)
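
As a minimal, self-contained sketch of that check: the sample in the question has no repeated sale_id/product pair, so the frame below adds one hypothetical extra row (101, ABC, 2021-05-01) purely for illustration. The name dupes is also just an illustrative choice.

```python
import pandas as pd

# Hypothetical data: the question's four rows plus one invented
# duplicate (101, ABC) row so that there is something to detect.
df = pd.DataFrame({
    "sale_id": [101, 101, 101, 101, 101],
    "product": ["ABC", "DEF", "XYZ", "KLM", "ABC"],
    "sale_date": ["2021-01-01", "2021-02-01", "2021-03-01",
                  "2021-01-04", "2021-05-01"],
})

# keep=False marks every member of a duplicated group, not only the repeats
dupes = df[df.duplicated(["sale_id", "product"], keep=False)]
print(dupes)
```

Any row shown here shares its sale_id and product with at least one other row, which is what makes pivot fail.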

To remove duplicates before pivoting, use DataFrame.drop_duplicates, which by default keeps the first row of each duplicated group:

(df.drop_duplicates(['sale_id','product'])
   .pivot(index='sale_id', columns='product', values='sale_date'))
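
A runnable sketch of this approach, again assuming a hypothetical extra (101, ABC) row so the duplicate case actually occurs. The last statement shows a possible alternative that is not part of the answer above: pivot_table with aggfunc="first", which also keeps the first value per group without an explicit drop_duplicates step. The names wide and wide_pt are illustrative.

```python
import pandas as pd

# Hypothetical data: the question's rows plus one invented duplicate row.
df = pd.DataFrame({
    "sale_id": [101, 101, 101, 101, 101],
    "product": ["ABC", "DEF", "XYZ", "KLM", "ABC"],
    "sale_date": ["2021-01-01", "2021-02-01", "2021-03-01",
                  "2021-01-04", "2021-05-01"],
})

# Drop repeated (sale_id, product) pairs, keeping the first occurrence,
# so that pivot sees each index/column combination only once.
wide = (
    df.drop_duplicates(["sale_id", "product"])
      .pivot(index="sale_id", columns="product", values="sale_date")
)
print(wide)

# Alternative (an assumption, not from the answer): let pivot_table
# resolve duplicates by taking the first value in each group.
wide_pt = df.pivot_table(index="sale_id", columns="product",
                         values="sale_date", aggfunc="first")
```

Both variants silently discard the later 2021-05-01 value for ABC, so it is worth inspecting the duplicated rows first and deciding which one should be kept. The resulting columns come out sorted by product name rather than in the order they appear in the data.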