Python/Pandas: create summary table

Question

In a pandas DataFrame "df", I have the following columns:

user_id | song_id | song_duration | song_title | artist | listen_count

Many users may have listened to the same song, so song_id is not unique in this table. I would like to create a second DataFrame containing only the song information, with one row per unique song_id:

song_id | song_title | artist

I managed to create a table with song_id and song_title:

song_df = df.groupby('song_id').song_title.first()

How can I add the column "artist" to this?

This doesn't work:

song_df = df.groupby('song_id').df['song_title','artist'].first()

AttributeError: 'DataFrameGroupBy' object has no attribute 'df'

jezrael · Accepted Answer · 2016-05-30 19:32:17Z

1

IIUC, try omitting .df and selecting the columns with a list (note the double brackets):

df.groupby('song_id')[['song_title','artist']].first()
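For anyone who wants to try this end to end, here is a minimal sketch of the groupby approach. The sample rows are made up for illustration; only the column names come from the question. It passes a list of column names, since recent pandas versions reject the tuple-style ['song_title','artist'] selection, and calls reset_index() so that song_id ends up as a regular column rather than the index.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 1],
    "song_id": ["s1", "s1", "s2", "s2"],
    "song_duration": [210, 210, 185, 185],
    "song_title": ["Song A", "Song A", "Song B", "Song B"],
    "artist": ["Artist X", "Artist X", "Artist Y", "Artist Y"],
    "listen_count": [3, 1, 7, 2],
})

# Select the columns with a list (double brackets), take the first row of
# each song_id group, then turn the song_id index back into a column.
song_df = df.groupby("song_id")[["song_title", "artist"]].first().reset_index()

print(song_df)
```

Without reset_index(), the result is indexed by song_id, which is also fine if you only need lookups by id.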


Rosa Alejandra · Accepted Answer · 2016-05-30 19:57:20Z

0

You could just drop the duplicates from the selected columns:

song_df = df[['song_id','song_title','artist']].drop_duplicates()
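One caveat worth sketching: drop_duplicates() on all three columns only removes rows that are identical in every column, so a song_id can still appear more than once if its title or artist is spelled inconsistently. The data below is invented to show that case; passing subset="song_id" is one possible way to guarantee a single row per song, assuming you are happy to keep whichever row appears first.

```python
import pandas as pd

# Hypothetical sample data with an inconsistent artist spelling for song s1.
df = pd.DataFrame({
    "song_id": ["s1", "s1", "s2"],
    "song_title": ["Song A", "Song A", "Song B"],
    "artist": ["Artist X", "artist x", "Artist Y"],
})

cols = ["song_id", "song_title", "artist"]

# Removes only rows that match across all three columns, so both
# spellings for s1 survive.
by_row = df[cols].drop_duplicates()

# Keeps the first row seen for each song_id, so song_id is unique.
by_id = df[cols].drop_duplicates(subset="song_id").reset_index(drop=True)

print(by_row)
print(by_id)
```

If the song metadata is known to be consistent, the two results are the same and the shorter form is enough.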
