Iterating and modifying a Pandas Dataframe or groupby object

Question

I'm new to Pandas and am working with a multi-index data set of the form (made from groupby):

Name 
    Year 
        Month 
             Day 
                DataA   DataB   SpeciesName   SpeciesValue
                  A       B         Name1        Value1
                  A       B         Name2        Value2
                  A       B         Name3        Value3

For every group (unique Name, Year, Month, Day), only the final two columns have distinct values; the rest of the columns are identical. I want each group to contain a single row, with each SpeciesName value as a column title and the corresponding SpeciesValue value as the entry. For instance, the result for the group above should be:

Name 
    Year 
        Month 
             Day 
                DataA     DataB     Name1     Name2     Name3 
                  A         B       Value1    Value2    Value3

How would I go about this? Should I iterate through the dataframe or groupby object and build a new dataframe with the structure I want, or is there a better way?

Maybe you can try df.set_index('SpeciesName').unstack('SpeciesName')
– heyu91, Commented Aug 7, 2017 at 21:01

Scott Boston · Accepted Answer · 2017-08-09 03:44:57Z

Okay, use set_index and unstack, then reset_index:

import pandas as pd

df = pd.DataFrame({'Name':['Blake']*3,'Year':[2017]*3,
                  'Month':[1]*3,
                  'Day':[15]*3,
                  'DataA':['A']*3,
                  'DataB':['B']*3,
                  'SpeciesName':['Name1','Name2','Name3'],
                  'SpeciesValue':['Value1','Value2','Value3']})

df = df.set_index(['Name','Year','Month','Day'])

df

Sample input dataframe:

                     DataA DataB SpeciesName SpeciesValue
Name  Year Month Day                                     
Blake 2017 1     15      A     B       Name1       Value1
                 15      A     B       Name2       Value2
                 15      A     B       Name3       Value3

Now, let's reshape the dataframe:

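df_out = (
    df.set_index(['DataA','DataB','SpeciesName'], append=True)['SpeciesValue']
      .unstack()                   # pivot SpeciesName, the innermost index level, into columns
      .reset_index(level=[-1,-2])  # move DataB and DataA back out of the index
)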

print(df_out)

Output:

SpeciesName          DataA DataB   Name1   Name2   Name3
Name  Year Month Day                                    
Blake 2017 1     15      A     B  Value1  Value2  Value3
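The sample above has only one group, so here is a minimal sketch of the same reshape on data with two groups. The extra rows (Day 16, 'Value4', 'Value5') and the variable name wide are made up for illustration and are not from the question. The second group deliberately has no 'Name3' row, which shows that unstack fills the missing combination with NaN.

```python
import pandas as pd

# Hypothetical data: two (Name, Year, Month, Day) groups; the Day 16 group
# has no 'Name3' row.
df = pd.DataFrame({
    'Name': ['Blake'] * 5,
    'Year': [2017] * 5,
    'Month': [1] * 5,
    'Day': [15, 15, 15, 16, 16],
    'DataA': ['A'] * 5,
    'DataB': ['B'] * 5,
    'SpeciesName': ['Name1', 'Name2', 'Name3', 'Name1', 'Name2'],
    'SpeciesValue': ['Value1', 'Value2', 'Value3', 'Value4', 'Value5'],
}).set_index(['Name', 'Year', 'Month', 'Day'])

wide = (
    df.set_index(['DataA', 'DataB', 'SpeciesName'], append=True)['SpeciesValue']
      .unstack()                              # SpeciesName values become columns
      .reset_index(level=['DataA', 'DataB'])  # DataA/DataB become columns again
)

print(wide)
```

Each group ends up as one row. Note that unstack raises a ValueError if the same SpeciesName appears twice within a group, so duplicates would need to be aggregated or dropped first.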

edited Aug 9, 2017 at 3:44

answered Aug 9, 2017 at 3:34

1 Comment

urandom Over a year ago

Thanks. I needed the data not to be under SpeciesName like it is in your output. However, your answer got me looking at some Pandas functions I had previously missed and was able to use. I'll post what I did and you can let me know what you think. Thanks again for your help!
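If the concern in this comment is the SpeciesName label that ends up as the name of the column axis (that is a guess; the comment does not say exactly what was needed), one possible sketch is to clear that name with rename_axis. The variable name flat is made up for illustration.

```python
import pandas as pd

# Same sample data and reshape as in the answer.
df = pd.DataFrame({
    'Name': ['Blake'] * 3,
    'Year': [2017] * 3,
    'Month': [1] * 3,
    'Day': [15] * 3,
    'DataA': ['A'] * 3,
    'DataB': ['B'] * 3,
    'SpeciesName': ['Name1', 'Name2', 'Name3'],
    'SpeciesValue': ['Value1', 'Value2', 'Value3'],
}).set_index(['Name', 'Year', 'Month', 'Day'])

df_out = (
    df.set_index(['DataA', 'DataB', 'SpeciesName'], append=True)['SpeciesValue']
      .unstack()
      .reset_index(level=[-1, -2])
)

# Clear the name of the column axis ('SpeciesName'); the data is unchanged.
flat = df_out.rename_axis(columns=None)

print(flat)
```

The row index (Name, Year, Month, Day) is left as it was; only the label over the column headers is removed.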
