
I have been able to use pandas groupby to create a new DataFrame, but I get an error when I try to create a barplot from it. The groupby command:

invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

This creates a new DataFrame that looks correct to me.

New DataFrame invYr

Running:

sns.barplot(x='FinYear', y='Amount', data=invYr)

I get the error:

ValueError: Could not interperet input 'FinYear'

The issue appears to be related to the index being FinYear, but I have not been able to solve it, even when using reindex.

  • Try data=invYr.reset_index() to push the index back into a column. Commented Feb 3, 2016 at 3:37
  • The DataFrame has no index now and looks good, like a CSV that has just been imported, but unfortunately the same error message remains. Commented Feb 3, 2016 at 3:57
  • My guess would be that you are not reassigning the output of invYr.reset_index(). It's not an in-place operation, so it might print to the REPL and look like it "worked", but that's not what you're passing to barplot. Commented Feb 3, 2016 at 17:34

1 Answer

import pandas as pd
import seaborn as sns

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})
invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

>>> invYr
         Amount
FinYear        
2014         15
2015         20

The reason that you are getting the error is that when you created invYr by grouping invoices, the FinYear column becomes the index and is no longer a column. There are a few solutions:
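A quick way to confirm this, sketched here with the same toy data as above, is to inspect the columns and the index of the grouped frame. The variable names columns, index_name and has_finyear_column are only illustrative.

```python
import pandas as pd

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})
invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

# After the groupby, 'FinYear' is the name of the index, not a column.
columns = list(invYr.columns)
index_name = invYr.index.name
has_finyear_column = 'FinYear' in invYr.columns

print(columns)             # only 'Amount' is left as a column
print(index_name)          # the index carries the 'FinYear' label
print(has_finyear_column)
```

Since barplot looks up the strings passed to x and y among the columns of data, a label that lives only in the index is not found.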

1) One solution is to pass the data to Seaborn directly instead of referring to it by column name. If you do not specify a data parameter, Seaborn does not know which dataframe/series holds 'FinYear' or 'Amount', as these are just text values. Passing y=invYr.Amount, for example, specifies both the dataframe/series and the column you'd like to graph. The trick here is to access the index of the dataframe directly for x.

sns.barplot(x=invYr.index, y=invYr.Amount)

2) Alternatively, you can specify the data source and then refer to its columns by name. Note that the grouped dataframe has its index reset so that FinYear becomes available as a column again.

sns.barplot(x='FinYear', y='Amount', data=invYr.reset_index())
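As noted in the comments on the question, reset_index() is not an in-place operation. The following sketch, assuming the same toy data, shows that calling it without using the result leaves invYr unchanged. The name invYr_flat is hypothetical and used only for illustration.

```python
import pandas as pd

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})
invYr = invoices.groupby(['FinYear']).sum()[['Amount']]

# Calling reset_index() without keeping the result does not modify invYr.
invYr.reset_index()
unchanged_columns = list(invYr.columns)

# Assign the result (or pass it straight to data=) to get 'FinYear' as a column.
invYr_flat = invYr.reset_index()
flat_columns = list(invYr_flat.columns)

print(unchanged_columns)
print(flat_columns)
```

Either pass invYr.reset_index() inline as above, or assign it to a name first and pass that name as data.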

3) A third solution is to specify as_index=False when you perform the groupby, making the column available in the grouped dataframe.

invYr = invoices.groupby('FinYear', as_index=False).Amount.sum()
sns.barplot(x='FinYear', y='Amount', data=invYr)
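To see what as_index=False changes, this minimal sketch (same toy data, plotting left out) prints the columns and values of the grouped frame. The names result_columns, years and totals are only illustrative.

```python
import pandas as pd

invoices = pd.DataFrame({'FinYear': [2015, 2015, 2014], 'Amount': [10, 10, 15]})

# With as_index=False the grouping key stays a regular column.
invYr = invoices.groupby('FinYear', as_index=False).Amount.sum()

result_columns = list(invYr.columns)
years = invYr['FinYear'].tolist()
totals = invYr['Amount'].tolist()

print(result_columns)
print(years, totals)
```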

All three solutions above produce the same plot.

5 Comments

Thanks Alexander, this worked perfectly. I'm still unsure why I have to do that, though: why can't the index be called by column name? I also wonder why @BrenBarn's solution didn't fix it. Anyway, thanks for your help.
When you perform a groupby operation, the default behavior is to index the result on the grouped columns. To avoid this, you can use invYr = invoices.groupby(['FinYear'], as_index=False).sum()[['Amount']]. Then you would call sns.barplot(x=invYr.FinYear, y=invYr.Amount)
Thanks for the explanation. The only question now is why can't I just type ...x='FinYear', data=invYr... and instead have to qualify it by typing x=invYr.FinYear instead?
'FinYear' is just a text label. If you had another dataframe with that label, there would be a clash because either could be valid (not to mention that Seaborn does not know all the variables that are dataframe/series types). As a programmer, you need to qualify the data to which you are referring.
"You need to specify the correct datasource for the chart. Seaborn does not know which dataframe/series has the columns 'FinYear' or 'Amount', as these are just text values." The code in the original question clearly specifies data=invYr; it is not necessary to pass Series objects to x and y.
