My data is a DataFrame of retail items and their sales performance. Columns include 2016 unit sales, 2015 unit sales, item description, etc. When I try to group by brand:

Data.groupby(by="Major Brand").sum()

I get the following error: TypeError: unorderable types: int() < str()

I assume this is because not all of the data in the DataFrame are numbers, so pandas doesn't know how to sum them.

But I can get individual groupby's using something like:

Data.groupby(by="Major Brand")["2016 Units"].sum()

Ultimately I just want to group by "Major Brand", compare "2016 Units" to "2015 Units", and put all three of them into a new DataFrame with "Major Brand" as the index.

I have tried merging my multiple groupby's together but that never seems to work.

Thank you!

2 Answers

You can select both columns with a list (note the double brackets) before summing:

Data.groupby(by="Major Brand")[["2016 Units", "2015 Units"]].sum()

Demo:

In [29]: Data.groupby(by="Major Brand")[["2016 Units", "2015 Units"]].sum()
Out[29]:
             2016 Units  2015 Units
Major Brand
1                   218         238
2                   172         122
3                   192         273
4                   176         172

Data:

In [30]: Data
Out[30]:
    Major Brand  2016 Units  2015 Units    X
0             1          75          83  xxx
1             1          82          95  xxx
2             3          85          47  xxx
3             3           1          40  xxx
4             1          43          43  xxx
5             4          35          65  xxx
6             3          38          71  xxx
7             4          56          90  xxx
8             3           9          77  xxx
9             1          18          17  xxx
10            3          59          38  xxx
11            4          85          17  xxx
12            2          64          13  xxx
13            2          32          33  xxx
14            2          76          76  xxx
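
To get the comparison the question asks for, the grouped sums can be extended with a difference column. The following is a minimal sketch that rebuilds the demo data shown above. The column name "Change" and the variable name `summary` are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Demo data copied from the table above
Data = pd.DataFrame({
    "Major Brand": [1, 1, 3, 3, 1, 4, 3, 4, 3, 1, 3, 4, 2, 2, 2],
    "2016 Units": [75, 82, 85, 1, 43, 35, 38, 56, 9, 18, 59, 85, 64, 32, 76],
    "2015 Units": [83, 95, 47, 40, 43, 65, 71, 90, 77, 17, 38, 17, 13, 33, 76],
    "X": ["xxx"] * 15,
})

# Sum both unit columns per brand; "Major Brand" becomes the index
summary = Data.groupby("Major Brand")[["2016 Units", "2015 Units"]].sum()

# Hypothetical extra column: year-over-year difference in units
summary["Change"] = summary["2016 Units"] - summary["2015 Units"]

print(summary)
```

Selecting the two numeric columns before calling sum() keeps text columns such as "X" out of the aggregation entirely.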
8 Comments

For some reason I get the "TypeError: unorderable types: str() < int()" error. I can do the groupby for both "2016 Units" and "2015 Units" separately, so I have no idea why I would get this message.
@Stephen, always try to provide a Minimal, Complete, and Verifiable example when asking questions. For pandas questions, please provide sample input and output data sets (5-7 rows in CSV/dict/JSON/Python code format as text, so one could use it when coding an answer for you). This helps avoid situations like "your code isn't working for me" or "it doesn't work with my data".
@Stephen, could you also post an output of the following command: print(Data.dtypes)
Sorry, I wish I could provide the whole data set, but it is huge. There are about 50 columns and 900 rows. The list from print(Data.dtypes) shows "Major Brand", "Product Description", etc. as "object" and "2016 Units", "2015 Units" as "float64". Basically, the columns that contain text are "object" and the columns that contain numbers are "float64". At the end of the list it says dtype: object.
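
One thing that may be worth ruling out here: an "object" column can hold a mix of Python types (for example both str and int), and dtypes alone will not reveal that. The sketch below uses a small made-up frame (the values are assumptions, not the asker's data) to count the Python types in the grouping column and, as one possible workaround, cast the keys to strings before grouping. Whether this is the actual cause of the error in the question is not confirmed by the thread.

```python
import pandas as pd

# Made-up frame: the key column mixes strings and integers on purpose
df = pd.DataFrame({
    "Major Brand": ["nike", 7, "adidas", "nike", 7],
    "2016 Units": [1.0, 2.0, 3.0, 4.0, 5.0],
    "2015 Units": [5.0, 4.0, 3.0, 2.0, 1.0],
})

# Count how many values of each Python type the key column holds
type_counts = df["Major Brand"].map(type).value_counts()
print(type_counts)

# Possible workaround: make every key a string so the keys are comparable
df["Major Brand"] = df["Major Brand"].astype(str)
totals = df.groupby("Major Brand")[["2016 Units", "2015 Units"]].sum()
print(totals)
```

The same type count can be run on the column labels, e.g. pd.Series(df.columns).map(type).value_counts(), if the labels themselves might be mixed.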
@Stephen, what pandas version are you using?

I get the following error: TypeError: unorderable types: int() < str()

Could it be that your dtypes are not correct, e.g. str instead of int? You could try creating your DataFrame like this:

In [18]: import numpy as np; import pandas as pd

In [19]: col1 = ['adidas','nike','yourturn','zara','nike','nike','bla','bla','zalando','amazon']

In [20]: data = {'Major Brand':col1, '2016 Units':range(len(col1)), '2015 Units':range(len(col1),len(col1)*2)}

In [21]: x = pd.DataFrame(data, dtype=np.int64  )

In [22]: x.groupby(by="Major Brand").sum()
Out[22]: 
             2015 Units  2016 Units
Major Brand                        
adidas               10           0
amazon               19           9
bla                  33          13
nike                 40          10
yourturn             12           2
zalando              18           8
zara                 13           3

In [23]: x.groupby(by="Major Brand")[["2016 Units", "2015 Units"]].sum()
Out[23]: 
             2016 Units  2015 Units
Major Brand                        
adidas                0          10
amazon                9          19
bla                  13          33
nike                 10          40
yourturn              2          12
zalando               8          18
zara                  3          13

In [24]: x.dtypes
Out[24]: 
2015 Units      int64
2016 Units      int64
Major Brand    object
dtype: object

In [25]: x.groupby(by="Major Brand").agg(['count','sum','mean','median'])
Out[25]: 
            2015 Units                       2016 Units                     
                 count sum       mean median      count sum      mean median
Major Brand                                                                 
adidas               1  10  10.000000   10.0          1   0  0.000000    0.0
amazon               1  19  19.000000   19.0          1   9  9.000000    9.0
bla                  2  33  16.500000   16.5          2  13  6.500000    6.5
nike                 3  40  13.333333   14.0          3  10  3.333333    4.0
yourturn             1  12  12.000000   12.0          1   2  2.000000    2.0
zalando              1  18  18.000000   18.0          1   8  8.000000    8.0
zara                 1  13  13.000000   13.0          1   3  3.000000    3.0

5 Comments

Sorry, I tried and got: "ValueError: invalid literal for int() with base 10: 'Data'"
@Stephen, have you also tried with the example data I created?
@Stephen, to me it strongly looks like you have a type conversion error, e.g. some field in your raw data that can't be converted to an int.
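
A hedged sketch of how such unconvertible fields could be located: pd.to_numeric with errors="coerce" turns anything that cannot be parsed into NaN, so the offending rows can be listed before grouping. The frame below and its bad values ("n/a", "oops") are invented for illustration only.

```python
import pandas as pd

# Made-up raw data with a few values that are not valid numbers
raw = pd.DataFrame({
    "Major Brand": ["nike", "nike", "zara", "zara"],
    "2016 Units": ["10", "n/a", "7", "3"],
    "2015 Units": [4, 5, "6", "oops"],
})

unit_cols = ["2016 Units", "2015 Units"]

# Unparseable entries become NaN instead of raising an error
converted = raw[unit_cols].apply(pd.to_numeric, errors="coerce")

# Rows where at least one unit column could not be converted
bad_rows = raw[converted.isna().any(axis=1)]
print(bad_rows)

# Swap in the numeric columns, then group; sum() skips NaN by default
clean = raw.drop(columns=unit_cols).join(converted)
totals = clean.groupby("Major Brand")[unit_cols].sum()
print(totals)
```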
The weird thing is that I can do the individual groupby's no problem: Data.groupby(by="Major Brand")["2016 Units"].sum(). Both 2016 and 2015 work fine, the issue is only if I try to do them together. I will try with your data when I get to my programming comp.
@Stephen, what versions do you have? I can't reproduce your error, so it's very hard to debug. Try: "import sys; sys.version" and "import pandas as pd; pd.__version__"
