8

I have two dataframes which look like this:

rating
   BMW  Fiat  Toyota
0    7     2       3
1    8     1       8
2    9    10       7
3    8     3       9

own
   BMW  Fiat  Toyota
0    1     1       0
1    0     1       1
2    0     0       1
3    0     1       1

I'm ultimately trying to get a pivot table of the mean rating by usage and brand, something like this:

            BMW  Fiat  Toyota
Usage                        
0      8.333333    10       3
1      7.000000     2       8

My approach was to merge the datasets like this:

Measure  Rating                Own              
Brand       BMW  Fiat  Toyota  BMW  Fiat  Toyota
0             7     2       3    1     1       0
1             8     1       8    0     1       1
2             9    10       7    0     0       1
3             8     3       9    0     1       1

I then attempted to create a pivot table using rating as the values, own as the rows and brand as the columns, but I kept running into key errors. I have also tried unstacking either the measure or the brand level, but I can't seem to use row index names as pivot keys.

What am I doing wrong? Is there a better approach to this?

2 Answers

4

I'm not an expert in Pandas, so the solution may be more clumsy than you want, but:

import pandas as pd

rating = pd.DataFrame({"BMW":[7, 8, 9, 8], "Fiat":[2, 1, 10, 3], "Toyota":[3, 8, 7,9]})
own = pd.DataFrame({"BMW":[1, 0, 0, 0], "Fiat":[1, 1, 0, 1], "Toyota":[0, 1, 1, 1]})

# Flatten each frame to long form: one row per (brand, row number) pair
r = rating.unstack().reset_index(name='value')
o = own.unstack().reset_index(name='value')
# "level_0" holds the original column labels, i.e. the brand
res = pd.DataFrame({"Brand":r["level_0"], "Rating": r["value"], "Own": o["value"]})
res = res.groupby(["Own", "Brand"]).mean().reset_index()
res.pivot(index="Own", columns="Brand", values="Rating")

# result
# Brand       BMW  Fiat  Toyota
# Own                          
# 0      8.333333    10       3
# 1      7.000000     2       8

Another solution, though less general (a for loop works, but you have to know which values occur in the own dataframe):

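d = []
for o in (0, 1):
    # keep ratings where own equals o; every other cell becomes NaN
    t = rating[own == o]
    t["own"] = o
    d.append(t)

# mean() skips the NaN cells, so each group averages only the matching ratings
res = pd.concat(d).groupby("own").mean()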
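
The loop above hard-codes (0, 1). Here is a minimal sketch of a variant that reads the distinct values from own instead, assuming both frames have the same shape and the same row and column labels. The variable names are only illustrative; np.unique, DataFrame.where and DataFrame.assign are standard NumPy/pandas calls.

```python
import numpy as np
import pandas as pd

rating = pd.DataFrame({"BMW": [7, 8, 9, 8], "Fiat": [2, 1, 10, 3], "Toyota": [3, 8, 7, 9]})
own = pd.DataFrame({"BMW": [1, 0, 0, 0], "Fiat": [1, 1, 0, 1], "Toyota": [0, 1, 1, 1]})

pieces = []
# Discover the ownership values instead of listing them by hand
for o in np.unique(own.to_numpy()):
    # Keep ratings where own == o (NaN elsewhere) and tag the rows with o
    pieces.append(rating.where(own == o).assign(own=o))

# mean() ignores NaN, so each cell averages only the matching ratings
res = pd.concat(pieces).groupby("own").mean()
```

Using assign also avoids adding the column in a separate step, since it returns a new frame with the extra column. If some brand has no rows at all for a given value of own, that cell comes out as NaN.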

4 Comments

Thanks. Great to have a solution. You're right that I was hoping for something more elegant, but a solution unblocks me. I can always write a function.
@Brendon I'm trying to spend as much time as I can learning Pandas now, I'll see what I can do after a week or two :) Please don't accept the answer yet, maybe some gurus will arrive with a super-elegant solution
Well, your tagline on your profile says as much :). I will hold off accepting your answer for another week. Thanks again.
@Brendon take a look, I've added another solution, a more pythonic one I think. If I knew how to add a column to a DataFrame in place, it could be even shorter
3

I have a new answer to my own question (based on Roman's initial answer). The key is to get the index at the required dimensionality. For example:

rating.columns.names = ["Brand"]
rating.index.names = ["n"]
print(rating)

Brand  BMW  Fiat  Toyota
n                       
0        7     2       3
1        8     1       8
2        9    10       7
3        8     3       9

own.columns.names = ["Brand"]
own.index.names = ["n"]
print(own)

Brand  BMW  Fiat  Toyota
n                       
0        1     1       0
1        0     1       1
2        0     0       1
3        0     1       1

# merge joins on the columns the two long frames share: Brand and n
merged = pd.merge(own.unstack().reset_index(name="Own"),
                  rating.unstack().reset_index(name="Rating"))
print(merged)

     Brand  n  Own  Rating
0      BMW  0    1       7
1      BMW  1    0       8
2      BMW  2    0       9
3      BMW  3    0       8
4     Fiat  0    1       2
5     Fiat  1    1       1
6     Fiat  2    0      10
7     Fiat  3    1       3
8   Toyota  0    0       3
9   Toyota  1    1       8
10  Toyota  2    1       7
11  Toyota  3    1       9

Then it's easy to use the pivot_table method to turn this into the desired result (current pandas versions take index and columns where older ones took rows and cols):

print(merged.pivot_table(index="Brand", columns="Own", values="Rating"))

Own             0  1
Brand               
BMW      8.333333  7
Fiat    10.000000  2
Toyota   3.000000  8

And that is what I was looking for. Thanks again to Roman for pointing the way.
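
Since wrapping this in a function came up in the comments, here is a rough sketch of one. The name mean_rating_by_ownership is made up for illustration, and it assumes both frames share the same row and column labels. It uses the same unstack idea as above, but lines the two measures up with pd.concat instead of merge, and pivots with Own as rows and Brand as columns to match the layout in the question.

```python
import pandas as pd

def mean_rating_by_ownership(rating, own):
    """Mean rating per (Own, Brand); hypothetical helper, not a pandas API."""
    # unstack() on a DataFrame gives a Series indexed by (column label, row label)
    long = pd.concat({"Rating": rating.unstack(), "Own": own.unstack()}, axis=1)
    long.index.names = ["Brand", "n"]
    # Turn the index levels into ordinary columns so they can serve as pivot keys
    long = long.reset_index()
    return long.pivot_table(index="Own", columns="Brand", values="Rating", aggfunc="mean")

rating = pd.DataFrame({"BMW": [7, 8, 9, 8], "Fiat": [2, 1, 10, 3], "Toyota": [3, 8, 7, 9]})
own = pd.DataFrame({"BMW": [1, 0, 0, 0], "Fiat": [1, 1, 0, 1], "Toyota": [0, 1, 1, 1]})

result = mean_rating_by_ownership(rating, own)
print(result)
```

With the sample data this gives 8.33, 10 and 3 for Own = 0 and 7, 2 and 8 for Own = 1, the same numbers as the table above but transposed.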
