In pandas, how do I go from a:

a = pd.DataFrame({'foo': ['m', 'm', 'm', 's', 's', 's'],
                    'bar': [1, 2, 3, 4, 5, 6]})
>>> a
   bar foo
0    1   m
1    2   m
2    3   m
3    4   s
4    5   s
5    6   s

to b:

b = pd.DataFrame({'m': [1, 2, 3],
                    's': [4, 5, 6]})
>>> b
   m  s
0  1  4
1  2  5
2  3  6

I tried solutions from other answers, e.g. here and here, but none seemed to do what I want.

Basically, I want each distinct value of foo to become a column holding its bar values, with the index reset to 0, 1, 2. How can I do that?

2 Answers

a.set_index(
    [a.groupby('foo').cumcount(), 'foo']
).bar.unstack()

2 Comments

Can you explain a bit what's going on? I looked at the GroupBy.cumcount() documentation, but it's somewhat cryptic.
I apologize for the lack of detail. I'm on my phone. The problem is that you need a way to tell apart the rows that share the same foo value. cumcount does exactly that by numbering the rows within each group: 0, 1, 2 for the first three rows and 0, 1, 2 again for the second three. It also works when the groups are not both of size 3. With that counter and foo set as the index, the data is laid out perfectly for an unstack.
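To make the comment above more concrete, here is a rough step-by-step sketch of the same one-liner. The intermediate names pos and indexed are only there for illustration and are not part of the original answer; the last line (clearing the columns name) is an optional extra.

```python
import pandas as pd

a = pd.DataFrame({'foo': ['m', 'm', 'm', 's', 's', 's'],
                  'bar': [1, 2, 3, 4, 5, 6]})

# Position of each row within its 'foo' group: 0, 1, 2, 0, 1, 2
pos = a.groupby('foo').cumcount()

# Two-level index: (position within group, foo value)
indexed = a.set_index([pos, 'foo'])

# Move the 'foo' level from the index into the columns
b = indexed['bar'].unstack()

# Optional: drop the 'foo' label that remains on the columns axis
b.columns.name = None

print(b)
```

If the groups had different sizes, unstack would pad the shorter column with NaN, which would also turn that column into floats.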

This is my solution

a = pd.DataFrame({'foo': ['m', 'm', 'm', 's', 's', 's'],
                    'bar': [1, 2, 3, 4, 5, 6]})
a.pivot(columns='foo', values='bar').apply(lambda x: pd.Series(x.dropna().values))

foo    m    s
0    1.0  4.0
1    2.0  5.0
2    3.0  6.0
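The values come out as floats because pivot first produces a six-row frame with NaN wherever a row belongs to the other group. If integers are wanted, one possible follow-up (a sketch that assumes every group has the same length, so no NaN remains after the dropna step) is to cast back afterwards. The name wide is only illustrative.

```python
import pandas as pd

a = pd.DataFrame({'foo': ['m', 'm', 'm', 's', 's', 's'],
                  'bar': [1, 2, 3, 4, 5, 6]})

# Six rows, two columns; NaN where a row belongs to the other group
wide = a.pivot(columns='foo', values='bar')

# Drop the NaN in each column and re-number the rows from 0
b = wide.apply(lambda x: pd.Series(x.dropna().values))

# Assumption: equal group sizes, so no NaN is left and the cast is safe
b = b.astype(int)

# Optional: drop the 'foo' label on the columns axis
b.columns.name = None

print(b)
```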
