I'm working with Python and Pandas and have a table like this:

      Name        Team    Fixture   Line-up     Min IN   Min Out
0     Player 1    RAY     J1        Starting             68
1     Player 2    RAY     J1        Bench       74       
2     Player 3    RSO     J2        Starting             45
3     Player 4    RSO     J2        Bench       45

I need to pivot the table so that the values of 'Fixture' become new columns, each containing the 'Line-up' text followed by the Min IN or Min Out number. The result should look like this:

      Name        Team    J1                J2
0     Player 1    RAY     Starting - 68
1     Player 2    RAY     Bench - 74      
2     Player 3    RSO                       Starting - 45
3     Player 4    RSO                       Bench - 45

Is there a way to do this? Thanks in advance!

2 Answers

You could modify the Line-up column to include the Min value, then pivot:

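# bfill(axis=1).iloc[:,0] picks the first non-NaN Min value in each row
out = (df.assign(**{'Line-up': df['Line-up'] + ' - ' + 
                    df.filter(like='Min').bfill(axis=1).iloc[:,0].astype(int).astype(str)})
       # keyword arguments: positional arguments to pivot are rejected by pandas 2.x
       .pivot(index=['Name','Team'], columns='Fixture', values='Line-up').rename_axis(columns=None).reset_index())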

Output:

       Name Team             J1             J2
0  Player 1  RAY  Starting - 68            NaN
1  Player 2  RAY     Bench - 74            NaN
2  Player 3  RSO            NaN  Starting - 45
3  Player 4  RSO            NaN     Bench - 45
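
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the table in the question (an assumption, with the blank Min cells stored as NaN), and `minutes` is just a helper name introduced for readability.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; blank Min cells are assumed to be NaN.
df = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Player 4"],
    "Team": ["RAY", "RAY", "RSO", "RSO"],
    "Fixture": ["J1", "J1", "J2", "J2"],
    "Line-up": ["Starting", "Bench", "Starting", "Bench"],
    "Min IN": [np.nan, 74, np.nan, 45],
    "Min Out": [68, np.nan, 45, np.nan],
})

# First non-missing minute in each row, converted to text ("68", "74", ...).
minutes = df.filter(like="Min").bfill(axis=1).iloc[:, 0].astype(int).astype(str)

out = (
    df.assign(**{"Line-up": df["Line-up"] + " - " + minutes})
      .pivot(index=["Name", "Team"], columns="Fixture", values="Line-up")
      .rename_axis(columns=None)   # drop the "Fixture" label on the columns axis
      .reset_index()
)
print(out)
```

Note that `.astype(int)` raises if a row has no minute at all, so this version only fits data where every row has either Min IN or Min Out filled in.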

N.B. This assumes that the empty cells in the Min columns are NaN values. If they are actually empty strings (''), convert them to NaN first:

out = (df.assign(**{'Line-up': df['Line-up'] + ' - ' + 
                    df.filter(like='Min').replace('', pd.NA).bfill(axis=1).iloc[:,0].astype(int).astype(str)})
#                               here -->  ^^^^^^^^^^^^
       .pivot(index=['Name','Team'], columns='Fixture', values='Line-up').rename_axis(columns=None).reset_index())
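
If the Min columns hold strings and some rows may have no minute at all, one possible variation is to coerce the values with `pd.to_numeric(..., errors="coerce")` and only append the " - NN" part when a minute exists. This is a sketch under assumptions, not a tested fix for any particular dataset: the sample frame is reconstructed from the question with empty strings in the blank cells, "Player 5" is a hypothetical extra row with no minutes, and `mins`, `first` and `suffix` are names I made up.

```python
import pandas as pd

# Min columns stored as strings; "" marks an empty cell (assumption).
# "Player 5" is a hypothetical row with no minutes at all.
df = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Player 4", "Player 5"],
    "Team": ["RAY", "RAY", "RSO", "RSO", "RSO"],
    "Fixture": ["J1", "J1", "J2", "J2", "J2"],
    "Line-up": ["Starting", "Bench", "Starting", "Bench", "Starting"],
    "Min IN": ["", "74", "", "45", ""],
    "Min Out": ["68", "", "45", "", ""],
})

# Anything that does not parse as a number becomes NaN.
mins = df.filter(like="Min").apply(pd.to_numeric, errors="coerce")

# First available minute per row (NaN if the row has none).
first = mins.bfill(axis=1).iloc[:, 0]

# Build " - 68" style suffixes, or an empty suffix when there is no minute.
suffix = first.map(lambda m: "" if pd.isna(m) else f" - {int(m)}")

out = (
    df.assign(**{"Line-up": df["Line-up"] + suffix})
      .pivot(index=["Name", "Team"], columns="Fixture", values="Line-up")
      .rename_axis(columns=None)
      .reset_index()
)
print(out)
```

With this approach a row without minutes keeps just its Line-up text (for example "Starting") instead of raising in `.astype(int)`.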

4 Comments

Thanks for your answer. The pivot works, but I get this error: 'ValueError: invalid literal for int() with base 10: ''', maybe because of the NaN values. I don't understand exactly how to solve it, because this "here --> ^^^^^^" doesn't work (sorry, I'm quite a newbie).
@nokvk sorry I forgot to make it a comment. It's a comment meant for you. You can remove that line. I edited the answer to make sure it's a comment.
Ah, OK... Thanks for fixing it, but unfortunately (and I don't know why) I still get the same error: ValueError: invalid literal for int() with base 10: '. I guess it's because the Min columns are strings (the empty cells are not NaN, just empty)... And I'd like to keep them empty, so that I get "Starting" with nothing else, or "Starting - 68" if there is something in the Min IN or Min Out column.
@nokvk The error seems to say that there's a quotation mark (') in one Min column row. See if df.filter(like='Min').replace('', pd.NA).replace("'", pd.NA).bfill(axis=1).iloc[:,0] works
Another version:

df = (
    df.set_index(["Name", "Team", "Fixture"])
    .apply(lambda x: " - ".join(x[x != ""]), axis=1)
    .unstack(level=2)
    .reset_index()
)
df.columns.name = ""

Prints:

       Name Team             J1             J2
0  Player 1  RAY  Starting - 68            NaN
1  Player 2  RAY     Bench - 74            NaN
2  Player 3  RSO            NaN  Starting - 45
3  Player 4  RSO            NaN     Bench - 45
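
A self-contained way to try this version is sketched below. It assumes every cell in Line-up, Min IN and Min Out is a string and that blank cells are empty strings, since `" - ".join(...)` raises a TypeError on NaN or numeric values. The sample frame is reconstructed from the question, and the result is stored in `out` rather than overwriting `df`.

```python
import pandas as pd

# Sample data rebuilt from the question; all Min values are strings,
# with "" for the blank cells (assumption).
df = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Player 4"],
    "Team": ["RAY", "RAY", "RSO", "RSO"],
    "Fixture": ["J1", "J1", "J2", "J2"],
    "Line-up": ["Starting", "Bench", "Starting", "Bench"],
    "Min IN": ["", "74", "", "45"],
    "Min Out": ["68", "", "45", ""],
})

out = (
    df.set_index(["Name", "Team", "Fixture"])
      # Per row: keep the non-empty cells and join them with " - ".
      .apply(lambda row: " - ".join(row[row != ""]), axis=1)
      # Move the Fixture index level into the columns.
      .unstack(level="Fixture")
      .reset_index()
)
out.columns.name = None  # drop the leftover "Fixture" label on the columns
print(out)
```

Because empty cells are simply skipped before joining, a row with both Min cells empty would come out as just the Line-up text.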
