I have some American Football data in a DataFrame like below:

import pandas as pd

df = pd.DataFrame({'Green Bay Packers' : ['30-18-0', '5-37', '10-71' ],
                    'Chicago Bears' : ['45-26-1', '5-20', '10-107']}, 
                 index=['Att - Comp - Int', 'Sacked - Yds Lost', 'Penalties - Yards'])

                    Green Bay Packers   Chicago Bears
Att - Comp - Int    30-18-0             45-26-1
Sacked - Yds Lost   5-37                5-20
Penalties - Yards   10-71               10-107

You can see above that each row contains multiple data points that need to be split off. What I'd like to do is find some way to split the rows up so that each data point is its own row. The final output would look like:

        Green Bay Packers   Chicago Bears
Att           30                45
Comp          18                26
Int            0                 1
Sacked         5                 5
Yds Lost      37                20
Penalties     10                10
Yards         71               107

Is there a way to do this efficiently? I tried some Regex but it just turned into a mess. Sorry if my formatting isn't perfect...2nd question ever posted here.

  • Convert the values into strings and then recreate the data with the corresponding values. That is what I would do. Commented Jul 26, 2021 at 18:39
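
One possible reading of this suggestion, as a plain-Python sketch: walk over the rows, split each row label and each cell on the hyphen, and rebuild the frame from the pieces. The names labels, records and rebuilt are illustrative only, and the int() conversion assumes every piece is a whole number.

```python
import pandas as pd

df = pd.DataFrame(
    {"Green Bay Packers": ["30-18-0", "5-37", "10-71"],
     "Chicago Bears": ["45-26-1", "5-20", "10-107"]},
    index=["Att - Comp - Int", "Sacked - Yds Lost", "Penalties - Yards"],
)

labels, records = [], []
for label, values in df.iterrows():
    # "Att - Comp - Int" -> ["Att", "Comp", "Int"]
    names = [name.strip() for name in label.split("-")]
    # One list of pieces per team column, e.g. ["30", "18", "0"].
    pieces = {team: cell.split("-") for team, cell in values.items()}
    for i, name in enumerate(names):
        labels.append(name)
        # Assumption: each piece is an integer; int() raises otherwise.
        records.append({team: int(parts[i]) for team, parts in pieces.items()})

rebuilt = pd.DataFrame(records, index=labels)
print(rebuilt)
```

This is easy to follow but loops in Python, so for larger frames the explode-based answers are probably the better fit.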

4 Answers

Try:

df = df.reset_index().apply(lambda x: x.str.split("-"))
df = pd.DataFrame(
    {c: df[c].explode().str.strip() for c in df.columns},
).set_index("index")
df.index.name = None
print(df)

Prints:

          Green Bay Packers Chicago Bears
Att                      30            45
Comp                     18            26
Int                       0             1
Sacked                    5             5
Yds Lost                 37            20
Penalties                10            10
Yards                    71           107

First reset the index, then stack all the columns and split each value on "-". You can also strip any leftover whitespace from the pieces after splitting. Then unstack again, apply pd.Series.explode to every column, reset the index once more, and drop the leftover level_0 column.

out =  (df.reset_index()
        .stack().str.split('-').apply(lambda x:[i.strip() for i in x])
        .unstack()
        .apply(pd.Series.explode)
        .reset_index()
        .drop(columns='level_0'))

        index Green Bay Packers Chicago Bears
0        Att                 30            45
1       Comp                 18            26
2         Int                 0             1
3     Sacked                  5             5
4    Yds Lost                37            20
5  Penalties                 10            10
6       Yards                71           107

Assuming every column yields the same number of pieces within each row, with pandas >= 1.3.0 you can explode multiple columns at the same time:

df = df.reset_index().apply(lambda s: s.str.split(' *- *'))
df.explode(df.columns.tolist()).set_index('index')

          Green Bay Packers Chicago Bears
index
Att                      30            45
Comp                     18            26
Int                       0             1
Sacked                    5             5
Yds Lost                 37            20
Penalties                10            10
Yards                    71           107
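
The equal-length assumption can be checked before exploding. The sketch below counts the pieces in every cell and raises if any row disagrees across columns; the names split, counts and bad_rows are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    {"Green Bay Packers": ["30-18-0", "5-37", "10-71"],
     "Chicago Bears": ["45-26-1", "5-20", "10-107"]},
    index=["Att - Comp - Int", "Sacked - Yds Lost", "Penalties - Yards"],
)

# Split the row labels and every cell on "-" with optional surrounding spaces.
split = df.reset_index().apply(lambda s: s.str.split(r"\s*-\s*"))

# Number of pieces per cell; within a row, all columns should agree.
counts = split.apply(lambda s: s.str.len())
bad_rows = counts[counts.nunique(axis=1) > 1]
if not bad_rows.empty:
    raise ValueError(f"Rows with mismatched piece counts:\n{bad_rows}")

# Multi-column explode needs pandas >= 1.3.0.
result = (split.explode(split.columns.tolist())
               .set_index("index")
               .rename_axis(index=None))
print(result)
```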

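Use .apply() on every column (including the index, once reset_index() has turned it into a column); for each column, split the strings on the hyphen and explode the resulting lists: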

df_out = (df.reset_index()
            .apply(lambda x: x.str.split(r'\s*-\s*').explode())
            .set_index('index').rename_axis(index=None)
         )

Result:

print(df_out)

          Green Bay Packers Chicago Bears
Att                      30            45
Comp                     18            26
Int                       0             1
Sacked                    5             5
Yds Lost                 37            20
Penalties                10            10
Yards                    71           107
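
Note that the exploded values are still strings. If numbers are needed, a conversion step along these lines should work, assuming every piece is a whole number (astype(int) raises otherwise). The name df_num is illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"Green Bay Packers": ["30-18-0", "5-37", "10-71"],
     "Chicago Bears": ["45-26-1", "5-20", "10-107"]},
    index=["Att - Comp - Int", "Sacked - Yds Lost", "Penalties - Yards"],
)

df_out = (df.reset_index()
            .apply(lambda x: x.str.split(r"\s*-\s*").explode())
            .set_index("index").rename_axis(index=None))

# Assumption: every piece is an integer written in plain digits.
df_num = df_out.astype(int)
print(df_num.dtypes)
```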
