
I have set up the following very simple DataFrame to illustrate what I'm trying to do:

import pandas as pd

teams = pd.DataFrame({"spreads": ['New England Patriots -7.0', 'Atlanta Falcons 2.5', 'New Orleans Saints -4.5']})
teams['home'] = ['New England Patriots', 'Carolina Panthers', 'New Orleans Saints']
teams['away'] = ['Miami Dolphins', 'Atlanta Falcons', 'Tampa Bay Buccaneers']

I'm trying to extract the spread value. At first I tried using str.contains to pick out the team name and so separate out the numeric value, but it doesn't seem to work as a row-by-row comparison against another column. If anyone has tips for extracting the numeric value (I don't think I can use a regex because in some cases no '-' sign appears), or can at least tell me how to determine whether the team listed in each row is the home or the away team, I would greatly appreciate it.

Comment: If all the spreads have numbers at the end, then you can use teams.spreads.str.split().str[-1] (Commented Aug 25, 2016 at 20:43)

2 Answers


Use .str.extract

teams.spreads.str.extract(r'(-?\d+\.?\d*)', expand=False)

0    -7.0
1     2.5
2    -4.5
Name: spreads, dtype: object
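
On the worry that a regex won't work when no '-' sign appears: the `-?` in the pattern makes the minus sign optional, so both signed and unsigned values match. Note also that the result above has dtype object, i.e. the values are still strings. A minimal sketch of converting them to floats follows; the column name `spread_val` is just an illustrative choice, not something from the question.

```python
import pandas as pd

teams = pd.DataFrame({"spreads": ["New England Patriots -7.0",
                                  "Atlanta Falcons 2.5",
                                  "New Orleans Saints -4.5"]})

# `-?` matches zero or one minus sign, so "2.5" is captured as well as "-7.0".
extracted = teams["spreads"].str.extract(r"(-?\d+\.?\d*)", expand=False)

# str.extract returns strings; cast to float so the values can be used in arithmetic.
# `spread_val` is a hypothetical column name chosen for this example.
teams["spread_val"] = extracted.astype(float)
```

One caveat: the pattern grabs the first number in the string, so a team name containing digits would be matched before the spread. Anchoring the pattern to the end of the string, e.g. `r"(-?\d+\.?\d*)$"`, is one way to guard against that.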

Fancier, using a named capture group so that the resulting column is called spread_val:

teams.spreads.str.extract(r'(?P<spread_val>-?\d+\.?\d*)', expand=True)

  spread_val
0       -7.0
1        2.5
2       -4.5


Try splitting the strings:

teams['spreads_val'] = teams['spreads'].str.rsplit(" ").str.get(-1)

0    -7.0
1     2.5
2    -4.5
Name: spreads_val, dtype: object
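
The same split can also help with the second part of the question, i.e. working out whether the listed team is the home or the away team. A rough sketch, assuming the team name in `spreads` is spelled exactly as in the `home` and `away` columns; the `spread_team` and `side` column names and the "unknown" fallback are made up for illustration:

```python
import numpy as np
import pandas as pd

teams = pd.DataFrame({"spreads": ["New England Patriots -7.0",
                                  "Atlanta Falcons 2.5",
                                  "New Orleans Saints -4.5"]})
teams["home"] = ["New England Patriots", "Carolina Panthers", "New Orleans Saints"]
teams["away"] = ["Miami Dolphins", "Atlanta Falcons", "Tampa Bay Buccaneers"]

# Split once from the right: column 0 is the team name, column 1 is the number.
parts = teams["spreads"].str.rsplit(" ", n=1, expand=True)
teams["spread_team"] = parts[0]
teams["spread_val"] = parts[1].astype(float)

# Comparing two columns with == is element-wise, so this is a row-by-row check.
is_home = teams["spread_team"] == teams["home"]
is_away = teams["spread_team"] == teams["away"]

# Label each row; "unknown" is a fallback for names that match neither column.
teams["side"] = np.where(is_home, "home", np.where(is_away, "away", "unknown"))
```

Unlike str.contains, which tests every row against one fixed pattern, the == comparison lines up the two columns row by row, which is what the question needs.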
