2

first post on here. Started toying around with the NFLScrapr package in python and am trying to create a scatter plot to display some info. Right now the scatter plot is only showing dots but I was wondering if there were a way to add labels to each plot from the corresponding data?

Started off with this

league_rushing_success = data.loc[(data['play_type']=='run') & (data['down']<=4)].groupby(by='posteam')[['epa','success','yards_gained']].mean()

trying to plot with this

#Make x and y variables for success rate data
x = league_rushing_success['success'].values
y = league_rushing_success['epa'].values
types = league_rushing_success['posteam'].values

fig, ax = plt.subplots(figsize=(10,10))

#Make a scatter plot with success rate data
ax.scatter(x, y,)

#Adding labels and text
ax.set_xlabel('Rush Success Rate', fontsize=14)
ax.set_ylabel('EPA', fontsize=14)
ax.set_title('Rush Success Rate and EPA', fontsize=18)
ax.text(.46, .39, 'Running Backs Dont Matter', fontsize=10, alpha=.7)

for i,type in enumerate(types):
    x = x_coords[i]
    y = y_coords[i]
    plt.scatter(x, y, marker='x', color='red')
    plt.text(x+0.3, y+0.3, type, fontsize=9)

The error I'm getting is

KeyError                                  Traceback (most recent call last)
//anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'posteam'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-81-ebdc88aed0ac> in <module>
      2 x = league_rushing_success['success'].values
      3 y = league_rushing_success['epa'].values
----> 4 types = league_rushing_success['posteam'].values
      5 
      6 fig, ax = plt.subplots(figsize=(10,10))

//anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

//anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'posteam'

enter image description here

1
  • To summarize, you try something like df["colname"], but "colname" is not the name of a column in df. The reason is that you have previously grouped by that column, so it's not part of the data anymore. Commented Aug 4, 2019 at 19:55

1 Answer 1

9

This line gives you an error because posteam is an index:

# error
types = league_rushing_success['posteam'].values

# try this
types = league_rushing_success.reset_index()['posteam'].values

From your code it is also not very clear what x_coords and y_coords mean (and would give you an error, too):

for i,type in enumerate(types):
    x = x_coords[i] # would give name 'x_coords' is not defined
    y = y_coords[i] # 'y_coords' is not defined

If you want to add labels to a plot, it might be a good idea to look into matplotlib.pyplot.annotate

# I would use something like this instead of `plt.text`
for i, txt in enumerate(types):
    ax.annotate(txt, (x[i], y[i]), xytext=(10,10), textcoords='offset points')
    plt.scatter(x, y, marker='x', color='red')

To sum it up:

x = league_rushing_success['success'].values
y = league_rushing_success['epa'].values
types = league_rushing_success.reset_index()['posteam'].values

fig, ax = plt.subplots(figsize=(10,10))
ax.scatter(x, y)

ax.set_xlabel('Rush Success Rate', fontsize=14)
ax.set_ylabel('EPA', fontsize=14)
ax.set_title('Rush Success Rate and EPA', fontsize=18)

for i, txt in enumerate(types):
    ax.annotate(txt, (x[i], y[i]), xytext=(10,10), textcoords='offset points')
    plt.scatter(x, y, marker='x', color='red')

enter image description here

Points are randomly generated, but the style of the plot should be the same.

I hope this helps.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.