I'm working through this video tutorial by Wes McKinney. I've reached the baby names examples, and I'm hitting the same problem both in the code I'm writing and in his code (BabyNames.ipynb).
For reference, I'm on a Mac (OS X 10.10.1) using:
- Python 2.7.6
- IPython 2.3.1
- Pandas 0.15.2
I can successfully do all of this:
from pandas import read_csv
names = read_csv('baby-names2.csv') # read the data in
boys = names[names.sex == 'boy'] # keep only the boys' rows
girls = names[names.sex == 'girl'] # keep only the girls' rows
# create a function
def get_quantile_count(group, quantile=0.5):
    df = group.sort_index(by='prop', ascending=False)
    return df.prop.cumsum().searchsorted(quantile)
# call the function
boys.groupby('year').apply(get_quantile_count)
This gives me output that looks like this (for brevity, only showing a small section of the data):
year
1880 [15]
1881 [15]
1882 [17]
1883 [17]
1884 [19]
1885 [20]
1886 [20]
1887 [21]
1888 [22]
1889 [22]
1890 [23]
1891 [24]
1892 [25]
I want to then plot this data, like this:
boys.groupby('year').apply(get_quantile_count).plot()
but it's giving me this error:
TypeError: Empty 'Series': no numeric data to plot
In the video, the output he shows does not have the square brackets [ ] around the numbers. I'm guessing this is what's causing my problem.
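If that guess is right, each value is probably a one-element array rather than a plain number, since searchsorted seems to return an array in my pandas version. Here is a minimal sketch of one thing I could try: pull the single number out inside the function. It uses a tiny made-up frame in place of baby-names2.csv, and it uses sort_values, which as far as I know is the spelling in newer pandas releases (on 0.15.2 the sort_index(by='prop') call above would stay as it is). I have not confirmed this is how the notebook is meant to work.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame standing in for baby-names2.csv (hypothetical data).
names = pd.DataFrame({
    'year': [1880, 1880, 1880, 1881, 1881, 1881],
    'name': ['John', 'William', 'James', 'John', 'William', 'James'],
    'prop': [0.4, 0.3, 0.2, 0.3, 0.15, 0.15],
    'sex': ['boy'] * 6,
})

def get_quantile_count(group, quantile=0.5):
    # Assumption: sort_values replaces sort_index(by=...) in newer pandas.
    df = group.sort_values(by='prop', ascending=False)
    pos = df.prop.cumsum().searchsorted(quantile)
    # pos may be a one-element array or a scalar depending on the pandas
    # version; np.ravel accepts both, and [0] extracts the number itself.
    return int(np.ravel(pos)[0])

boys = names[names.sex == 'boy']
counts = boys.groupby('year').apply(get_quantile_count)
# counts should now hold plain integers, so counts.plot() has numeric data.
```

With plain integers in the result, the brackets should disappear from the output and the plot call should no longer complain about missing numeric data.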
Does anyone know how to change this? I was writing the code myself while watching the video, but the same thing happens when I run the provided notebook, BabyNames.ipynb.