
This is a follow-up to this question: Pandas limit Series/DataFrame to range of values of one column

I'd next like to histogram the numbers in the "Age" column and then smooth the result (to reduce scatter). What's an elegant way to do this?

  • Why don't you try it yourself? It's easy. Look up Matplotlib's hist function, or pandas's hist function. Commented Feb 24, 2016 at 5:03
  • Density plot, Plotting probability, or maybe you just want df['Age'].plot(kind='density') or df['Age'].hist(). I'm personally unclear on what you mean by "then smooth the result". Commented Feb 24, 2016 at 5:53
  • It's easy to make a histogram, but I'm having some formatting issues when trying to smooth over it. It seems like the only way is to produce a new function from the histogram and smooth/interpolate over it using scipy's interpolate, but I thought there might be a more Pythonic way to do it, e.g., with a pandas-native function? Commented Feb 24, 2016 at 21:21
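
For reference, the histogram-then-smooth idea from the last comment can be sketched with NumPy and pandas alone, without scipy's interpolate. This is only a rough illustration: the data below is a made-up stand-in for the "Age" column, and the bin count (24) and rolling window (5) are arbitrary assumptions, not values from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real "Age" column is not shown in the question
rng = np.random.default_rng(0)
df = pd.DataFrame({"Age": rng.integers(18, 90, size=500)})

# Bin the ages: counts per bin plus the bin edges
counts, edges = np.histogram(df["Age"], bins=24)
centers = (edges[:-1] + edges[1:]) / 2

# Index the counts by bin center so the result plots against age
hist = pd.Series(counts, index=centers, name="count")

# Smooth with a centered 5-bin rolling mean; min_periods=1 keeps the end bins
smoothed = hist.rolling(window=5, center=True, min_periods=1).mean()
```

Calling smoothed.plot() would then draw the smoothed curve. A wider window gives a smoother but flatter result, so the window is a trade-off against the scatter you want to remove.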

1 Answer


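You can use Seaborn's distplot function, which by default plots a histogram together with a kernel density estimate and picks the bin size automatically. Note that distplot was deprecated in seaborn 0.11 in favor of histplot and displot, so newer versions will warn about it or may no longer provide it.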

import seaborn as sns
import numpy as np
import pandas as pd

# Some test data
np.random.seed(33454)
df = pd.DataFrame({'nb': np.random.randint(0, 1000, 100)})
df.sort_values('nb', inplace=True)

ax = sns.distplot(df['nb'])

