Outliers are often present in a data set. gnuplot uses the minimum and maximum of the data for autoscaling. Is there a way to make the scaling robust against outliers? I'm thinking about bind and a function that computes quantiles/percentiles.

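array DATA[1000]
do for [i=1:1000] { DATA[i] = invnorm(rand(0)) }   # standard normal samples
DATA[42] = 1e7                                     # one extreme outlier

plot DATA pt 7            # autoscaled: the outlier dominates the y range
plot [][-4:4] DATA pt 7   # y range chosen by hand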

After a lot of interactive zooming or manual adjusting of the plot range:

1 Answer

The stats command reports quartiles. You could bind a keystroke that rescales y using the most recent result from stats. Using the median ± 1.5 × (interquartile range) gives a range similar to the default whisker bars on a boxplot, which extend up to 1.5 × (interquartile range) beyond the quartiles.

bind 'S' 'set yrange [ STATS_median_y - 1.5 * (STATS_up_quartile_y-STATS_lo_quartile_y) : STATS_median_y + 1.5 * (STATS_up_quartile_y-STATS_lo_quartile_y) ]; replot'

stats DATA using 1:2
plot DATA using 1:2
pause -1 "Type 'S' in plot window to rescale"

Compare this to

plot DATA using 1:2 with boxplot
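
If no keystroke is wanted, the same idea can be applied non-interactively by setting the range before the first plot. The following is a minimal sketch, not a tested recipe: the variable names k, iqr, ylo and yhi are made up for illustration, and it assumes a gnuplot version in which stats and plot accept arrays (as the code above already does). With a single-column `using 2`, the stats variables carry no `_y` suffix.

```gnuplot
# Test data as in the question: standard normal samples plus one outlier.
array DATA[1000]
do for [i=1:1000] { DATA[i] = invnorm(rand(0)) }
DATA[42] = 1e7

# Column 2 of an array is the value; nooutput suppresses the printed report.
stats DATA using 2 nooutput

k   = 1.5                                    # hypothetical tuning factor; larger values widen the range
iqr = STATS_up_quartile - STATS_lo_quartile  # interquartile range
ylo = STATS_median - k * iqr
yhi = STATS_median + k * iqr

set yrange [ylo:yhi]
plot DATA using 1:2 pt 7
```

Note that stats has to run before `set yrange`, since stats only considers points inside the current ranges.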

2 Comments

Thanks, it works. The interquartile range corresponds to 2*0.6744 sigma in a Gaussian distribution. Times 1.5 is 1.01 sigma, which I find too aggressive. For 3 sigma the factor is 2.22.
I think you dropped a factor of 2 there. The interquartile range of a Gaussian is 1.35 sigma, so 1.5 times that is 2.03 sigma.
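
For reference, the arithmetic behind these two comments can be written out as follows. This assumes exactly Gaussian data, with the quartiles lying about 0.6745 standard deviations on either side of the median; the symbol k for the multiplier is introduced here only for illustration.

```latex
\begin{align*}
\mathrm{IQR} &= 2 \times 0.6745\,\sigma \approx 1.349\,\sigma \\
1.5 \times \mathrm{IQR} &\approx 2.02\,\sigma \\
k \times \mathrm{IQR} = 3\,\sigma \;&\Longrightarrow\; k = \frac{3}{1.349} \approx 2.22
\end{align*}
```

So median ± 1.5 × IQR covers roughly ±2 sigma, and a factor of about 2.22 is needed for ±3 sigma.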
