From: Fernando P. <fpe...@gm...> - 2009-12-15 22:26:27
|
On Tue, Dec 15, 2009 at 9:57 AM, Andrew Straw <str...@as...> wrote:
>
> notch_max = med + 1.57*iq/np.sqrt(row)
> notch_min = med - 1.57*iq/np.sqrt(row)
>
> Is this code actually calculating a meaningful value? If so, what?
>
From the statistics ignoramus in the room, so take this with a grain
of salt... I'd write that code as
    notch_max = med + (iq/2) * (pi/np.sqrt(row))
and it makes more sense. The notch limits are an estimate of the
interval around the median: the median plus or minus one-half of the
q3-q1 range times a normalization factor of pi/sqrt(n), where
n == row == len(d). The 1/sqrt(n) makes some sense, as it's the usual
statistical error normalization factor. I'm not so sure about the
multiplication by pi, and I can't find that exact formula in any quick
stats reference, but I'm sure someone who actually knows stats can
point out where it comes from.
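As a rough numerical check of the rewrite above, here is a minimal sketch that computes the notch limits with either constant. It assumes that row equals len(d), uses np.percentile in place of mlab.prctile, and the function name notch_limits is made up for illustration. It only shows that 1.57 and pi/2 give nearly the same limits; it says nothing about where the constant comes from.

```python
import numpy as np

def notch_limits(data, factor=1.57):
    """Return (notch_min, notch_max) as med -/+ factor * IQR / sqrt(n).

    Assumption: n is the number of samples, i.e. `row` in the quoted code.
    """
    d = np.asarray(data, dtype=float)
    q1, med, q3 = np.percentile(d, [25, 50, 75])
    iq = q3 - q1
    half_width = factor * iq / np.sqrt(len(d))
    return med - half_width, med + half_width

data = np.arange(1, 102)                       # 101 samples: 1, 2, ..., 101
print(notch_limits(data))                      # constant as in the source, 1.57
print(notch_limits(data, factor=np.pi / 2))    # constant rewritten as pi/2
```

The two printed intervals differ only slightly, since 1.57 and pi/2 agree to about 0.05%.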
Note that the code below does:
    if notch_max > q3:
        notch_max = q3
    if notch_min < q1:
        notch_min = q1
though matlab explicitly states in:
http://www.mathworks.com/access/helpdesk/help/toolbox/stats/boxplot.html
that
"""
Interval endpoints are the extremes of the notches or the centers of
the triangular markers. When the sample size is small, notches may
extend beyond the end of the box.
"""
So it seems to me that the more principled thing to do would be to
leave those notch markers outside the box if they land there, because
that's a warning of the robustness of the estimation. Clipping them to
q1/q3 is effectively hiding a problem...
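To illustrate the point about clipping, here is a small sketch with a five-element sample where the unclipped notch extends past the box. It assumes np.percentile as a stand-in for mlab.prctile and n = len(d); the function name and the clip flag are hypothetical and are not part of boxplot().

```python
import numpy as np

def box_and_notch(data, clip=True):
    """Return ((q1, q3), (notch_min, notch_max)) for one sample.

    With clip=True the notch is limited to the box, as in the code
    quoted above; with clip=False it is left where the formula puts it.
    """
    d = np.asarray(data, dtype=float)
    q1, med, q3 = np.percentile(d, [25, 50, 75])
    half_width = 1.57 * (q3 - q1) / np.sqrt(len(d))
    notch_min, notch_max = med - half_width, med + half_width
    if clip:
        notch_min = max(notch_min, q1)
        notch_max = min(notch_max, q3)
    return (q1, q3), (notch_min, notch_max)

small = [1, 2, 3, 4, 5]
print(box_and_notch(small, clip=True))    # notch pinned to q1 and q3
print(box_and_notch(small, clip=False))   # notch extends beyond the box
```

With clip=False the notch ends fall outside [q1, q3] for this sample, which is the small-sample warning that clipping hides.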
cheers,
f
|
|
From: Andrew S. <str...@as...> - 2009-12-15 17:58:06
|
Hi,
I've been reading about box plots and examining the source code for
boxplot() lately. While there doesn't seem to be a convention about what
the notch specifies, I can't find any justification (or text describing)
what exactly the MPL notch is. The source code is:
    # get median and quartiles
    q1, med, q3 = mlab.prctile(d,[25,50,75])
    iq = q3 - q1
    notch_max = med + 1.57*iq/np.sqrt(row)
    notch_min = med - 1.57*iq/np.sqrt(row)
Is this code actually calculating a meaningful value? If so, what?
The original commit was r1098, which doesn't offer a useful comment
either (only "aaplied several sf patches" ... looking through the SF bug
tracker, I couldn't find anything relevant from before the commit date
of 2005-03-28).
|
From: Andrew S. <str...@as...> - 2009-12-15 17:23:08
|
The following (uncommitted) test currently fails. The reason is that
mlab.prctile(x,50) doesn't handle even length sequences according to the
numpy and wikipedia convention for the definition of median. Do we agree
that it should pass?
Not only would I commit the test, but I also have a fix to make it pass,
derived from scipy.stats.scoreatpercentile().
This would affect boxplot, if not more.
def test_prctile():
    # test odd lengths
    x=[1,2,3]
    assert mlab.prctile(x,50)==np.median(x)
    # test even lengths
    x=[1,2,3,4]
    assert mlab.prctile(x,50)==np.median(x)
    # derived from email sent by jason-sage to MPL-user on 20090914
    ob1=[1,1,2,2,1,2,4,3,2,2,2,3,4,5,6,7,8,9,7,6,4,5,5]
    p = [75]
    expected = [5.5]
    # test vectorized
    actual = mlab.prctile(ob1,p)
    assert np.allclose( expected, actual )
    # test scalar
    for pi, expectedi in zip(p,expected):
        actuali = mlab.prctile(ob1,pi)
        assert np.allclose( expectedi, actuali )
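For reference, here is a minimal sketch of a percentile that interpolates linearly between neighbouring order statistics, which is the convention that makes the even-length median case above pass. It is not the actual fix mentioned in this message, only an illustration in the same spirit; the name prctile_interp is hypothetical.

```python
import numpy as np

def prctile_interp(x, p):
    """Percentile(s) of x with linear interpolation between sorted values.

    p may be a scalar or a sequence of percentages in [0, 100].
    """
    values = np.sort(np.asarray(x, dtype=float))
    p = np.asarray(p, dtype=float)
    # Fractional position of each percentile in the sorted data.
    idx = p / 100.0 * (len(values) - 1)
    lo = np.floor(idx).astype(int)
    hi = np.ceil(idx).astype(int)
    frac = idx - lo
    # Weighted average of the two neighbouring order statistics.
    return values[lo] * (1.0 - frac) + values[hi] * frac

print(prctile_interp([1, 2, 3, 4], 50))   # midpoint of 2 and 3
```

For an even-length sequence such as [1, 2, 3, 4], the 50th percentile falls halfway between the two middle values, which agrees with np.median.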
|