|
From: Francesco M. <fra...@gm...> - 2012-08-24 15:13:26
|
2012/8/24 Carlos Grohmann <car...@gm...>:
> Hello all,
>
> I just did a fresh macports install, and installed py27-matplotlib-basemap,
> so all dependencies were installed as well.
>
> After installing python, I did run port-select (or something like it) to
> make sure I'm using macports python.
>
> My problem is that I can't run it:
>
> GuanoMac:~ guano$ python
> Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from mpl_toolkits.basemap import Basemap
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: No module named basemap
>
> Anyone experienced in this kind of installation could share hints?

If you run

    import sys
    print sys.path

you can check whether the basemap directory is on your path.

If it is not, you can add it either in the script/session, by appending/inserting/extending sys.path (which is a list), e.g.:

    sys.path.append( "dir/to/basemap" )

or in your .profile, .bash_profile or .bashrc file (that way it is loaded in every session):

    export PYTHONPATH=$PYTHONPATH:dir/to/basemap

cheers,
Francesco

> tks
>
> Carlos
>
> --
> Prof. Carlos Henrique Grohmann
> Institute of Geosciences - Univ. of São Paulo, Brazil
> - Digital Terrain Analysis | GIS | Remote Sensing -
>
> http://carlosgrohmann.com
> ________________
> Can’t stop the signal.
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Matplotlib-users mailing list
> Mat...@li...
> https://lists.sourceforge.net/lists/listinfo/matplotlib-users |
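For readers who land here from a search, the check Francesco describes can be sketched as a small script. It is written for Python 3 (the thread itself uses Python 2's print statement), and the helper name `locate_module` is made up for this illustration. It prints which interpreter is running, lists the search path, and reports where a module would be imported from, which may help tell a path problem apart from running a different Python than the one the package was installed for.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a module would be imported from, or None if not found.

    `locate_module` is a hypothetical helper for this sketch, not part of
    matplotlib or basemap.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. mpl_toolkits) is itself missing.
        return None
    if spec is None:
        return None
    return spec.origin


# Which interpreter is actually running?
print("interpreter:", sys.executable)

# Which directories are searched for modules?
for entry in sys.path:
    print("  path entry:", entry)

# Where would basemap be loaded from, if at all?
print("basemap found at:", locate_module("mpl_toolkits.basemap"))
```

If the last line prints `None` while the package is known to be installed, comparing `sys.executable` with the interpreter the package manager installed is a reasonable next step before editing `PYTHONPATH`.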
|
From: Carlos G. <car...@gm...> - 2012-08-24 14:47:15
|
Hello all,

I just did a fresh macports install, and installed py27-matplotlib-basemap, so all dependencies were installed as well.

After installing python, I did run port-select (or something like it) to make sure I'm using macports python.

My problem is that I can't run it:

GuanoMac:~ guano$ python
Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from mpl_toolkits.basemap import Basemap
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named basemap

Anyone experienced in this kind of installation could share hints?

tks

Carlos

--
Prof. Carlos Henrique Grohmann
Institute of Geosciences - Univ. of São Paulo, Brazil
- Digital Terrain Analysis | GIS | Remote Sensing -
http://carlosgrohmann.com
________________
Can’t stop the signal. |
|
From: Joe B. <ma...@jd...> - 2012-08-24 13:38:52
|
Thanks Ben,

Works fine.

Regards,
Joseph David Borġ
http://www.jdborg.com

On 24 August 2012 13:41, Benjamin Root <ben...@ou...> wrote:
>
> On Fri, Aug 24, 2012 at 6:35 AM, Joe Borġ <ma...@jd...> wrote:
>
>> Hi,
>>
>> I've reinstalled numpy and the error from gcc has changed. Please see
>> log, all the system information should be in there.
>>
> Matplotlib 1.1.1 does not work with py3k. We are just about ready to
> release v1.2.0 which will work with py3k. Please feel free to checkout the
> master branch on our github page and test it out before the release!
>
> Ben Root |
|
From: Benjamin R. <ben...@ou...> - 2012-08-24 12:41:43
|
On Fri, Aug 24, 2012 at 6:35 AM, Joe Borġ <ma...@jd...> wrote:
> Hi,
>
> I've reinstalled numpy and the error from gcc has changed. Please see
> log, all the system information should be in there.
>

Matplotlib 1.1.1 does not work with py3k. We are just about ready to release v1.2.0 which will work with py3k. Please feel free to checkout the master branch on our github page and test it out before the release!

Ben Root |
|
From: Joe B. <ma...@jd...> - 2012-08-24 10:35:58
|
Hi,

I've reinstalled numpy and the error from gcc has changed. Please see log, all the system information should be in there.

Regards,
Joseph David Borġ
http://www.jdborg.com |
|
From: Eric F. <ef...@ha...> - 2012-08-24 06:16:03
|
On 2012/08/23 6:52 PM, Andrew Nelson wrote:
> Dear list,
> apologies for what might be a simple question. I am creating an
> application that uses matplotlib for plotting, using the Qt4Agg backend.
> I can create the figures without a problem.
>
> However, I wish to save the state of the application, including the
> graphs. The complicating factor is that the user may have altered the
> appearance of the graphs via a NavigationToolbar.
>
> I have no problems saving the data that makes up the graphs, but how do
> I save the properties of the graphs (line colour, linewidth, etc)?
> I tried using matplotlib.artist.ArtistInspector(Line2D).properties().
> This gives a dictionary of all the properties. However, when I try to
> pickle this I get pickling errors:
>
> cPickle.PicklingError: Can't pickle <class
> 'matplotlib.axes.AxesSubplot'>: attribute lookup
> matplotlib.axes.AxesSubplot failed
>
> I am sure that there is an easy way of achieving this, I just can't see
> it in the documentation. I appreciate any help the list is able to give me.

Maybe there is not an easy way...

See https://github.com/matplotlib/matplotlib/pull/1020.

Eric

>
> regards,
> Andrew
> |
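One possible workaround, sketched here for later readers, is to avoid pickling the artists at all and save only a handful of plain-valued properties, then reapply them to a freshly created line. This is not the approach taken in the pull request Eric links to. The property list `SAVED_PROPS` and the helper names are assumptions chosen for illustration, and the non-interactive Agg backend is used only so the sketch runs without a GUI.

```python
import pickle

import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, just for this sketch
import matplotlib.pyplot as plt

# Hypothetical choice of properties to persist; all have plain getters
# (get_color, get_linewidth, ...) and matching setters on Line2D.
SAVED_PROPS = ("color", "linewidth", "linestyle", "marker", "markersize")


def line_style(line):
    """Collect a dict of simple style values from a Line2D."""
    return {name: getattr(line, "get_" + name)() for name in SAVED_PROPS}


def restore_line_style(line, style):
    """Reapply a previously saved style dict to a Line2D."""
    line.set(**style)


# Original figure, styled as a user might have done interactively.
fig, ax = plt.subplots()
(line,) = ax.plot([0, 1, 2], [0, 1, 4], color="red", linewidth=2.5, linestyle="--")

# The dict holds only strings and floats, so it pickles without trouble.
blob = pickle.dumps(line_style(line))

# Later session: rebuild the plot from saved data, then restore the style.
fig2, ax2 = plt.subplots()
(new_line,) = ax2.plot([0, 1, 2], [0, 1, 4])
restore_line_style(new_line, pickle.loads(blob))
print(line_style(new_line))
```

The trade-off is that only the listed properties survive; anything not in `SAVED_PROPS` (axis limits, labels, legends) would need the same treatment.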
|
From: Andrew N. <and...@gm...> - 2012-08-24 04:52:14
|
Dear list,

apologies for what might be a simple question. I am creating an application that uses matplotlib for plotting, using the Qt4Agg backend. I can create the figures without a problem.

However, I wish to save the state of the application, including the graphs. The complicating factor is that the user may have altered the appearance of the graphs via a NavigationToolbar.

I have no problems saving the data that makes up the graphs, but how do I save the properties of the graphs (line colour, linewidth, etc)? I tried using matplotlib.artist.ArtistInspector(Line2D).properties(). This gives a dictionary of all the properties. However, when I try to pickle this I get pickling errors:

cPickle.PicklingError: Can't pickle <class 'matplotlib.axes.AxesSubplot'>: attribute lookup matplotlib.axes.AxesSubplot failed

I am sure that there is an easy way of achieving this, I just can't see it in the documentation. I appreciate any help the list is able to give me.

regards,
Andrew

--
_____________________________________
Dr. Andrew Nelson
_____________________________________ |
|
From: Fernando P. <fpe...@gm...> - 2012-08-23 18:27:31
|
Dear colleagues,

next year's SIAM conference on Computational Science and Engineering, CSE'13, will take place in Boston, February 25-March 1 (http://www.siam.org/meetings/cse13), and for this version there will be a track focused on the topic of Big Data. This term has rapidly risen in recent discussions of science and even of mainstream business computing, and for good reasons. Today virtually all disciplines are facing a flood of quantitative information whose volumes have often grown faster than the quality of our tools for extracting insight from these data. SIAM hopes that CSE'13 will provide an excellent venue for discussing these problems, from the vantage point offered by a community whose expertise combines analytical insights, algorithmic development, software engineering and domain-specific applications.

As part of this event, Titus Brown (http://ged.msu.edu) and I are organizing a minisymposium where we would like to have a group of presentations that address both novel algorithmic ideas and computational approaches as well as domain-specific problems. Data doesn't appear in a vacuum, and data from different domains presents a mix of common problems along with questions that may be specific to each; we hope that by engaging a dialog between those working on algorithmic and implementation questions and those with specific problems from the field, valuable insights can be obtained.

If you would like to contribute to this minisymposium, please contact us directly at:

"C. Titus Brown" <ct...@ms...>,
"Fernando Perez" <Fer...@be...>

with your name and affiliation, the title of your proposed talk and a brief description (actual abstracts are due later so an informal description will suffice for now), by Wednesday August 29.

For more details on the submission process, see:
http://www.siam.org/meetings/cse13/submissions.php

Please forward this to any interested colleagues.

Regards,
Titus and Fernando. |
|
From: Nils G. <nil...@gm...> - 2012-08-23 16:33:43
|
Hi,

I have noticed that matplotlib's plot_date() function occasionally places the label at the date (or month or year) end instead of the beginning.

I attach a script illustrating the problem: I plot a range of y-values for a range of dates. The y-value 29 should be associated with 11 Nov 2011. In figure 1, left, I am using UTC and everything is fine. In the right plot I am using the 'Europe/London' timezone and now the graph suggests that the y-value 29 is associated with 10 Nov 2011. Moving the mouse over the plot and watching the status bar at the bottom suggests that the label is indeed placed at the end of the date rather than the beginning.

I have observed the same behaviour with month and year labels being shifted in non-UTC timezones.

I have my time data as POSIX timestamps, so using datetime objects in plotting directly is not a (convenient) option. (In the sample code I use datetime only to generate example POSIX timestamps at midnight.) Instead I would like to rely on the epoch2num() helper function supplied by matplotlib.

I am using matplotlib-1.1.0, numpy-1.6.1.

Any help appreciated,
thanks,
Nils. |
|
From: Benjamin R. <ben...@ou...> - 2012-08-22 17:27:48
|
On Wed, Aug 22, 2012 at 12:31 PM, Joe Borġ <ma...@jd...> wrote:
> Not sure if this is an issue with an out-of-date GCC or if something else
> is wrong. I've got 1.1.0 on no problem.
>
> $python setup.py build
> ...
> gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
> -fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_ARRAY_API -DPYCXX_ISO_CPP_LIB=1
> -I/software/Python/272/lib/python2.7/site-packages/numpy/core/include
> -I/usr/include/freetype2 -I/usr/local/include -I/usr/include -I.
> -I/software/Python/272/include/python2.7 -c src/ft2font.cpp -o
> build/temp.linux-x86_64-2.7/src/ft2font.o
> In file included from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:7,
> from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
> from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
> from src/ft2font.cpp:7:
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:120:2:
> error: #error npy_cdouble definition is not compatible with C99 complex
> definition ! Please contact Numpy maintainers and give detailed information
> about your compiler and platform
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:127:2:
> error: #error npy_cfloat definition is not compatible with C99 complex
> definition ! Please contact Numpy maintainers and give detailed information
> about your compiler and platform
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:134:2:
> error: #error npy_clongdouble definition is not compatible with C99 complex
> definition ! Please contact Numpy maintainers and give detailed information
> about your compiler and platform
> In file included from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26,
> from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
> from src/ft2font.cpp:7:
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:
> In function 'int _import_array()':
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1226:
> error: 'NPY_ABI_VERSION' was not declared in this scope
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1232:
> error: 'NPY_API_VERSION' was not declared in this scope
> error: command 'gcc' failed with exit status 1
>
> $ gcc --version
> gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6)
> Copyright (C) 2010 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>

Joe,

This appears to be a problem with NumPy. I would suggest sending this email to the numpy-discussion list. Be sure to include detailed information about your compiler, OS, and your machine.

Cheers!
Ben Root |
|
From: Joe B. <ma...@jd...> - 2012-08-22 17:04:27
|
Not sure if this is an issue with an out-of-date GCC or if something else
is wrong. I've got 1.1.0 on no problem.
$python setup.py build
...
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_ARRAY_API -DPYCXX_ISO_CPP_LIB=1
-I/software/Python/272/lib/python2.7/site-packages/numpy/core/include
-I/usr/include/freetype2 -I/usr/local/include -I/usr/include -I.
-I/software/Python/272/include/python2.7 -c src/ft2font.cpp -o
build/temp.linux-x86_64-2.7/src/ft2font.o
In file included from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:7,
from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
from src/ft2font.cpp:7:
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:120:2:
error: #error npy_cdouble definition is not compatible with C99 complex
definition ! Please contact Numpy maintainers and give detailed information
about your compiler and platform
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:127:2:
error: #error npy_cfloat definition is not compatible with C99 complex
definition ! Please contact Numpy maintainers and give detailed information
about your compiler and platform
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:134:2:
error: #error npy_clongdouble definition is not compatible with C99 complex
definition ! Please contact Numpy maintainers and give detailed information
about your compiler and platform
In file included from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26,
from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
from src/ft2font.cpp:7:
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:
In function 'int _import_array()':
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1226:
error: 'NPY_ABI_VERSION' was not declared in this scope
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1232:
error: 'NPY_API_VERSION' was not declared in this scope
error: command 'gcc' failed with exit status 1
$ gcc --version
gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Regards,
Joseph David Borġ
http://www.jdborg.com
|
|
From: Jeffrey B. <jbl...@al...> - 2012-08-22 15:29:43
|
On Aug 22, 2012, at 10:04 AM, Virgil Stokes wrote:
> On 21-Aug-2012 17:52, Jeffrey Blackburne wrote:
>>
>> On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:
>>
>>> In reference to my previous email.
>>>
>>> How can I find the outliers (samples points beyond the whiskers)
>>> in the data
>>> used for the boxplot?
>>>
>>> Here is a code snippet that shows how it was used for the timings
>>> data (a list
>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data
>>> values),
>>> ...
>>> ...
>>> ...
>>> # Box Plots
>>> plt.subplot(2,1,2)
>>> timings = [y1,y2,y3,y4]
>>> pos = np.array(range(len(timings)))+1
>>> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>> positions=pos, notch=1, bootstrap=5000 )
>>>
>>> plt.xlabel('Algorithm')
>>> plt.ylabel('Execution time (sec)')
>>> plt.ylim(0.9*ymin,1.1*ymax)
>>>
>>> plt.setp(bp['whiskers'], color='k', linestyle='-' )
>>> plt.setp(bp['fliers'], markersize=3.0)
>>> plt.title('Box plots (%4d trials)' %(n))
>>> plt.show()
>>> ...
>>> ...
>>> ...
>>>
>>> Again my questions:
>>> 1) How to get the value of the median?
>>
>> This is easily calculated from your data. Numpy will even do it
>> for you: np.median(timings)
>>
>>> 2) How to find the outliers (outside the whiskers)?
>>
>> From the boxplot documentation: the whiskers extend to the most
>> extreme data point within distance X of the bottom or top of the
>> box, where X is 1.5 times the extent of the box. Any points more
>> extreme than that are the outliers. The box itself of course
>> extends from the 25th percentile to the 75th percentile of your
>> data. Again, you can easily calculate these values from your data.
>>
>>> 3) How to find the width of the notch?
>>
>> Again, from the docs: with bootstrap=5000, it calculates the width
>> of the notch by bootstrap resampling your data (the timings array)
>> 5000 times and finding the 95% confidence interval of the median,
>> and uses that as the notch width. You can redo that yourself
>> pretty easily. Here is some bootstrap code for you to adapt:
>> http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html
>>
>> I encourage you to read the documentation! This page is very
>> useful for reference:
>> http://matplotlib.sourceforge.net/api/pyplot_api.html
>>
>> -Jeff
>>
> Yes Jeff,
> These are very useful links; however, box plots have a parameter
> called the "adjacent value" (from the McGill reference),
>
> "The plotted whisker extends to the adjacent value, which is the
> most extreme data value that is not an outlier."
>
> It seems there should be one for the lower and one for the upper
> whisker --- how can one get these two values from boxplot?
Look at bp['whiskers']
For those who got here by searching: bp is the object returned by
plt.boxplot()
> Also, is there anyway to directly get the indices of the outliers?
Look into np.where()
|
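As a sketch of what Jeff's two hints amount to, the adjacent values and the outlier indices can also be computed directly from the data with NumPy, using the rule quoted earlier in the thread (whiskers reach the most extreme points within 1.5 times the box height of the box). The function name `tukey_summary` and the sample data are made up for this illustration, and the percentile interpolation NumPy uses may differ slightly from what a given matplotlib version does internally, so results are worth comparing against `bp['whiskers']` on real data.

```python
import numpy as np


def tukey_summary(values, whis=1.5):
    """Median, adjacent values and outlier indices for one data set.

    Hypothetical helper for this sketch; it follows the 1.5 * IQR rule
    described in the thread rather than calling into matplotlib.
    """
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr

    # Points inside the fences; the extremes of these are the adjacent values.
    inside = (data >= low_fence) & (data <= high_fence)

    return {
        "median": float(np.median(data)),
        "lower_adjacent": float(data[inside].min()),
        "upper_adjacent": float(data[inside].max()),
        # np.where gives the positions of the outliers in the original array.
        "outlier_indices": np.where(~inside)[0],
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
summary = tukey_summary(sample)
print(summary)
```

For the four timing lists in the thread, the function would be called once per list, e.g. `[tukey_summary(y) for y in timings]`.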
|
From: Virgil S. <vs...@it...> - 2012-08-22 14:24:51
|
On 22-Aug-2012 11:23, Virgil Stokes wrote:
> On 21-Aug-2012 17:59, Paul Hobson wrote:
>> On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs...@it...> wrote:
>>> On 21-Aug-2012 17:50, Paul Hobson wrote:
>>>> On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
>>>>> In reference to my previous email.
>>>>>
>>>>> How can I find the outliers (samples points beyond the whiskers) in the
>>>>> data
>>>>> used for the boxplot?
>>>>>
>>>>> Here is a code snippet that shows how it was used for the timings data (a
>>>>> list
>>>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>>>> ...
>>>>> ...
>>>>> ...
>>>>> # Box Plots
>>>>> plt.subplot(2,1,2)
>>>>> timings = [y1,y2,y3,y4]
>>>>> pos = np.array(range(len(timings)))+1
>>>>> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>>>> positions=pos, notch=1, bootstrap=5000 )
>>>>>
>>>>> plt.xlabel('Algorithm')
>>>>> plt.ylabel('Execution time (sec)')
>>>>> plt.ylim(0.9*ymin,1.1*ymax)
>>>>>
>>>>> plt.setp(bp['whiskers'], color='k', linestyle='-' )
>>>>> plt.setp(bp['fliers'], markersize=3.0)
>>>>> plt.title('Box plots (%4d trials)' %(n))
>>>>> plt.show()
>>>>> ...
>>>>> ...
>>>>> ...
>>>>>
>>>>> Again my questions:
>>>>> 1) How to get the value of the median?
>>>>> 2) How to find the outliers (outside the whiskers)?
>>>>> 3) How to find the width of the notch?
>>>> Virgil, the objects stuffed inside the `bp` dictionary should have
>>>> methods to retrieve their values. Let's see:
>>>>
>>>> In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
>>>>
>>>> In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
>>>>
>>>> In [37]: # Question 1
>>>>     ...: print('medians')
>>>>     ...: for n, median in enumerate(bp['medians']):
>>>>     ...:     print('%d: %f' % (n, median.get_ydata()[0]))
>>>>     ...:
>>>> medians
>>>> 0: 6.339692
>>>> 1: 3.449320
>>>> 2: 4.503706
>>>>
>>>> In [38]: # Question 2
>>>>     ...: print('fliers')
>>>>     ...: for n in range(0, len(bp['fliers']), 2):
>>>>     ...:     print('%d: upper outliers = \t' % (n/2,))
>>>>     ...:     print(bp['fliers'][n].get_ydata())
>>>>     ...:     print('\n%d: lower outliers = \t' % (n/2,))
>>>>     ...:     print(bp['fliers'][n+1].get_ydata())
>>>>     ...:     print('\n')
>>>>     ...:
>>> You had no outliers!
>>>
>>>> In [39]: # Question 3
>>>>     ...: print('Confidence Intervals')
>>>>     ...: for n, box in enumerate(bp['boxes']):
>>>>     ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
>>>>     ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
>>>>     ...:
>>>> Confidence Intervals
>>>> 0: lower CI: 1.760701
>>>> 0: upper CI: 10.102221
>>>> 1: lower CI: 1.626386
>>>> 1: upper CI: 5.601927
>>>> 2: lower CI: 2.173173
>>>>
>>>> Hope that helps,
>>>> -paul
>>> Just what I was looking for Paul! Thanks very much.
>>>
>>> One final question --- Where can I find the documentation that answers my
>>> questions and gives more details about the equations used for the width of
>>> notch. etc.?
>>>
>>> Thanks again :-)
>> That should all be in the boxplot docstring. Do you use ipython? If
>> not, you should :)
>>
>> if so, just do `plt.boxplot?` at the ipython terminal and it'll show up.
>> -paul
> I still have a problem...
> Let me show the updated code snippet again
> ...
> ...
> ...
> # Box Plots
> iplt += 1
> plt.figure(iplt)
> timings = [ya[0],ya[1],ya[2],ya[3]]
> pos = np.array(range(len(timings)))+1
> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
> positions=pos, notch=1, bootstrap=5000 )
> print ('medians')
> for nn,median in enumerate(bp['medians']):
>     print('%d: %f' %(nn,median.get_ydata()[0]))
>
> print('fliers')
> for nn in range(0, len(bp['fliers']), 2):
>     print('%d: upper outliers = \t' % (nn/2,))
>     print(bp['fliers'][nn].get_ydata())
>     print('\n%d: lower outliers = \t' % (nn/2,))
>     print(bp['fliers'][nn+1].get_ydata())
>     print('\n')
>
> print('Confidence Intervals')
> for nn, box in enumerate(bp['boxes']):
>     print('%d: lower CI: %f' % (nn, box.get_ydata()[2]))<--- FAILS!
>     print('%d: upper CI: %f' % (nn, box.get_ydata()[4]))
> ...
> ...
> ...
>
> Medians and fliers work perfectly; but, I get the following error message when
> trying to access the confidence intervals:
>
> AttributeError: 'PathPatch' object has no attribute 'get_ydata'
>
> Note, I am using boxplot with 4 sets of data and I am using matplotlib vers. 1.1.0.
>
> Any suggestions on how to fix this problem?
I found the solution,
one must have,
patch_artist=False
in the boxplot call.
:-)
|
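If the filled boxes from patch_artist=True are wanted, an alternative to switching it off may be to read the notch limits from the patch's path instead. The sketch below assumes, as Paul's snippet does, that the notch limits sit at positions 2 and 4 of the box outline; that ordering is an implementation detail and may differ between matplotlib versions. The helper name `notch_interval` and the sample data are made up for illustration, and the Agg backend is chosen only so the sketch runs headless.

```python
import numpy as np

import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, just for this sketch
import matplotlib.pyplot as plt


def notch_interval(box):
    """Return (lower, upper) notch limits for one box artist.

    Hypothetical helper: works for a Line2D (patch_artist=False) via
    get_ydata() and for a PathPatch (patch_artist=True) via its path.
    """
    if hasattr(box, "get_ydata"):
        y = np.asarray(box.get_ydata())
    else:
        y = box.get_path().vertices[:, 1]
    # Assumed vertex order, as in the thread: index 2 is the lower notch
    # limit and index 4 the upper one.
    return float(y[2]), float(y[4])


data = np.arange(1.0, 41.0) ** 1.5  # deterministic, skewed sample

intervals = {}
for use_patches in (False, True):
    fig, ax = plt.subplots()
    bp = ax.boxplot(data, notch=True, patch_artist=use_patches)
    intervals[use_patches] = notch_interval(bp["boxes"][0])
    plt.close(fig)

print(intervals)
```

No bootstrap argument is passed here, so the notch limits are deterministic and the two variants can be compared directly.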
|
From: Virgil S. <vs...@it...> - 2012-08-22 14:04:10
|
On 21-Aug-2012 17:52, Jeffrey Blackburne wrote:
>
> On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:
>
>> In reference to my previous email.
>>
>> How can I find the outliers (samples points beyond the whiskers) in the data
>> used for the boxplot?
>>
>> Here is a code snippet that shows how it was used for the timings data (a list
>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>> ...
>> ...
>> ...
>> # Box Plots
>> plt.subplot(2,1,2)
>> timings = [y1,y2,y3,y4]
>> pos = np.array(range(len(timings)))+1
>> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>> positions=pos, notch=1, bootstrap=5000 )
>>
>> plt.xlabel('Algorithm')
>> plt.ylabel('Execution time (sec)')
>> plt.ylim(0.9*ymin,1.1*ymax)
>>
>> plt.setp(bp['whiskers'], color='k', linestyle='-' )
>> plt.setp(bp['fliers'], markersize=3.0)
>> plt.title('Box plots (%4d trials)' %(n))
>> plt.show()
>> ...
>> ...
>> ...
>>
>> Again my questions:
>> 1) How to get the value of the median?
>
> This is easily calculated from your data. Numpy will even do it for you:
> np.median(timings)
>
>> 2) How to find the outliers (outside the whiskers)?
>
> From the boxplot documentation: the whiskers extend to the most extreme data
> point within distance X of the bottom or top of the box, where X is 1.5 times
> the extent of the box. Any points more extreme than that are the outliers. The
> box itself of course extends from the 25th percentile to the 75th percentile
> of your data. Again, you can easily calculate these values from your data.
>
>> 3) How to find the width of the notch?
>
> Again, from the docs: with bootstrap=5000, it calculates the width of the
> notch by bootstrap resampling your data (the timings array) 5000 times and
> finding the 95% confidence interval of the median, and uses that as the notch
> width. You can redo that yourself pretty easily. Here is some bootstrap code
> for you to adapt:
> http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html
>
> I encourage you to read the documentation! This page is very useful for
> reference:
> http://matplotlib.sourceforge.net/api/pyplot_api.html
>
> -Jeff
>
Yes Jeff,
These are very useful links; however, box plots have a parameter called the
"adjacent value" (from the McGill reference),
"The plotted whisker extends to the adjacent value, which is the most extreme
data value that is not an outlier."
It seems there should be one for the lower and one for the upper whisker --- how
can one get these two values from boxplot?
Also, is there anyway to directly get the indices of the outliers?
|
|
From: Petro <x....@gm...> - 2012-08-22 14:02:55
|
Michael Droettboom <md...@st...> writes:

> Can you try the GtkAgg backend instead and confirm the bug isn't there?
> The "pure" Gtk backend doesn't see a lot of use these days and isn't
> very well tested.
>
> Mike
>

Thanks. It solved the problem. |
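For scripts that only write image files, as the ones in this thread do, the backend can also be chosen explicitly at the top of the script. The sketch below swaps in the non-interactive Agg backend rather than the GtkAgg backend Mike suggested, which is an assumption about what is wanted (no window on screen), and it writes to the system temporary directory instead of the poster's own path.

```python
import os
import tempfile

import matplotlib

# Select the backend before pyplot is imported. "Agg" renders to files
# only; the thread's fix used "GtkAgg", which also opens windows.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

golden_mean = (np.sqrt(5) - 1.0) / 2.0       # aesthetic ratio, as in the thread
fig_width = 5.6                              # width in inches
fig_height = fig_width * golden_mean * 3     # tall figure, as in the second script

# Passing figsize directly avoids changing rcParams globally.
fig, axes = plt.subplots(2, 1, figsize=(fig_width, fig_height))
for ax in axes:
    ax.plot(np.random.rand(10), "o")

out_path = os.path.join(tempfile.gettempdir(), "fit_rates2.png")
fig.savefig(out_path)
plt.close(fig)

print(plt.get_backend(), out_path)
```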
|
From: Michael D. <md...@st...> - 2012-08-22 12:28:43
|
Can you try the GtkAgg backend instead and confirm the bug isn't there? The "pure" Gtk backend doesn't see a lot of use these days and isn't very well tested.

Mike

On 08/22/2012 08:17 AM, Petro Khoroshyy wrote:
> Damon McDougall <dam...@gm...> writes:
>
>> On Wed, Aug 22, 2012 at 11:28:54AM +0200, Petro wrote:
>>> Hi list.
>>> I generate some png images using matplotlib, and get very different
>>> results depending on figuresize
>>> __________________________________________________________________
>>> from pylab import figure, plot
>>> import pylab as plt
>>> import numpy as np
>>> figure()
>>> plt.subplot(2,1,1)
>>> plot(np.random.rand(10),'o')
>>> plt.subplot(2,1,2)
>>> plot(np.random.rand(10),'o')
>>> pic_name='fit_rates1.png'
>>> path_name='/home/petro/tmp/'
>>> plt.savefig(path_name + pic_name)
>>> __________________________________________________________________
>>>
>>> the code above generates the following image:
>>> https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
>>>
>>> now if I increase a figure size parameter:
>>> __________________________________________________________________
>>> from pylab import figure, plot
>>> import pylab as plt
>>> import numpy as np
>>> plt.ioff()
>>> from matplotlib import rcParams
>>> golden_mean = (np.sqrt(5)-1.0)/2.0 # Aesthetic ratio
>>> fig_width = 5.6 # width in inches
>>> fig_height = fig_width*golden_mean # height in inches
>>> rcParams['figure.figsize']=fig_width, fig_height*3
>>> figure()
>>> plt.subplot(2,1,1)
>>> plot(np.random.rand(10),'o')
>>> plt.subplot(2,1,2)
>>> plot(np.random.rand(10),'o')
>>> pic_name='fit_rates2.png'
>>> path_name='/home/petro/tmp/'
>>> plt.savefig(path_name + pic_name)
>>>
>> What backend are you using?
>>
>> print plt.get_backend()
> It outputs GTK. |
|
From: Petro K. <kho...@gm...> - 2012-08-22 12:17:53
|
Damon McDougall <dam...@gm...> writes:

> On Wed, Aug 22, 2012 at 11:28:54AM +0200, Petro wrote:
>> Hi list.
>> I generate some png images using matplotlib, and get very different
>> results depending on figuresize
>> __________________________________________________________________
>> from pylab import figure, plot
>> import pylab as plt
>> import numpy as np
>> figure()
>> plt.subplot(2,1,1)
>> plot(np.random.rand(10),'o')
>> plt.subplot(2,1,2)
>> plot(np.random.rand(10),'o')
>> pic_name='fit_rates1.png'
>> path_name='/home/petro/tmp/'
>> plt.savefig(path_name + pic_name)
>> __________________________________________________________________
>>
>> the code above generates the following image:
>> https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
>>
>> now if I increase a figure size parameter:
>> __________________________________________________________________
>> from pylab import figure, plot
>> import pylab as plt
>> import numpy as np
>> plt.ioff()
>> from matplotlib import rcParams
>> golden_mean = (np.sqrt(5)-1.0)/2.0 # Aesthetic ratio
>> fig_width = 5.6 # width in inches
>> fig_height = fig_width*golden_mean # height in inches
>> rcParams['figure.figsize']=fig_width, fig_height*3
>> figure()
>> plt.subplot(2,1,1)
>> plot(np.random.rand(10),'o')
>> plt.subplot(2,1,2)
>> plot(np.random.rand(10),'o')
>> pic_name='fit_rates2.png'
>> path_name='/home/petro/tmp/'
>> plt.savefig(path_name + pic_name)
>>
>
> What backend are you using?
>
> print plt.get_backend()

It outputs GTK. |
|
From: Damon M. <dam...@gm...> - 2012-08-22 10:29:30
|
On Wed, Aug 22, 2012 at 11:28:54AM +0200, Petro wrote:
> Hi list.
> I generate some png images using matplotlib, and get very different
> results depending on figuresize
> __________________________________________________________________
> from pylab import figure, plot
> import pylab as plt
> import numpy as np
> figure()
> plt.subplot(2,1,1)
> plot(np.random.rand(10),'o')
> plt.subplot(2,1,2)
> plot(np.random.rand(10),'o')
> pic_name='fit_rates1.png'
> path_name='/home/petro/tmp/'
> plt.savefig(path_name + pic_name)
> __________________________________________________________________
>
> the code above generates the following image:
> https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
>
> now if I increase a figure size parameter:
> __________________________________________________________________
> from pylab import figure, plot
> import pylab as plt
> import numpy as np
> plt.ioff()
> from matplotlib import rcParams
> golden_mean = (np.sqrt(5)-1.0)/2.0 # Aesthetic ratio
> fig_width = 5.6 # width in inches
> fig_height = fig_width*golden_mean # height in inches
> rcParams['figure.figsize']=fig_width, fig_height*3
> figure()
> plt.subplot(2,1,1)
> plot(np.random.rand(10),'o')
> plt.subplot(2,1,2)
> plot(np.random.rand(10),'o')
> pic_name='fit_rates2.png'
> path_name='/home/petro/tmp/'
> plt.savefig(path_name + pic_name)
>
What backend are you using?
print plt.get_backend()
--
Damon McDougall
http://www.damon-is-a-geek.com
B2.39
Mathematics Institute
University of Warwick
Coventry
West Midlands
CV4 7AL
United Kingdom
|
|
From: Petro <x....@gm...> - 2012-08-22 09:29:19
|
Hi list.
I generate some png images using matplotlib, and get very different
results depending on figuresize
__________________________________________________________________
from pylab import figure, plot
import pylab as plt
import numpy as np
figure()
plt.subplot(2,1,1)
plot(np.random.rand(10),'o')
plt.subplot(2,1,2)
plot(np.random.rand(10),'o')
pic_name='fit_rates1.png'
path_name='/home/petro/tmp/'
plt.savefig(path_name + pic_name)
__________________________________________________________________
the code above generates the following image:
https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
now if I increase a figure size parameter:
__________________________________________________________________
from pylab import figure, plot
import pylab as plt
import numpy as np
plt.ioff()
from matplotlib import rcParams
golden_mean = (np.sqrt(5)-1.0)/2.0 # Aesthetic ratio
fig_width = 5.6 # width in inches
fig_height = fig_width*golden_mean # height in inches
rcParams['figure.figsize']=fig_width, fig_height*3
figure()
plt.subplot(2,1,1)
plot(np.random.rand(10),'o')
plt.subplot(2,1,2)
plot(np.random.rand(10),'o')
pic_name='fit_rates2.png'
path_name='/home/petro/tmp/'
plt.savefig(path_name + pic_name)
__________________________________________________________________
the result looks strange like this:
https://lh5.googleusercontent.com/-4GRQxuRFvh4/UDSiRrNy59I/AAAAAAAACmA/Kho3prHFpUU/s640/fit_rates2.png
Has anyone experienced behaviour like this?
Thanks.
Petro
|
|
From: Virgil S. <vs...@it...> - 2012-08-22 09:23:55
|
On 21-Aug-2012 17:59, Paul Hobson wrote:
> On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs...@it...> wrote:
>> On 21-Aug-2012 17:50, Paul Hobson wrote:
>>> On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
>>>> In reference to my previous email.
>>>>
>>>> How can I find the outliers (sample points beyond the whiskers) in the
>>>> data
>>>> used for the boxplot?
>>>>
>>>> Here is a code snippet that shows how it was used for the timings data (a
>>>> list
>>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>>> ...
>>>> ...
>>>> ...
>>>> # Box Plots
>>>> plt.subplot(2,1,2)
>>>> timings = [y1,y2,y3,y4]
>>>> pos = np.array(range(len(timings)))+1
>>>> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>>> positions=pos, notch=1, bootstrap=5000 )
>>>>
>>>> plt.xlabel('Algorithm')
>>>> plt.ylabel('Execution time (sec)')
>>>> plt.ylim(0.9*ymin,1.1*ymax)
>>>>
>>>> plt.setp(bp['whiskers'], color='k', linestyle='-' )
>>>> plt.setp(bp['fliers'], markersize=3.0)
>>>> plt.title('Box plots (%4d trials)' %(n))
>>>> plt.show()
>>>> ...
>>>> ...
>>>> ...
>>>>
>>>> Again my questions:
>>>> 1) How to get the value of the median?
>>>> 2) How to find the outliers (outside the whiskers)?
>>>> 3) How to find the width of the notch?
>>> Virgil, the objects stuffed inside the `bp` dictionary should have
>>> methods to retrieve their values. Let's see:
>>>
>>> In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
>>>
>>> In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
>>>
>>> In [37]: # Question 1
>>> ...: print('medians')
>>> ...: for n, median in enumerate(bp['medians']):
>>> ...: print('%d: %f' % (n, median.get_ydata()[0]))
>>> ...:
>>> medians
>>> 0: 6.339692
>>> 1: 3.449320
>>> 2: 4.503706
>>>
>>> In [38]: # Question 2
>>> ...: print('fliers')
>>> ...: for n in range(0, len(bp['fliers']), 2):
>>> ...: print('%d: upper outliers = \t' % (n/2,))
>>> ...: print(bp['fliers'][n].get_ydata())
>>> ...: print('\n%d: lower outliers = \t' % (n/2,))
>>> ...: print(bp['fliers'][n+1].get_ydata())
>>> ...: print('\n')
>>> ...:
>> You had no outliers!
>>
>>> In [39]: # Question 3
>>> ...: print('Confidence Intervals')
>>> ...: for n, box in enumerate(bp['boxes']):
>>> ...: print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
>>> ...: print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
>>> ...:
>>> Confidence Intervals
>>> 0: lower CI: 1.760701
>>> 0: upper CI: 10.102221
>>> 1: lower CI: 1.626386
>>> 1: upper CI: 5.601927
>>> 2: lower CI: 2.173173
>>>
>>> Hope that helps,
>>> -paul
>> Just what I was looking for Paul! Thanks very much.
>>
>> One final question --- Where can I find the documentation that answers my
>> questions and gives more details about the equations used for the width of
>> notch, etc.?
>>
>> Thanks again :-)
> That should all be in the boxplot docstring. Do you use ipython? If
> not, you should :)
>
> if so, just do `plt.boxplot?` at the ipython terminal and it'll show up.
> -paul
I still have a problem...
Let me show the updated code snippet again
...
...
...
# Box Plots
iplt += 1
plt.figure(iplt)
timings = [ya[0],ya[1],ya[2],ya[3]]
pos = np.array(range(len(timings)))+1
bp = plt.boxplot( timings, sym='k+', patch_artist=True,
positions=pos, notch=1, bootstrap=5000 )
print('medians')
for nn, median in enumerate(bp['medians']):
    print('%d: %f' % (nn, median.get_ydata()[0]))
print('fliers')
for nn in range(0, len(bp['fliers']), 2):
    print('%d: upper outliers = \t' % (nn/2,))
    print(bp['fliers'][nn].get_ydata())
    print('\n%d: lower outliers = \t' % (nn/2,))
    print(bp['fliers'][nn+1].get_ydata())
    print('\n')
print('Confidence Intervals')
for nn, box in enumerate(bp['boxes']):
    print('%d: lower CI: %f' % (nn, box.get_ydata()[2]))  # <--- FAILS!
    print('%d: upper CI: %f' % (nn, box.get_ydata()[4]))
...
...
...
Medians and fliers work perfectly; but, I get the following error message when
trying to access the confidence intervals:
AttributeError: 'PathPatch' object has no attribute 'get_ydata'
Note, I am using boxplot with 4 sets of data and I am using matplotlib vers. 1.1.0.
Any suggestions on how to fix this problem?
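A possible explanation, offered as a sketch rather than a confirmed diagnosis: with patch_artist=True the entries of bp['boxes'] are PathPatch objects instead of Line2D objects, so they have no get_ydata(). The y coordinates can still be read from the patch's path. The helper name box_ydata and the random test data below are made up for illustration, and the use of indices 2 and 4 for the notch edges is an assumption carried over from the Line2D case in the reply quoted above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def box_ydata(box):
    """Return the y coordinates of a boxplot box, whatever its artist type."""
    if hasattr(box, "get_ydata"):          # Line2D (patch_artist=False)
        return np.asarray(box.get_ydata())
    return box.get_path().vertices[:, 1]   # PathPatch (patch_artist=True)


# Made-up stand-in for the four timing lists in the thread.
rng = np.random.RandomState(0)
data = [rng.lognormal(mean=1.25, sigma=1.35, size=200) for _ in range(4)]

fig, ax = plt.subplots()
bp = ax.boxplot(data, notch=True, bootstrap=5000, patch_artist=True)

notch_bounds = []
for nn, box in enumerate(bp["boxes"]):
    y = box_ydata(box)
    # Assumed vertex order: index 2 is the lower notch edge, index 4 the upper.
    notch_bounds.append((y[2], y[4]))
    print("%d: lower CI: %f" % (nn, y[2]))
    print("%d: upper CI: %f" % (nn, y[4]))
```

Dropping patch_artist=True would be the other way around the error, at the cost of losing the filled boxes.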
|
|
From: Peter S. J. <pet...@gm...> - 2012-08-21 17:45:31
|
Hi Everyone,
I'm having problems rasterizing many lines in a plot with the
rasterized=True keyword when saving to PDF.
Some version info:
matplotlib version 1.1.1rc
ubuntu 12.04
python 2.7.3
Here's a basic example that demonstrates my problem:
# Import matplotlib to create a pdf document
import matplotlib
matplotlib.use('Agg')
from matplotlib.backends.backend_pdf import PdfPages
pdf = PdfPages('rasterized_test.pdf')
import matplotlib.pylab as plt
# some test data
import numpy as np
ts = np.linspace(0,2*np.pi,100) * np.ones((200,100))
ts += (np.linspace(0, np.pi, 200)[np.newaxis] * np.ones((100,200))).T
ys = np.sin(ts)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(ts[0], ys.T, color='r', lw=0.5, alpha=0.5, rasterized=True)
pdf.savefig()
pdf.close()
Essentially, I have a lot (200 in this case) of closely overlapping lines
which makes the resulting figure (not rasterized) overly difficult to load.
I would like to rasterize these lines, such that the axis labels (and other
elements of the plot, not shown) remain vectors while the solution
trajectories are flattened to a single raster background. However, using
the code above, the image still takes a long time to load since each
trajectory is independently rasterized, resulting in multiple layers. (If I
open the resulting pdf with a program like inkscape, I can manipulate each
trajectory independently.)
Is it possible to flatten all of the rasterized elements into a single
layer, so the pdf size would be greatly reduced?
Thanks,
--Peter
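One approach that may help here, sketched under assumptions: instead of marking each line with rasterized=True, give the lines a negative zorder and call set_rasterization_zorder(0) on the axes, so that everything below that threshold is rasterized in one pass while the axes, ticks and labels stay vector. Whether this behaves the same on matplotlib 1.1.1rc is not verified; the in-memory buffer and the dpi value are illustrative choices.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend for writing files
import matplotlib.pyplot as plt
import numpy as np

# Same kind of test data as in the message: 200 phase-shifted sine curves.
phase = np.linspace(0, np.pi, 200)[:, np.newaxis]
ts = np.linspace(0, 2 * np.pi, 100) + phase      # shape (200, 100)
ys = np.sin(ts)

fig, ax = plt.subplots()
# zorder=-1 puts every line below the rasterization threshold set next.
lines = ax.plot(ts[0], ys.T, color="r", lw=0.5, alpha=0.5, zorder=-1)
ax.set_rasterization_zorder(0)

# Save to an in-memory buffer; a file name would work the same way.
buf = io.BytesIO()
fig.savefig(buf, format="pdf", dpi=150)
pdf_bytes = buf.getvalue()
print("%d lines, %d bytes of PDF" % (len(lines), len(pdf_bytes)))
```

The dpi passed to savefig controls the resolution of the rasterized layer, so it trades file size against sharpness of the curves.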
|
|
From: Paul H. <pmh...@gm...> - 2012-08-21 15:59:09
|
On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs...@it...> wrote:
> On 21-Aug-2012 17:50, Paul Hobson wrote:
>>
>> On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
>>>
>>> In reference to my previous email.
>>>
>>> How can I find the outliers (sample points beyond the whiskers) in the
>>> data
>>> used for the boxplot?
>>>
>>> Here is a code snippet that shows how it was used for the timings data (a
>>> list
>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>> ...
>>> ...
>>> ...
>>> # Box Plots
>>> plt.subplot(2,1,2)
>>> timings = [y1,y2,y3,y4]
>>> pos = np.array(range(len(timings)))+1
>>> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>> positions=pos, notch=1, bootstrap=5000 )
>>>
>>> plt.xlabel('Algorithm')
>>> plt.ylabel('Execution time (sec)')
>>> plt.ylim(0.9*ymin,1.1*ymax)
>>>
>>> plt.setp(bp['whiskers'], color='k', linestyle='-' )
>>> plt.setp(bp['fliers'], markersize=3.0)
>>> plt.title('Box plots (%4d trials)' %(n))
>>> plt.show()
>>> ...
>>> ...
>>> ...
>>>
>>> Again my questions:
>>> 1) How to get the value of the median?
>>> 2) How to find the outliers (outside the whiskers)?
>>> 3) How to find the width of the notch?
>>
>> Virgil, the objects stuffed inside the `bp` dictionary should have
>> methods to retrieve their values. Let's see:
>>
>> In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
>>
>> In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
>>
>> In [37]: # Question 1
>> ...: print('medians')
>> ...: for n, median in enumerate(bp['medians']):
>> ...: print('%d: %f' % (n, median.get_ydata()[0]))
>> ...:
>> medians
>> 0: 6.339692
>> 1: 3.449320
>> 2: 4.503706
>>
>> In [38]: # Question 2
>> ...: print('fliers')
>> ...: for n in range(0, len(bp['fliers']), 2):
>> ...: print('%d: upper outliers = \t' % (n/2,))
>> ...: print(bp['fliers'][n].get_ydata())
>> ...: print('\n%d: lower outliers = \t' % (n/2,))
>> ...: print(bp['fliers'][n+1].get_ydata())
>> ...: print('\n')
>> ...:
>
> You had no outliers!
>
>>
>> In [39]: # Question 3
>> ...: print('Confidence Intervals')
>> ...: for n, box in enumerate(bp['boxes']):
>> ...: print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
>> ...: print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
>> ...:
>> Confidence Intervals
>> 0: lower CI: 1.760701
>> 0: upper CI: 10.102221
>> 1: lower CI: 1.626386
>> 1: upper CI: 5.601927
>> 2: lower CI: 2.173173
>>
>> Hope that helps,
>> -paul
>
> Just what I was looking for Paul! Thanks very much.
>
> One final question --- Where can I find the documentation that answers my
> questions and gives more details about the equations used for the width of
> notch, etc.?
>
> Thanks again :-)
That should all be in the boxplot docstring. Do you use ipython? If
not, you should :)
if so, just do `plt.boxplot?` at the ipython terminal and it'll show up.
-paul
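Outside IPython the same docstring is reachable through plain Python, for instance with help(plt.boxplot). The short sketch below filters it down to the lines that mention the notch or the bootstrap; the filtering is only an illustrative convenience.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt

doc = plt.boxplot.__doc__
# Show only the lines of the docstring that mention the notch or the bootstrap.
notch_lines = [line.strip() for line in doc.splitlines()
               if "notch" in line.lower() or "bootstrap" in line.lower()]
print("\n".join(notch_lines))
```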
|
|
From: Paul H. <pmh...@gm...> - 2012-08-21 15:55:22
|
On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
> In reference to my previous email.
>
> How can I find the outliers (sample points beyond the whiskers) in the data
> used for the boxplot?
>
> Here is a code snippet that shows how it was used for the timings data (a list
> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
> ...
> ...
> ...
> # Box Plots
> plt.subplot(2,1,2)
> timings = [y1,y2,y3,y4]
> pos = np.array(range(len(timings)))+1
> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
> positions=pos, notch=1, bootstrap=5000 )
>
> plt.xlabel('Algorithm')
> plt.ylabel('Execution time (sec)')
> plt.ylim(0.9*ymin,1.1*ymax)
>
> plt.setp(bp['whiskers'], color='k', linestyle='-' )
> plt.setp(bp['fliers'], markersize=3.0)
> plt.title('Box plots (%4d trials)' %(n))
> plt.show()
> ...
> ...
> ...
>
> Again my questions:
> 1) How to get the value of the median?
> 2) How to find the outliers (outside the whiskers)?
> 3) How to find the width of the notch?
Oops. Here's my reply, this time to the whole list.
Virgil, the objects stuffed inside the `bp` dictionary should have
methods to retrieve their values. Let's see:
In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
In [37]: # Question 1
    ...: print('medians')
    ...: for n, median in enumerate(bp['medians']):
    ...:     print('%d: %f' % (n, median.get_ydata()[0]))
    ...:
medians
0: 6.339692
1: 3.449320
2: 4.503706
In [38]: # Question 2
    ...: print('fliers')
    ...: for n in range(0, len(bp['fliers']), 2):
    ...:     print('%d: upper outliers = \t' % (n/2,))
    ...:     print(bp['fliers'][n].get_ydata())
    ...:     print('\n%d: lower outliers = \t' % (n/2,))
    ...:     print(bp['fliers'][n+1].get_ydata())
    ...:     print('\n')
    ...:
In [39]: # Question 3
    ...: print('Confidence Intervals')
    ...: for n, box in enumerate(bp['boxes']):
    ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
    ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
    ...:
Confidence Intervals
0: lower CI: 1.760701
0: upper CI: 10.102221
1: lower CI: 1.626386
1: upper CI: 5.601927
2: lower CI: 2.173173
Hope that helps,
-paul
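The session above pairs the fliers as upper/lower lines per box, which matches the matplotlib of that time. As a hedged sketch for more recent matplotlib releases, where (as an assumption of this example) bp['fliers'] holds a single line per box, the medians and outliers can be collected like this. The ten-point data set is made up so the outlier is obvious.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt

data = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 100]]   # made-up sample with one obvious outlier

fig, ax = plt.subplots()
bp = ax.boxplot(data)

# One median line and (assumed) one flier line per box.
medians = [line.get_ydata()[0] for line in bp["medians"]]
fliers = [line.get_ydata() for line in bp["fliers"]]
for i, (med, out) in enumerate(zip(medians, fliers)):
    print("%d: median = %f, outliers = %s" % (i, med, list(out)))
```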
|
|
From: Jeffrey B. <jbl...@al...> - 2012-08-21 15:52:39
|
On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:
> In reference to my previous email.
>
> How can I find the outliers (sample points beyond the whiskers) in
> the data
> used for the boxplot?
>
> Here is a code snippet that shows how it was used for the timings
> data (a list
> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data
> values),
> ...
> ...
> ...
> # Box Plots
> plt.subplot(2,1,2)
> timings = [y1,y2,y3,y4]
> pos = np.array(range(len(timings)))+1
> bp = plt.boxplot( timings, sym='k+', patch_artist=True,
> positions=pos, notch=1, bootstrap=5000 )
>
> plt.xlabel('Algorithm')
> plt.ylabel('Execution time (sec)')
> plt.ylim(0.9*ymin,1.1*ymax)
>
> plt.setp(bp['whiskers'], color='k', linestyle='-' )
> plt.setp(bp['fliers'], markersize=3.0)
> plt.title('Box plots (%4d trials)' %(n))
> plt.show()
> ...
> ...
> ...
>
> Again my questions:
> 1) How to get the value of the median?
This is easily calculated from your data. Numpy will even do it for
you, one data set at a time: [np.median(y) for y in timings]
> 2) How to find the outliers (outside the whiskers)?
From the boxplot documentation: the whiskers extend to the most
extreme data point within distance X of the bottom or top of the box,
where X is 1.5 times the extent of the box. Any points more extreme
than that are the outliers. The box itself of course extends from the
25th percentile to the 75th percentile of your data. Again, you can
easily calculate these values from your data.
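As a rough sketch of that calculation (the function name box_stats and the ten-point sample are made up for illustration, and numpy's default percentile interpolation is assumed to be close enough to what boxplot uses), the whisker rule can be applied to one data set at a time:

```python
import numpy as np


def box_stats(values, whis=1.5):
    """Median, quartiles, whisker ends and outliers for one data set."""
    values = np.asarray(values, dtype=float)
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers end at the most extreme data points within whis * IQR of the box.
    inside = values[(values >= q1 - whis * iqr) & (values <= q3 + whis * iqr)]
    lo_whisker, hi_whisker = inside.min(), inside.max()
    # Anything beyond the whisker ends counts as an outlier.
    outliers = values[(values < lo_whisker) | (values > hi_whisker)]
    return {"median": med, "q1": q1, "q3": q3,
            "whiskers": (lo_whisker, hi_whisker), "outliers": outliers}


# Hypothetical stand-in for one of the timing lists (y1..y4 in the thread).
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
stats = box_stats(sample)
print(stats)
```

For the four timing lists this would be called once per list, for example [box_stats(y) for y in timings].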
> 3) How to find the width of the notch?
Again, from the docs: with bootstrap=5000, it calculates the width of
the notch by bootstrap resampling your data (the timings array) 5000
times and finding the 95% confidence interval of the median, and uses
that as the notch width. You can redo that yourself pretty easily.
Here is some bootstrap code for you to adapt:
http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html
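For a self-contained starting point, here is a minimal percentile-bootstrap sketch for the median. It is an illustration of the general idea, not necessarily the exact procedure matplotlib follows; the function name, the seed argument and the test data are assumptions of the example.

```python
import numpy as np


def bootstrap_median_ci(values, n_boot=5000, alpha=0.05, seed=None):
    """Percentile bootstrap confidence interval for the median."""
    values = np.asarray(values, dtype=float)
    rng = np.random.RandomState(seed)
    medians = np.empty(n_boot)
    # Loop instead of one big index array to keep memory use modest
    # for large data sets.
    for i in range(n_boot):
        resample = values[rng.randint(0, len(values), size=len(values))]
        medians[i] = np.median(resample)
    lo, hi = np.percentile(medians, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi


# Made-up data: the integers 0..100, whose median is 50.
lo, hi = bootstrap_median_ci(np.arange(101.0), n_boot=500, seed=0)
print("95%% CI for the median: [%f, %f]" % (lo, hi))
```

With 400,000 values per data set and 5000 resamples this loop will take a while, so a smaller n_boot may be a reasonable first try.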
I encourage you to read the documentation! This page is very useful
for reference:
http://matplotlib.sourceforge.net/api/pyplot_api.html
-Jeff
|
|
From: Virgil S. <vs...@it...> - 2012-08-21 14:58:28
|
In reference to my previous email.
How can I find the outliers (sample points beyond the whiskers) in the data
used for the boxplot?
Here is a code snippet that shows how it was used for the timings data (a list
of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
...
...
...
# Box Plots
plt.subplot(2,1,2)
timings = [y1,y2,y3,y4]
pos = np.array(range(len(timings)))+1
bp = plt.boxplot( timings, sym='k+', patch_artist=True,
positions=pos, notch=1, bootstrap=5000 )
plt.xlabel('Algorithm')
plt.ylabel('Execution time (sec)')
plt.ylim(0.9*ymin,1.1*ymax)
plt.setp(bp['whiskers'], color='k', linestyle='-' )
plt.setp(bp['fliers'], markersize=3.0)
plt.title('Box plots (%4d trials)' %(n))
plt.show()
...
...
...
Again my questions:
1) How to get the value of the median?
2) How to find the outliers (outside the whiskers)?
3) How to find the width of the notch?
|