matplotlib-users Mailing List for matplotlib (Page 5)

Brought to you by: cjgohlke, dsdale, efiring, heeres, and 8 others

matplotlib-users — Discussion related to using matplotlib

Re: [Matplotlib-users] basemap via macports

From: Francesco M. <fra...@gm...> - 2012-08-24 15:13:26

2012/8/24 Carlos Grohmann <car...@gm...>:
> Hello all,
>
> I just did a fresh macports install, and installed py27-matplotlib-basemap,
> so all dependencies were installed as well.
>
> After installing python, I did run port-select (or something like it) to
> make sure I'm using macports python.
>
> My problem is that I can't run it:
>
>
> GuanoMac:~ guano$ python
> Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from mpl_toolkits.basemap import Basemap
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: No module named basemap
>
>
> Anyone experienced in this kind of installation could share hints?

If you run

import sys
print(sys.path)

you can check whether the directory where basemap is installed is on your path.

If it is not, you can add it either in the script/session, by
appending/inserting/extending sys.path (which is a list),
e.g.: sys.path.append("dir/to/basemap")
or in your .profile, .bash_profile or .bashrc file (this way it is
set in every session):
export PYTHONPATH=$PYTHONPATH:dir/to/basemap


cheers,
Francesco
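
The check suggested above can be sketched as a small script. This is only an illustration, not part of the original reply: the helper name find_package_dirs is made up here, and the mpl_toolkits/basemap directory layout is assumed from the import statement in the question. It also prints sys.executable, since the question left open whether the MacPorts Python is the interpreter actually being started.

```python
import os
import sys


def find_package_dirs(package_subpath, search_path=None):
    """Return the entries of search_path that contain package_subpath as a directory."""
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        # An empty entry in sys.path stands for the current directory.
        candidate = os.path.join(entry or os.curdir, package_subpath)
        if os.path.isdir(candidate):
            hits.append(entry)
    return hits


# Which interpreter is running, and where (if anywhere) basemap is visible to it.
print("interpreter:", sys.executable)
print("basemap found under:",
      find_package_dirs(os.path.join("mpl_toolkits", "basemap")))
```

An empty list means none of the directories on sys.path contains the package, in which case either the path or the choice of interpreter needs to change.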

>
> tks
>
> Carlos
>
> --
> Prof. Carlos Henrique Grohmann
> Institute of Geosciences - Univ. of São Paulo, Brazil
> - Digital Terrain Analysis | GIS | Remote Sensing -
>
> http://carlosgrohmann.com
> ________________
> Can’t stop the signal.
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Matplotlib-users mailing list
> Mat...@li...
> https://lists.sourceforge.net/lists/listinfo/matplotlib-users
>

[Matplotlib-users] basemap via macports

From: Carlos G. <car...@gm...> - 2012-08-24 14:47:15

Hello all,

I just did a fresh macports install, and installed py27-matplotlib-basemap,
so all dependencies were installed as well.

After installing python, I did run port-select (or something like it) to
make sure I'm using macports python.

My problem is that I can't run it:


GuanoMac:~ guano$ python
Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from mpl_toolkits.basemap import Basemap
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named basemap


Anyone experienced in this kind of installation could share hints?

tks

Carlos

-- 
Prof. Carlos Henrique Grohmann
Institute of Geosciences - Univ. of São Paulo, Brazil
- Digital Terrain Analysis | GIS | Remote Sensing -

http://carlosgrohmann.com
________________
Can’t stop the signal.

Re: [Matplotlib-users] GCC error when building matplotlib

From: Joe B. <ma...@jd...> - 2012-08-24 13:38:52

Thanks Ben,

Works fine.


Regards,
Joseph David Borġ
http://www.jdborg.com


On 24 August 2012 13:41, Benjamin Root <ben...@ou...> wrote:

>
> On Fri, Aug 24, 2012 at 6:35 AM, Joe Borġ <ma...@jd...> wrote:
>
>> Hi,
>>
>> I've reinstalled numpy and the error from gcc has changed.  Please see
>> log, all the system information should be in there.
>>
>>
> Matplotlib 1.1.1 does not work with py3k.  We are just about ready to
> release v1.2.0 which will work with py3k.  Please feel free to check out the
> master branch on our github page and test it out before the release!
>
> Ben Root
>
>

Re: [Matplotlib-users] GCC error when building matplotlib

From: Benjamin R. <ben...@ou...> - 2012-08-24 12:41:43

On Fri, Aug 24, 2012 at 6:35 AM, Joe Borġ <ma...@jd...> wrote:

> Hi,
>
> I've reinstalled numpy and the error from gcc has changed.  Please see
> log, all the system information should be in there.
>
>
Matplotlib 1.1.1 does not work with py3k.  We are just about ready to
release v1.2.0 which will work with py3k.  Please feel free to check out the
master branch on our github page and test it out before the release!

Ben Root

[Matplotlib-users] GCC error when building matplotlib

From: Joe B. <ma...@jd...> - 2012-08-24 10:35:58

Attachments: matplotlib_build.log

Hi,

I've reinstalled numpy and the error from gcc has changed.  Please see log,
all the system information should be in there.


Regards,
Joseph David Borġ
http://www.jdborg.com

Re: [Matplotlib-users] Correct way of saving properties for plot reconstruction

From: Eric F. <ef...@ha...> - 2012-08-24 06:16:03

On 2012/08/23 6:52 PM, Andrew Nelson wrote:
> Dear list,
> apologies for what might be a simple question. I am creating an
> application that uses matplotlib for plotting, using the Qt4Agg backend.
> I can create the figures without a problem.
>
> However, I wish to save the state of the application, including the
> graphs.  The complicating factor is that the user may have altered the
> appearance of the graphs via a NavigationToolbar.
>
> I have no problems saving the data that makes up the graphs, but how do
> I save the properties of the graphs (line colour, linewidth, etc)?
> I tried using matplotlib.artist.ArtistInspector(Line2D).properties().
> This gives a dictionary of all the properties. However, when I try to
> pickle this I get pickling errors:
>
> cPickle.PicklingError: Can't pickle <class
> 'matplotlib.axes.AxesSubplot'>: attribute lookup
> matplotlib.axes.AxesSubplot failed
>
> I am sure that there is an easy way of achieving this, I just can't see
> it in the documentation.  I appreciate any help the list is able to give me.

Maybe there is not an easy way...
See https://github.com/matplotlib/matplotlib/pull/1020.

Eric
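
One workaround, sketched here under assumptions and not taken from the thread: rather than pickling everything properties() returns (which presumably includes references to the axes and figure, hence the error above), copy only a few plain-valued style properties into a dict and pickle that. The helper names and the list of properties are illustrative; the code relies only on the get_<name>/set_<name> naming convention that matplotlib artists such as Line2D follow.

```python
import pickle

# Style properties with plain, picklable values. This selection is an
# assumption made for illustration; extend it as needed.
STYLE_PROPS = ("color", "linewidth", "linestyle", "marker",
               "markersize", "alpha", "label")


def snapshot_style(artist, names=STYLE_PROPS):
    """Collect selected properties of an artist through its get_<name>() methods."""
    return {name: getattr(artist, "get_" + name)() for name in names}


def apply_style(artist, style):
    """Restore properties onto an artist through its set_<name>() methods."""
    for name, value in style.items():
        getattr(artist, "set_" + name)(value)


def style_to_bytes(artist, names=STYLE_PROPS):
    """Pickle only the plain style dict, never the artist itself."""
    return pickle.dumps(snapshot_style(artist, names))
```

With a real line, e.g. line, = ax.plot(x, y), the saved state would be style_to_bytes(line); after recreating the plot from the saved data, apply_style(new_line, pickle.loads(saved)) puts the appearance back.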

>
> regards,
> Andrew
>

[Matplotlib-users] Correct way of saving properties for plot reconstruction

From: Andrew N. <and...@gm...> - 2012-08-24 04:52:14

Dear list,
apologies for what might be a simple question. I am creating an application
that uses matplotlib for plotting, using the Qt4Agg backend.  I can create
the figures without a problem.

However, I wish to save the state of the application, including the graphs.
The complicating factor is that the user may have altered the appearance
of the graphs via a NavigationToolbar.

I have no problems saving the data that makes up the graphs, but how do I
save the properties of the graphs (line colour, linewidth, etc)?
I tried using matplotlib.artist.ArtistInspector(Line2D).properties(). This
gives a dictionary of all the properties. However, when I try to pickle
this I get pickling errors:

cPickle.PicklingError: Can't pickle <class 'matplotlib.axes.AxesSubplot'>:
attribute lookup matplotlib.axes.AxesSubplot failed

I am sure that there is an easy way of achieving this, I just can't see it
in the documentation.  I appreciate any help the list is able to give me.

regards,
Andrew



-- 
_____________________________________
Dr. Andrew Nelson


_____________________________________

[Matplotlib-users] [ANN] Call for abstracts: BigData minisymposium at CSE'13, February 2013, Boston

From: Fernando P. <fpe...@gm...> - 2012-08-23 18:27:31

Dear colleagues,

next year's SIAM conference on Computational Science and Engineering,
CSE'13, will take place in Boston, February 25-March 1
(http://www.siam.org/meetings/cse13), and for this version there will
be a track focused on the topic of Big Data.  This term has rapidly
risen in recent discussions of science and even of mainstream business
computing, and for good reasons.  Today virtually all disciplines are
facing a flood of quantitative information whose volumes have often
grown faster than the quality of our tools for extracting insight from
these data.  SIAM hopes that CSE'13 will provide an excellent venue
for discussing these problems, from the vantage point offered by a
community whose expertise combines analytical insights, algorithmic
development, software engineering and domain-specific applications.

As part of this event, Titus Brown (http://ged.msu.edu) and I are
organizing a minisymposium where we would like to have a group of
presentations that address both novel algorithmic ideas and
computational approaches as well as domain-specific problems.   Data
doesn't appear in a vacuum, and data from different domains presents a
mix of common problems along with questions that may be specific to
each; we hope that by engaging a dialog between those working on
algorithmic and implementation questions and those with specific
problems from the field, valuable insights can be obtained.

If you would like to contribute to this minisymposium, please contact
us directly at:

"C. Titus Brown" <ct...@ms...>,
"Fernando Perez" <Fer...@be...>

with your name and affiliation, the title of your proposed talk and a
brief description (actual abstracts are due later so an informal
description will suffice for now), by Wednesday August 29.  For more
details on the submission process, see:

http://www.siam.org/meetings/cse13/submissions.php

Please forward this to any interested colleagues.

Regards,

Titus and Fernando.

[Matplotlib-users] plot_date with non-UTC timezone

From: Nils G. <nil...@gm...> - 2012-08-23 16:33:43

Attachments: plot_date_example.py

Hi,

I have noticed that matplotlib's plot_date() function occasionally
places the label at the date (or month or year) end instead of the
beginning. I attach a script illustrating the problem: I plot a range
of y-values for a range of dates. The y-value 29 should be associated
with 11 Nov 2011. In figure 1, left I am using UTC and everything is
fine. In the right plot I am using the 'Europe/London' timezone and
now the graph suggests that the y-value 29 is associated with 10 Nov
2011. When moving the mouse over the plot and watching the status bar
at the bottom it suggests that indeed the label is placed at the
date-end rather than the beginning. I have observed the same behaviour
with month and year labels being shifted in non-UTC timezones.

I have my time data as POSIX timestamps, so using datetime objects in
plotting directly is not a (convenient) option. (In the sample code I
use datetime only to generate example POSIX timestamps at midnight).
Instead I would like to rely on the epoch2num() helper function
supplied by matplotlib.


I am using matplotlib-1.1.0, numpy-1.6.1.


Any help appreciated,
thanks, Nils.
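
As background for the question above, here is a small standard-library sketch of how a POSIX timestamp relates to a day-based date number. It assumes the convention that epoch2num used in matplotlib of that era (days counted in UTC, with 1970-01-01 at 719163); the fixed UTC+1 zone is hypothetical and chosen only to keep the example deterministic. It does not reproduce or diagnose the reported label placement. It only shows that midnight in a non-UTC zone is not a whole number of UTC days.

```python
import calendar
from datetime import date, datetime, timedelta, timezone

SECONDS_PER_DAY = 86400.0
# Proleptic Gregorian ordinal of 1970-01-01 (719163).
EPOCH_ORDINAL = date(1970, 1, 1).toordinal()


def epoch_to_datenum(timestamp):
    """Convert a POSIX timestamp to a float day number counted in UTC."""
    return EPOCH_ORDINAL + timestamp / SECONDS_PER_DAY


def calendar_date(timestamp, tz):
    """Calendar date on which a POSIX timestamp falls in the given timezone."""
    return datetime.fromtimestamp(timestamp, tz).date()


# Hypothetical fixed-offset zone, one hour ahead of UTC (illustration only).
utc_plus_one = timezone(timedelta(hours=1))

utc_midnight = calendar.timegm((2011, 11, 11, 0, 0, 0))
local_midnight = utc_midnight - 3600  # 00:00 on 11 Nov in the UTC+1 zone

for label, ts in (("UTC midnight", utc_midnight),
                  ("UTC+1 midnight", local_midnight)):
    num = epoch_to_datenum(ts)
    print(label, num, "fraction of day:", num % 1.0,
          "UTC date:", calendar_date(ts, timezone.utc),
          "local date:", calendar_date(ts, utc_plus_one))
```

The second timestamp belongs to 11 Nov locally but to 10 Nov in UTC, so any code that rounds the day number in UTC will attach it to the previous day.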

Re: [Matplotlib-users] GCC failure when updating from 1.1.0 to 1.1.1

From: Benjamin R. <ben...@ou...> - 2012-08-22 17:27:48

On Wed, Aug 22, 2012 at 12:31 PM, Joe Borġ <ma...@jd...> wrote:

> Not sure if this is an issue with an out-of-date GCC or if something else
> is wrong.  I've got 1.1.0 on no problem.
>
> $ python setup.py build
> ...
> gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
> -fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_ARRAY_API -DPYCXX_ISO_CPP_LIB=1
> -I/software/Python/272/lib/python2.7/site-packages/numpy/core/include
> -I/usr/include/freetype2 -I/usr/local/include -I/usr/include -I.
> -I/software/Python/272/include/python2.7 -c src/ft2font.cpp -o
> build/temp.linux-x86_64-2.7/src/ft2font.o
> In file included from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:7,
>                  from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
>                  from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
>                  from src/ft2font.cpp:7:
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:120:2:
> error: #error npy_cdouble definition is not compatible with C99 complex
> definition ! Please contact Numpy maintainers and give detailed information
> about your compiler and platform
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:127:2:
> error: #error npy_cfloat definition is not compatible with C99 complex
> definition ! Please contact Numpy maintainers and give detailed information
> about your compiler and platform
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:134:2:
> error: #error npy_clongdouble definition is not compatible with C99 complex
> definition ! Please contact Numpy maintainers and give detailed information
> about your compiler and platform
> In file included from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26,
>                  from
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
>                  from src/ft2font.cpp:7:
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:
> In function 'int _import_array()':
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1226:
> error: 'NPY_ABI_VERSION' was not declared in this scope
> /software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1232:
> error: 'NPY_API_VERSION' was not declared in this scope
> error: command 'gcc' failed with exit status 1
>
> $ gcc --version
> gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6)
> Copyright (C) 2010 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
>
Joe,

This appears to be a problem with NumPy.  I would suggest sending this
email to the numpy-discussion list.  Be sure to include detailed
information about your compiler, OS, and your machine.

Cheers!
Ben Root
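
For gathering the details requested above, a short script along these lines may help. It is a sketch, not an official reporting tool: the function name and the dictionary keys are made up here. It records which NumPy the interpreter imports and which header directory that NumPy reports, which is worth comparing against the -I paths in the failing gcc command.

```python
import platform
import sys


def build_environment_report():
    """Collect interpreter, platform and NumPy details for a bug report."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    try:
        import numpy
    except ImportError:
        # NumPy is not importable from this interpreter at all.
        report["numpy"] = None
    else:
        report["numpy"] = numpy.__version__
        # Directory holding the NumPy C headers for this installation.
        report["numpy_include"] = numpy.get_include()
    return report


for key, value in sorted(build_environment_report().items()):
    print("%s: %s" % (key, value))
```

The compiler version still has to be added separately, e.g. from the output of gcc --version as in the original message.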

[Matplotlib-users] GCC failure when updating from 1.1.0 to 1.1.1

From: Joe B. <ma...@jd...> - 2012-08-22 17:04:27

Not sure if this is an issue with an out-of-date GCC or if something else
is wrong.  I've got 1.1.0 on no problem.

$ python setup.py build
...
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall
-fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_ARRAY_API -DPYCXX_ISO_CPP_LIB=1
-I/software/Python/272/lib/python2.7/site-packages/numpy/core/include
-I/usr/include/freetype2 -I/usr/local/include -I/usr/include -I.
-I/software/Python/272/include/python2.7 -c src/ft2font.cpp -o
build/temp.linux-x86_64-2.7/src/ft2font.o
In file included from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:7,
                 from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
                 from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
                 from src/ft2font.cpp:7:
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:120:2:
error: #error npy_cdouble definition is not compatible with C99 complex
definition ! Please contact Numpy maintainers and give detailed information
about your compiler and platform
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:127:2:
error: #error npy_cfloat definition is not compatible with C99 complex
definition ! Please contact Numpy maintainers and give detailed information
about your compiler and platform
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/npy_common.h:134:2:
error: #error npy_clongdouble definition is not compatible with C99 complex
definition ! Please contact Numpy maintainers and give detailed information
about your compiler and platform
In file included from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26,
                 from
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:14,
                 from src/ft2font.cpp:7:
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:
In function 'int _import_array()':
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1226:
error: 'NPY_ABI_VERSION' was not declared in this scope
/software/Python/272/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1232:
error: 'NPY_API_VERSION' was not declared in this scope
error: command 'gcc' failed with exit status 1

$ gcc --version
gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Regards,
Joseph David Borġ
http://www.jdborg.com

Re: [Matplotlib-users] boxplot -- how (more)

From: Jeffrey B. <jbl...@al...> - 2012-08-22 15:29:43

On Aug 22, 2012, at 10:04 AM, Virgil Stokes wrote:

> On 21-Aug-2012 17:52, Jeffrey Blackburne wrote:
>>
>> On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:
>>
>>> In reference to my previous email.
>>>
>>> How can I find the outliers (sample points beyond the whiskers)
>>> in the data
>>> used for the boxplot?
>>>
>>> Here is a code snippet that shows how it was used for the timings  
>>> data (a list
>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data  
>>> values),
>>>    ...
>>>    ...
>>>    ...
>>>    # Box Plots
>>>    plt.subplot(2,1,2)
>>>    timings = [y1,y2,y3,y4]
>>>    pos = np.array(range(len(timings)))+1
>>>    bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>>                     positions=pos, notch=1, bootstrap=5000 )
>>>
>>>    plt.xlabel('Algorithm')
>>>    plt.ylabel('Execution time (sec)')
>>>    plt.ylim(0.9*ymin,1.1*ymax)
>>>
>>>    plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>>>    plt.setp(bp['fliers'], markersize=3.0)
>>>    plt.title('Box plots (%4d trials)' %(n))
>>>    plt.show()
>>>    ...
>>>    ...
>>>    ...
>>>
>>> Again my questions:
>>> 1) How to get the value of the median?
>>
>> This is easily calculated from your data. Numpy will even do it  
>> for you: np.median(timings)
>>
>>> 2) How to find the outliers (outside the whiskers)?
>>
>> From the boxplot documentation: the whiskers extend to the most  
>> extreme data point within distance X of the bottom or top of the  
>> box, where X is 1.5 times the extent of the box. Any points more  
>> extreme than that are the outliers. The box itself of course  
>> extends from the 25th percentile to the 75th percentile of your  
>> data. Again, you can easily calculate these values from your data.
>>
>>> 3) How to find the width of the notch?
>>
>> Again, from the docs: with bootstrap=5000, it calculates the width  
>> of the notch by bootstrap resampling your data (the timings array)  
>> 5000 times and finding the 95% confidence interval of the median,  
>> and uses that as the notch width. You can redo that yourself  
>> pretty easily. Here is some bootstrap code for you to adapt:
>> http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html
>>
>> I encourage you to read the documentation! This page is very  
>> useful for reference:
>> http://matplotlib.sourceforge.net/api/pyplot_api.html
>>
>> -Jeff
>>
> Yes Jeff,
> These are very useful links; however, box plots have a parameter  
> called the "adjacent value" (from the McGill reference),
>
> "The plotted whisker extends to the adjacent value,  which is the  
> most extreme data value that is not an outlier."
>
> It seems there should be one for the lower and one for the upper  
> whisker --- how can one get these two values from boxplot?

Look at bp['whiskers']

For those who got here by searching: bp is the object returned by  
plt.boxplot()

> Also, is there any way to directly get the indices of the outliers?

Look into np.where()
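
The two hints above can also be combined in a calculation done directly on the data. The following is a sketch under assumptions: it applies the 1.5 x box-extent rule described earlier in the thread, the function name and dictionary keys are invented for illustration, and np.percentile may interpolate quartiles slightly differently from the routine boxplot uses internally, so values can differ marginally from what is drawn.

```python
import numpy as np


def box_stats(data, whis=1.5):
    """Quartiles, adjacent values and outlier indices for one data set."""
    d = np.asarray(data, dtype=float)
    q1, median, q3 = np.percentile(d, [25, 50, 75])
    iqr = q3 - q1
    lower_limit = q1 - whis * iqr
    upper_limit = q3 + whis * iqr
    inside = (d >= lower_limit) & (d <= upper_limit)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Adjacent values: most extreme data points that are not outliers.
        "lower_adjacent": d[inside].min(),
        "upper_adjacent": d[inside].max(),
        # Positions of the outliers in the original, unsorted data.
        "outlier_indices": np.where(~inside)[0],
    }
```

For the timings list in the question this would be called once per sublist, e.g. [box_stats(y) for y in timings].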

Re: [Matplotlib-users] boxplot -- how (more)

From: Virgil S. <vs...@it...> - 2012-08-22 14:24:51

On 22-Aug-2012 11:23, Virgil Stokes wrote:
> On 21-Aug-2012 17:59, Paul Hobson wrote:
>> On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs...@it...> wrote:
>>> On 21-Aug-2012 17:50, Paul Hobson wrote:
>>>> On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
>>>>> In reference to my previous email.
>>>>>
>>>>> How can I find the outliers (sample points beyond the whiskers) in the
>>>>> data
>>>>> used for the boxplot?
>>>>>
>>>>> Here is a code snippet that shows how it was used for the timings data (a
>>>>> list
>>>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>>>>       ...
>>>>>       ...
>>>>>       ...
>>>>>       # Box Plots
>>>>>       plt.subplot(2,1,2)
>>>>>       timings = [y1,y2,y3,y4]
>>>>>       pos = np.array(range(len(timings)))+1
>>>>>       bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>>>>                        positions=pos, notch=1, bootstrap=5000 )
>>>>>
>>>>>       plt.xlabel('Algorithm')
>>>>>       plt.ylabel('Execution time (sec)')
>>>>>       plt.ylim(0.9*ymin,1.1*ymax)
>>>>>
>>>>>       plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>>>>>       plt.setp(bp['fliers'], markersize=3.0)
>>>>>       plt.title('Box plots (%4d trials)' %(n))
>>>>>       plt.show()
>>>>>       ...
>>>>>       ...
>>>>>       ...
>>>>>
>>>>> Again my questions:
>>>>> 1) How to get the value of the median?
>>>>> 2) How to find the outliers (outside the whiskers)?
>>>>> 3) How to find the width of the notch?
>>>> Virgil, the objects stuffed inside the `bp` dictionary should have
>>>> methods to retrieve their values. Let's see:
>>>>
>>>> In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
>>>>
>>>> In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
>>>>
>>>> In [37]: # Question 1
>>>>        ...: print('medians')
>>>>        ...: for n, median in enumerate(bp['medians']):
>>>>        ...:     print('%d: %f' % (n, median.get_ydata()[0]))
>>>>        ...:
>>>> medians
>>>> 0: 6.339692
>>>> 1: 3.449320
>>>> 2: 4.503706
>>>>
>>>> In [38]: # Question 2
>>>>        ...: print('fliers')
>>>>        ...: for n in range(0, len(bp['fliers']), 2):
>>>>        ...:     print('%d: upper outliers = \t' % (n/2,))
>>>>        ...:     print(bp['fliers'][n].get_ydata())
>>>>        ...:     print('\n%d: lower outliers = \t' % (n/2,))
>>>>        ...:     print(bp['fliers'][n+1].get_ydata())
>>>>        ...:     print('\n')
>>>>        ...:
>>> You had no outliers!
>>>
>>>> In [39]: # Question 3
>>>>        ...: print('Confidence Intervals')
>>>>        ...: for n, box in enumerate(bp['boxes']):
>>>>        ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
>>>>        ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
>>>>        ...:
>>>> Confidence Intervals
>>>> 0: lower CI: 1.760701
>>>> 0: upper CI: 10.102221
>>>> 1: lower CI: 1.626386
>>>> 1: upper CI: 5.601927
>>>> 2: lower CI: 2.173173
>>>>
>>>> Hope that helps,
>>>> -paul
>>> Just what I was looking for Paul! Thanks very much.
>>>
>>> One final question --- Where can I find the documentation that answers my
>>> questions and gives more details about the equations used for the width of
>>> notch. etc.?
>>>
>>> Thanks again :-)
>> That should all be in the boxplot docstring. Do you use ipython? If
>> not, you should :)
>>
>> if so, just do `plt.boxplot?` at the ipython terminal and it'll show up.
>> -paul
> I still have a problem...
> Let me show the updated code snippet again
>     ...
>     ...
>     ...
>     # Box Plots
>     iplt += 1
>     plt.figure(iplt)
>     timings = [ya[0],ya[1],ya[2],ya[3]]
>     pos = np.array(range(len(timings)))+1
>     bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>                    positions=pos, notch=1, bootstrap=5000 )
>     print ('medians')
>     for nn,median in enumerate(bp['medians']):
>         print('%d: %f' %(nn,median.get_ydata()[0]))
>
>     print('fliers')
>     for nn in range(0, len(bp['fliers']), 2):
>         print('%d: upper outliers = \t' % (nn/2,))
>         print(bp['fliers'][nn].get_ydata())
>         print('\n%d: lower outliers = \t' % (nn/2,))
>         print(bp['fliers'][nn+1].get_ydata())
>         print('\n')
>
>     print('Confidence Intervals')
>     for nn, box in enumerate(bp['boxes']):
>         print('%d: lower CI: %f' % (nn, box.get_ydata()[2]))  # <--- FAILS!
>         print('%d: upper CI: %f' % (nn, box.get_ydata()[4]))
>     ...
>     ...
>     ...
>
> Medians and fliers work perfectly; but, I get the following error message when
> trying to access the confidence intervals:
>
> AttributeError: 'PathPatch' object has no attribute 'get_ydata'
>
> Note, I am using boxplot with 4 sets of data and I am using matplotlib vers. 1.1.0.
>
> Any suggestions on how to fix this problem?

I found the solution:

  one must pass

patch_artist=False

in the boxplot call.

:-)
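
Some background on why this works, plus a sketch for keeping patch_artist=True: with patch_artist=True the boxes are PathPatch objects, which have no get_ydata(), while with patch_artist=False they are Line2D objects, which do. The helper below is illustrative only (its name is made up) and reads the y-coordinates in either case. It assumes the patch vertices come in the same order as the line's points, which should be checked against the matplotlib version in use.

```python
import numpy as np


def box_ydata(box):
    """Return the y-coordinates of a boxplot box, whether it is a line or a patch."""
    if hasattr(box, "get_ydata"):
        # patch_artist=False: the box is a Line2D.
        return np.asarray(box.get_ydata())
    # patch_artist=True: the box is a PathPatch; read the vertices of its path.
    return np.asarray(box.get_path().vertices)[:, 1]
```

In the loop above, box_ydata(box)[2] and box_ydata(box)[4] would then replace box.get_ydata()[2] and box.get_ydata()[4].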

Re: [Matplotlib-users] boxplot -- how (more)

From: Virgil S. <vs...@it...> - 2012-08-22 14:04:10

On 21-Aug-2012 17:52, Jeffrey Blackburne wrote:
>
> On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:
>
>> In reference to my previous email.
>>
>> How can I find the outliers (sample points beyond the whiskers) in the data
>> used for the boxplot?
>>
>> Here is a code snippet that shows how it was used for the timings data (a list
>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>    ...
>>    ...
>>    ...
>>    # Box Plots
>>    plt.subplot(2,1,2)
>>    timings = [y1,y2,y3,y4]
>>    pos = np.array(range(len(timings)))+1
>>    bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>                     positions=pos, notch=1, bootstrap=5000 )
>>
>>    plt.xlabel('Algorithm')
>>    plt.ylabel('Execution time (sec)')
>>    plt.ylim(0.9*ymin,1.1*ymax)
>>
>>    plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>>    plt.setp(bp['fliers'], markersize=3.0)
>>    plt.title('Box plots (%4d trials)' %(n))
>>    plt.show()
>>    ...
>>    ...
>>    ...
>>
>> Again my questions:
>> 1) How to get the value of the median?
>
> This is easily calculated from your data. Numpy will even do it for you: 
> np.median(timings)
>
>> 2) How to find the outliers (outside the whiskers)?
>
> From the boxplot documentation: the whiskers extend to the most extreme data 
> point within distance X of the bottom or top of the box, where X is 1.5 times 
> the extent of the box. Any points more extreme than that are the outliers. The 
> box itself of course extends from the 25th percentile to the 75th percentile 
> of your data. Again, you can easily calculate these values from your data.
>
>> 3) How to find the width of the notch?
>
> Again, from the docs: with bootstrap=5000, it calculates the width of the 
> notch by bootstrap resampling your data (the timings array) 5000 times and 
> finding the 95% confidence interval of the median, and uses that as the notch 
> width. You can redo that yourself pretty easily. Here is some bootstrap code 
> for you to adapt:
> http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html
>
> I encourage you to read the documentation! This page is very useful for 
> reference:
> http://matplotlib.sourceforge.net/api/pyplot_api.html
>
> -Jeff
>
Yes Jeff,
These are very useful links; however, box plots have a parameter called the 
"adjacent value" (from the McGill reference),

"The plotted whisker extends to the adjacent value,  which is the most extreme 
data value that is not an outlier."

It seems there should be one for the lower and one for the upper whisker --- how 
can one get these two values from boxplot?

Also, is there any way to directly get the indices of the outliers?
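
The bootstrap procedure described in Jeff's quoted reply (resample the data, take the median each time, then read off the 95% interval) can be sketched as follows. This is a plain percentile bootstrap written for illustration: the function name, the seed argument and the use of np.random.default_rng are choices made here, not necessarily what boxplot does internally, so the interval will not match the plotted notch exactly.

```python
import numpy as np


def bootstrap_median_ci(data, n_boot=5000, confidence=95.0, seed=None):
    """Percentile-bootstrap confidence interval for the median of data."""
    d = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    medians = np.empty(n_boot)
    for i in range(n_boot):
        # Resample with replacement, same size as the original sample.
        resample = d[rng.integers(0, d.size, size=d.size)]
        medians[i] = np.median(resample)
    tail = (100.0 - confidence) / 2.0
    lower, upper = np.percentile(medians, [tail, 100.0 - tail])
    return lower, upper
```

The loop keeps memory use flat, which matters for inputs as large as the 400,000-value lists mentioned in the question, at the cost of taking a while for n_boot=5000.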

Re: [Matplotlib-users] problem with png image

From: Petro <x....@gm...> - 2012-08-22 14:02:55

Michael Droettboom <md...@st...>
writes:

> Can you try the GtkAgg backend instead and confirm the bug isn't there?  
> The "pure" Gtk backend doesn't see a lot of use these days and isn't 
> very well tested.
>
> Mike
>
Thanks. It solved the problem.

Re: [Matplotlib-users] problem with png image

From: Michael D. <md...@st...> - 2012-08-22 12:28:43

Can you try the GtkAgg backend instead and confirm the bug isn't there?  
The "pure" Gtk backend doesn't see a lot of use these days and isn't 
very well tested.

Mike

On 08/22/2012 08:17 AM, Petro Khoroshyy wrote:
> Damon McDougall
> <dam...@gm...> writes:
>
>> On Wed, Aug 22, 2012 at 11:28:54AM +0200, Petro wrote:
>>> Hi list.
>>> I generate some png images using matplotlib, and get very different
>>> results depending on figuresize
>>> __________________________________________________________________
>>>    from pylab import figure, plot
>>>    import pylab as plt
>>>    import numpy as np
>>>    figure()
>>>    plt.subplot(2,1,1)
>>>    plot(np.random.rand(10),'o')
>>>    plt.subplot(2,1,2)
>>>    plot(np.random.rand(10),'o')
>>>    pic_name='fit_rates1.png'
>>>    path_name='/home/petro/tmp/'
>>>    plt.savefig(path_name + pic_name)
>>> __________________________________________________________________
>>>
>>> the code above generates the following image:
>>> https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
>>>
>>> now if I increase a figure  size parameter:
>>> __________________________________________________________________
>>>    from pylab import figure, plot
>>>    import pylab as plt
>>>    import numpy as np
>>>    plt.ioff()
>>>    from matplotlib import rcParams
>>>    golden_mean = (np.sqrt(5)-1.0)/2.0    # Aesthetic ratio
>>>    fig_width = 5.6  # width in inches
>>>    fig_height = fig_width*golden_mean    # height in inches
>>>    rcParams['figure.figsize']=fig_width, fig_height*3
>>>    figure()
>>>    plt.subplot(2,1,1)
>>>    plot(np.random.rand(10),'o')
>>>    plt.subplot(2,1,2)
>>>    plot(np.random.rand(10),'o')
>>>    pic_name='fit_rates2.png'
>>>    path_name='/home/petro/tmp/'
>>>    plt.savefig(path_name + pic_name)
>>>    
>> What backend are you using?
>>
>> print plt.get_backend()
> It outputs GTK.

Re: [Matplotlib-users] problem with png image

From: Petro K. <kho...@gm...> - 2012-08-22 12:17:53

Damon McDougall
<dam...@gm...> writes:

> On Wed, Aug 22, 2012 at 11:28:54AM +0200, Petro wrote:
>> Hi list.
>> I generate some png images using matplotlib, and get very different
>> results depending on figuresize
>> __________________________________________________________________
>>   from pylab import figure, plot
>>   import pylab as plt 
>>   import numpy as np
>>   figure()
>>   plt.subplot(2,1,1)
>>   plot(np.random.rand(10),'o')
>>   plt.subplot(2,1,2)
>>   plot(np.random.rand(10),'o')
>>   pic_name='fit_rates1.png'
>>   path_name='/home/petro/tmp/'
>>   plt.savefig(path_name + pic_name) 
>> __________________________________________________________________
>> 
>> the code above generates the following image:
>> https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
>> 
>> now if I increase a figure  size parameter:
>> __________________________________________________________________
>>   from pylab import figure, plot
>>   import pylab as plt 
>>   import numpy as np
>>   plt.ioff()
>>   from matplotlib import rcParams 
>>   golden_mean = (np.sqrt(5)-1.0)/2.0    # Aesthetic ratio
>>   fig_width = 5.6  # width in inches
>>   fig_height = fig_width*golden_mean    # height in inches
>>   rcParams['figure.figsize']=fig_width, fig_height*3
>>   figure()
>>   plt.subplot(2,1,1)
>>   plot(np.random.rand(10),'o')
>>   plt.subplot(2,1,2)
>>   plot(np.random.rand(10),'o')
>>   pic_name='fit_rates2.png'
>>   path_name='/home/petro/tmp/'
>>   plt.savefig(path_name + pic_name) 
>>   
>
> What backend are you using?
>
> print plt.get_backend()

It outputs GTK.

Re: [Matplotlib-users] problem with png image

From: Damon M. <dam...@gm...> - 2012-08-22 10:29:30

On Wed, Aug 22, 2012 at 11:28:54AM +0200, Petro wrote:
> Hi list.
> I generate some png images using matplotlib, and get very different
> results depending on figuresize
> __________________________________________________________________
>   from pylab import figure, plot
>   import pylab as plt 
>   import numpy as np
>   figure()
>   plt.subplot(2,1,1)
>   plot(np.random.rand(10),'o')
>   plt.subplot(2,1,2)
>   plot(np.random.rand(10),'o')
>   pic_name='fit_rates1.png'
>   path_name='/home/petro/tmp/'
>   plt.savefig(path_name + pic_name) 
> __________________________________________________________________
> 
> the code above generates the following image:
> https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png
> 
> now if I increase a figure  size parameter:
> __________________________________________________________________
>   from pylab import figure, plot
>   import pylab as plt 
>   import numpy as np
>   plt.ioff()
>   from matplotlib import rcParams 
>   golden_mean = (np.sqrt(5)-1.0)/2.0    # Aesthetic ratio
>   fig_width = 5.6  # width in inches
>   fig_height = fig_width*golden_mean    # height in inches
>   rcParams['figure.figsize']=fig_width, fig_height*3
>   figure()
>   plt.subplot(2,1,1)
>   plot(np.random.rand(10),'o')
>   plt.subplot(2,1,2)
>   plot(np.random.rand(10),'o')
>   pic_name='fit_rates2.png'
>   path_name='/home/petro/tmp/'
>   plt.savefig(path_name + pic_name) 
>   

What backend are you using?

print plt.get_backend()

-- 
Damon McDougall
http://www.damon-is-a-geek.com
B2.39
Mathematics Institute
University of Warwick
Coventry
West Midlands
CV4 7AL
United Kingdom

[Matplotlib-users] problem with png image

From: Petro <x....@gm...> - 2012-08-22 09:29:19

Hi list.
I generate some PNG images using matplotlib, and get very different
results depending on the figure size.
__________________________________________________________________
  from pylab import figure, plot
  import pylab as plt 
  import numpy as np
  figure()
  plt.subplot(2,1,1)
  plot(np.random.rand(10),'o')
  plt.subplot(2,1,2)
  plot(np.random.rand(10),'o')
  pic_name='fit_rates1.png'
  path_name='/home/petro/tmp/'
  plt.savefig(path_name + pic_name) 
__________________________________________________________________

The code above generates the following image:
https://lh3.googleusercontent.com/-107Ducz_CA0/UDShKMtejtI/AAAAAAAACls/YOeahS3tQA8/s400/fit_rates1.png

Now, if I increase the figure size parameter:
__________________________________________________________________
  from pylab import figure, plot
  import pylab as plt 
  import numpy as np
  plt.ioff()
  from matplotlib import rcParams 
  golden_mean = (np.sqrt(5)-1.0)/2.0    # Aesthetic ratio
  fig_width = 5.6  # width in inches
  fig_height = fig_width*golden_mean    # height in inches
  rcParams['figure.figsize']=fig_width, fig_height*3
  figure()
  plt.subplot(2,1,1)
  plot(np.random.rand(10),'o')
  plt.subplot(2,1,2)
  plot(np.random.rand(10),'o')
  pic_name='fit_rates2.png'
  path_name='/home/petro/tmp/'
  plt.savefig(path_name + pic_name) 
  
__________________________________________________________________
The result looks strange, like this:
https://lh5.googleusercontent.com/-4GRQxuRFvh4/UDSiRrNy59I/AAAAAAAACmA/Kho3prHFpUU/s640/fit_rates2.png
Has anyone experienced behaviour like this?
Thanks.
Petro

Re: [Matplotlib-users] boxplot -- how (more)

From: Virgil S. <vs...@it...> - 2012-08-22 09:23:55

On 21-Aug-2012 17:59, Paul Hobson wrote:
> On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs...@it...> wrote:
>> On 21-Aug-2012 17:50, Paul Hobson wrote:
>>> On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
>>>> In reference to my previous email.
>>>>
>>>> How can I find the outliers (sample points beyond the whiskers) in the
>>>> data
>>>> used for the boxplot?
>>>>
>>>> Here is a code snippet that shows how it was used for the timings data (a
>>>> list
>>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>>>      ...
>>>>      ...
>>>>      ...
>>>>      # Box Plots
>>>>      plt.subplot(2,1,2)
>>>>      timings = [y1,y2,y3,y4]
>>>>      pos = np.array(range(len(timings)))+1
>>>>      bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>>>                       positions=pos, notch=1, bootstrap=5000 )
>>>>
>>>>      plt.xlabel('Algorithm')
>>>>      plt.ylabel('Execution time (sec)')
>>>>      plt.ylim(0.9*ymin,1.1*ymax)
>>>>
>>>>      plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>>>>      plt.setp(bp['fliers'], markersize=3.0)
>>>>      plt.title('Box plots (%4d trials)' %(n))
>>>>      plt.show()
>>>>      ...
>>>>      ...
>>>>      ...
>>>>
>>>> Again my questions:
>>>> 1) How to get the value of the median?
>>>> 2) How to find the outliers (outside the whiskers)?
>>>> 3) How to find the width of the notch?
>>> Virgil, the objects stuffed inside the `bp` dictionary should have
>>> methods to retrieve their values. Let's see:
>>>
>>> In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
>>>
>>> In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
>>>
>>> In [37]: # Question 1
>>>       ...: print('medians')
>>>       ...: for n, median in enumerate(bp['medians']):
>>>       ...:     print('%d: %f' % (n, median.get_ydata()[0]))
>>>       ...:
>>> medians
>>> 0: 6.339692
>>> 1: 3.449320
>>> 2: 4.503706
>>>
>>> In [38]: # Question 2
>>>       ...: print('fliers')
>>>       ...: for n in range(0, len(bp['fliers']), 2):
>>>       ...:     print('%d: upper outliers = \t' % (n/2,))
>>>       ...:     print(bp['fliers'][n].get_ydata())
>>>       ...:     print('\n%d: lower outliers = \t' % (n/2,))
>>>       ...:     print(bp['fliers'][n+1].get_ydata())
>>>       ...:     print('\n')
>>>       ...:
>> You had no outliers!
>>
>>> In [39]: # Question 3
>>>       ...: print('Confidence Intervals')
>>>       ...: for n, box in enumerate(bp['boxes']):
>>>       ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
>>>       ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
>>>       ...:
>>> Confidence Intervals
>>> 0: lower CI: 1.760701
>>> 0: upper CI: 10.102221
>>> 1: lower CI: 1.626386
>>> 1: upper CI: 5.601927
>>> 2: lower CI: 2.173173
>>>
>>> Hope that helps,
>>> -paul
>> Just what I was looking for Paul! Thanks very much.
>>
>> One final question --- Where can I find the documentation that answers my
>> questions and gives more details about the equations used for the width of
>> notch. etc.?
>>
>> Thanks again :-)
> That should all be in the boxplot docstring. Do you use ipython? If
> not, you should :)
>
> if so, just do `plt.boxplot?` at the ipython terminal and it'll show up.
> -paul
I still have a problem...
Let me show the updated code snippet again
   ...
   ...
   ...
   # Box Plots
   iplt += 1
   plt.figure(iplt)
   timings = [ya[0],ya[1],ya[2],ya[3]]
   pos = np.array(range(len(timings)))+1
   bp = plt.boxplot( timings, sym='k+', patch_artist=True,
                  positions=pos, notch=1, bootstrap=5000 )
   print ('medians')
   for nn,median in enumerate(bp['medians']):
       print('%d: %f' %(nn,median.get_ydata()[0]))

   print('fliers')
   for nn in range(0, len(bp['fliers']), 2):
       print('%d: upper outliers = \t' % (nn/2,))
       print(bp['fliers'][nn].get_ydata())
       print('\n%d: lower outliers = \t' % (nn/2,))
       print(bp['fliers'][nn+1].get_ydata())
       print('\n')

   print('Confidence Intervals')
   for nn, box in enumerate(bp['boxes']):
       print('%d: lower CI: %f' % (nn, box.get_ydata()[2]))  # <--- FAILS!
       print('%d: upper CI: %f' % (nn, box.get_ydata()[4]))
   ...
   ...
   ...

Medians and fliers work perfectly, but I get the following error message when
trying to access the confidence intervals:

AttributeError: 'PathPatch' object has no attribute 'get_ydata'

Note, I am using boxplot with 4 sets of data and I am using matplotlib vers. 1.1.0.

Any suggestions on how to fix this problem?
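
A possible workaround, sketched under assumptions rather than taken from the thread: with patch_artist=True the boxes are PathPatch objects, which have no get_ydata(), but their outline is still available through get_path().vertices. The helper name `box_ydata` and the random test data are hypothetical. Indices 2 and 4 are carried over from Paul's reply and depend on how matplotlib orders the vertices of a notched box, so check them on your version. The cross-check with `cbook.boxplot_stats` needs a matplotlib release newer than 1.1.0, and bootstrap is left out here so the numbers are reproducible.

```python
import matplotlib
matplotlib.use("Agg")  # file-only backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cbook


def box_ydata(box):
    """Return the y-coordinates of a box artist (Line2D or PathPatch)."""
    if hasattr(box, "get_ydata"):           # Line2D when patch_artist=False
        return np.asarray(box.get_ydata())
    return box.get_path().vertices[:, 1]    # PathPatch when patch_artist=True


# Illustrative stand-in for the four timing data sets.
rng = np.random.default_rng(0)
timings = [rng.lognormal(mean=1.25, sigma=1.35, size=200) for _ in range(4)]

fig, ax = plt.subplots()
bp = ax.boxplot(timings, notch=True, patch_artist=True)

print('Confidence Intervals')
notches = []
for nn, box in enumerate(bp['boxes']):
    y = box_ydata(box)
    notches.append((y[2], y[4]))            # assumed: lower and upper notch edge
    print('%d: lower CI: %f' % (nn, y[2]))
    print('%d: upper CI: %f' % (nn, y[4]))

# Independent cross-check: the statistics matplotlib computes for each box.
stats = cbook.boxplot_stats(timings)
```

If the two sets of numbers disagree on your installation, the values in `stats` (keys 'cilo' and 'cihi') are the safer ones to use.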

[Matplotlib-users] Problems with rasterizing multiple elements

From: Peter S. J. <pet...@gm...> - 2012-08-21 17:45:31

Hi Everyone,

I'm having problems rasterizing many lines in a plot with the
rasterized=True keyword when saving to PDF.
Some version info:
matplotlib version 1.1.1rc
ubuntu 12.04
python 2.7.3


Here's a basic example that demonstrates my problem:
# Import matplotlib to create a pdf document
import matplotlib
matplotlib.use('Agg')
from matplotlib.backends.backend_pdf import PdfPages
pdf = PdfPages('rasterized_test.pdf')

import matplotlib.pylab as plt

# some test data
import numpy as np
ts = np.linspace(0,2*np.pi,100) * np.ones((200,100))
ts += (np.linspace(0, np.pi, 200)[np.newaxis] * np.ones((100,200))).T
ys = np.sin(ts)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(ts[0], ys.T, color='r', lw=0.5, alpha=0.5, rasterized=True)
pdf.savefig()

pdf.close()



Essentially, I have a lot (200 in this case) of closely overlapping lines
which makes the resulting figure (not rasterized) overly difficult to load.
I would like to rasterize these lines, such that the axis labels (and other
elements of the plot, not shown) remain vectors while the solution
trajectories are flattened to a single raster background. However, using
the code above, the image still takes a long time to load since each
trajectory is independently rasterized, resulting in multiple layers. (If I
open the resulting pdf with a program like inkscape, I can manipulate each
trajectory independently.)

Is it possible to flatten all of the rasterized elements into a single
layer, so the pdf size would be greatly reduced?

Thanks,
--Peter
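
One approach that might help, sketched here as an assumption and not confirmed anywhere in this thread: instead of passing rasterized=True to every line, give the lines a zorder below a threshold set with ax.set_rasterization_zorder(), so that all artists below the threshold are rasterized together. The zorder values and the temporary output path are illustrative choices, and whether the PDF shrinks enough should be checked on your own setup.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.backends.backend_pdf import PdfPages

# Same test data as in the post, built with broadcasting: shape (200, 100).
ts = np.linspace(0, 2 * np.pi, 100) + np.linspace(0, np.pi, 200)[:, np.newaxis]
ys = np.sin(ts)

fig, ax = plt.subplots()
# Put all trajectories below the rasterization threshold.
lines = ax.plot(ts[0], ys.T, color="r", lw=0.5, alpha=0.5, zorder=-1)
# Artists with zorder below 0 are rasterized; axes, ticks and labels stay vector.
ax.set_rasterization_zorder(0)

out_path = os.path.join(tempfile.gettempdir(), "rasterized_test.pdf")
with PdfPages(out_path) as pdf:
    pdf.savefig(fig)
```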

Re: [Matplotlib-users] boxplot -- how (more)

From: Paul H. <pmh...@gm...> - 2012-08-21 15:59:09

On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs...@it...> wrote:
> On 21-Aug-2012 17:50, Paul Hobson wrote:
>>
>> On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
>>>
>>> In reference to my previous email.
>>>
>>> How can I find the outliers (sample points beyond the whiskers) in the
>>> data
>>> used for the boxplot?
>>>
>>> Here is a code snippet that shows how it was used for the timings data (a
>>> list
>>> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>>>     ...
>>>     ...
>>>     ...
>>>     # Box Plots
>>>     plt.subplot(2,1,2)
>>>     timings = [y1,y2,y3,y4]
>>>     pos = np.array(range(len(timings)))+1
>>>     bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>>>                      positions=pos, notch=1, bootstrap=5000 )
>>>
>>>     plt.xlabel('Algorithm')
>>>     plt.ylabel('Execution time (sec)')
>>>     plt.ylim(0.9*ymin,1.1*ymax)
>>>
>>>     plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>>>     plt.setp(bp['fliers'], markersize=3.0)
>>>     plt.title('Box plots (%4d trials)' %(n))
>>>     plt.show()
>>>     ...
>>>     ...
>>>     ...
>>>
>>> Again my questions:
>>> 1) How to get the value of the median?
>>> 2) How to find the outliers (outside the whiskers)?
>>> 3) How to find the width of the notch?
>>
>> Virgil, the objects stuffed inside the `bp` dictionary should have
>> methods to retrieve their values. Let's see:
>>
>> In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
>>
>> In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
>>
>> In [37]: # Question 1
>>      ...: print('medians')
>>      ...: for n, median in enumerate(bp['medians']):
>>      ...:     print('%d: %f' % (n, median.get_ydata()[0]))
>>      ...:
>> medians
>> 0: 6.339692
>> 1: 3.449320
>> 2: 4.503706
>>
>> In [38]: # Question 2
>>      ...: print('fliers')
>>      ...: for n in range(0, len(bp['fliers']), 2):
>>      ...:     print('%d: upper outliers = \t' % (n/2,))
>>      ...:     print(bp['fliers'][n].get_ydata())
>>      ...:     print('\n%d: lower outliers = \t' % (n/2,))
>>      ...:     print(bp['fliers'][n+1].get_ydata())
>>      ...:     print('\n')
>>      ...:
>
> You had no outliers!
>
>>
>> In [39]: # Question 3
>>      ...: print('Confidence Intervals')
>>      ...: for n, box in enumerate(bp['boxes']):
>>      ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
>>      ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
>>      ...:
>> Confidence Intervals
>> 0: lower CI: 1.760701
>> 0: upper CI: 10.102221
>> 1: lower CI: 1.626386
>> 1: upper CI: 5.601927
>> 2: lower CI: 2.173173
>>
>> Hope that helps,
>> -paul
>
> Just what I was looking for Paul! Thanks very much.
>
> One final question --- Where can I find the documentation that answers my
> questions and gives more details about the equations used for the width of
> notch. etc.?
>
> Thanks again :-)

That should all be in the boxplot docstring. Do you use ipython? If
not, you should :)

if so, just do `plt.boxplot?` at the ipython terminal and it'll show up.
-paul

Re: [Matplotlib-users] boxplot -- how (more)

From: Paul H. <pmh...@gm...> - 2012-08-21 15:55:22

On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs...@it...> wrote:
> In reference to my previous email.
>
> How can I find the outliers (sample points beyond the whiskers) in the data
> used for the boxplot?
>
> Here is a code snippet that shows how it was used for the timings data (a list
> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
>    ...
>    ...
>    ...
>    # Box Plots
>    plt.subplot(2,1,2)
>    timings = [y1,y2,y3,y4]
>    pos = np.array(range(len(timings)))+1
>    bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>                     positions=pos, notch=1, bootstrap=5000 )
>
>    plt.xlabel('Algorithm')
>    plt.ylabel('Execution time (sec)')
>    plt.ylim(0.9*ymin,1.1*ymax)
>
>    plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>    plt.setp(bp['fliers'], markersize=3.0)
>    plt.title('Box plots (%4d trials)' %(n))
>    plt.show()
>    ...
>    ...
>    ...
>
> Again my questions:
> 1) How to get the value of the median?
> 2) How to find the outliers (outside the whiskers)?
> 3) How to find the width of the notch?

Oops. Here's my reply, this time to the whole list.
Virgil, the objects stuffed inside the `bp` dictionary should have
methods to retrieve their values. Let's see:

In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))

In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)

In [37]: # Question 1
    ...: print('medians')
    ...: for n, median in enumerate(bp['medians']):
    ...:     print('%d: %f' % (n, median.get_ydata()[0]))
    ...:
medians
0: 6.339692
1: 3.449320
2: 4.503706

In [38]: # Question 2
    ...: print('fliers')
    ...: for n in range(0, len(bp['fliers']), 2):
    ...:     print('%d: upper outliers = \t' % (n/2,))
    ...:     print(bp['fliers'][n].get_ydata())
    ...:     print('\n%d: lower outliers = \t' % (n/2,))
    ...:     print(bp['fliers'][n+1].get_ydata())
    ...:     print('\n')
    ...:

In [39]: # Question 3
    ...: print('Confidence Intervals')
    ...: for n, box in enumerate(bp['boxes']):
    ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
    ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
    ...:
Confidence Intervals
0: lower CI: 1.760701
0: upper CI: 10.102221
1: lower CI: 1.626386
1: upper CI: 5.601927
2: lower CI: 2.173173

Hope that helps,
-paul

Re: [Matplotlib-users] boxplot -- how (more)

From: Jeffrey B. <jbl...@al...> - 2012-08-21 15:52:39

On Aug 21, 2012, at 10:58 AM, Virgil Stokes wrote:

> In reference to my previous email.
>
> How can I find the outliers (sample points beyond the whiskers) in  
> the data
> used for the boxplot?
>
> Here is a code snippet that shows how it was used for the timings  
> data (a list
> of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data  
> values),
>    ...
>    ...
>    ...
>    # Box Plots
>    plt.subplot(2,1,2)
>    timings = [y1,y2,y3,y4]
>    pos = np.array(range(len(timings)))+1
>    bp = plt.boxplot( timings, sym='k+', patch_artist=True,
>                     positions=pos, notch=1, bootstrap=5000 )
>
>    plt.xlabel('Algorithm')
>    plt.ylabel('Execution time (sec)')
>    plt.ylim(0.9*ymin,1.1*ymax)
>
>    plt.setp(bp['whiskers'], color='k',  linestyle='-' )
>    plt.setp(bp['fliers'], markersize=3.0)
>    plt.title('Box plots (%4d trials)' %(n))
>    plt.show()
>    ...
>    ...
>    ...
>
> Again my questions:
> 1) How to get the value of the median?

This is easily calculated from your data. Numpy will even do it for  
you: np.median(y) for each sublist y in timings

> 2) How to find the outliers (outside the whiskers)?

From the boxplot documentation: the whiskers extend to the most  
extreme data point within distance X of the bottom or top of the box,  
where X is 1.5 times the extent of the box. Any points more extreme  
than that are the outliers. The box itself of course extends from the  
25th percentile to the 75th percentile of your data. Again, you can  
easily calculate these values from your data.

> 3) How to find the width of the notch?

Again, from the docs: with bootstrap=5000, it calculates the width of  
the notch by bootstrap resampling your data (the timings array) 5000  
times and finding the 95% confidence interval of the median, and uses  
that as the notch width. You can redo that yourself pretty easily.  
Here is some bootstrap code for you to adapt:
http://mail.scipy.org/pipermail/scipy-user/2009-July/021704.html

I encourage you to read the documentation! This page is very useful  
for reference:
http://matplotlib.sourceforge.net/api/pyplot_api.html

-Jeff
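
The bootstrap step described above could look roughly like this. It is a plain percentile bootstrap written for illustration, not necessarily the exact procedure matplotlib uses internally; the function name `bootstrap_median_ci`, the seed, and the sample data are assumptions.

```python
import numpy as np


def bootstrap_median_ci(data, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median of `data`."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    medians = np.empty(n_boot)
    # Resample one replicate at a time to keep memory use low for large data.
    for i in range(n_boot):
        resample = rng.choice(data, size=data.size, replace=True)
        medians[i] = np.median(resample)
    lo, hi = np.percentile(medians, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi


# Illustrative data, similar in shape to the lognormal example in this thread.
sample = np.random.default_rng(1).lognormal(mean=1.25, sigma=1.35, size=200)
ci_low, ci_high = bootstrap_median_ci(sample, n_boot=1000)
print("median: %f, 95%% CI: (%f, %f)" % (np.median(sample), ci_low, ci_high))
```

With 400,000 values per data set and 5000 resamples this loop will take a while, so it may be worth trying a smaller n_boot first.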

[Matplotlib-users] boxplot -- how (more)

From: Virgil S. <vs...@it...> - 2012-08-21 14:58:28

In reference to my previous email.

How can I find the outliers (sample points beyond the whiskers) in the data 
used for the boxplot?

Here is a code snippet that shows how it was used for the timings data (a list 
of 4 sublists (y1,y2,y3,y4), each containing 400,000 real data values),
   ...
   ...
   ...
   # Box Plots
   plt.subplot(2,1,2)
   timings = [y1,y2,y3,y4]
   pos = np.array(range(len(timings)))+1
   bp = plt.boxplot( timings, sym='k+', patch_artist=True,
                    positions=pos, notch=1, bootstrap=5000 )

   plt.xlabel('Algorithm')
   plt.ylabel('Execution time (sec)')
   plt.ylim(0.9*ymin,1.1*ymax)

   plt.setp(bp['whiskers'], color='k',  linestyle='-' )
   plt.setp(bp['fliers'], markersize=3.0)
   plt.title('Box plots (%4d trials)' %(n))
   plt.show()
   ...
   ...
   ...

Again my questions:
1) How to get the value of the median?
2) How to find the outliers (outside the whiskers)?
3) How to find the width of the notch?
