From: Paul T. <pau...@gm...> - 2012-09-28 04:51:28
|
On 9/26/12 12:31 PM, Benjamin Root wrote:
>
>
> On Wed, Sep 26, 2012 at 12:10 PM, Paul Tremblay
> <pau...@gm... <mailto:pau...@gm...>> wrote:
>
> Thanks. I know when doing 8/9 in python 2.x you get 0. With python
> 3 you get a decimal (Hooray, Python 3!).
>
> I ran the script I submitted with python 3. Do I need to change
> the defects and totals from integers to floats to make my chart
> work universally?
>
> P.
>
>
Here is my latest version of the Pareto chart. If the developers feel it
is worthy, they can include it in the gallery. Otherwise, it will have
to remain in these threads, useful for anyone who needs such a chart,
searches the web, and ends up here.
import matplotlib.pyplot as plt
import numpy as np

# the data to plot
defects = [32.0, 22.0, 15.0, 5.0, 2.0]
labels = ['vertical', 'horizontal', 'behind', 'left area', 'other']

the_sum = sum(defects)           # i.e., 32 + 22 + 15 + 5 + 2
the_cumsum = np.cumsum(defects)  # 32, 32 + 22, 32 + 22 + 15, ...
percent = the_cumsum / the_sum * 100  # cumulative percentage of all defects

ind = np.arange(len(defects))  # the x locations for the groups
width = .98  # width of the bars; a width of 1 means no space between bars
x = ind + .5 * width  # the middle of each bar

fig = plt.figure()          # create a figure
ax1 = fig.add_subplot(111)  # and a subplot
ax2 = ax1.twinx()           # create a second y axis sharing the same x axis

# align='edge' anchors each bar's left side at ind, which is what the
# x = ind + .5 * width tick positions above assume
rects1 = ax1.bar(ind, defects, width=width, align='edge')
line, = ax2.plot(x, percent)  # draw the cumulative-percentage line

# match the two y scales: the bar axis runs from 0 to the total defect
# count and the line axis from 0 to 100, so the line ends at the top corner
ax1.set_ylim(0, the_sum)
ax2.set_ylim(0, 100)

ax1.set_xticks(x)             # put the ticks at the middle of the bars
ax1.set_xticklabels(labels)   # label each bar
ax1.set_ylabel('Defects')     # left y axis label
ax2.set_ylabel('Percentage')  # right y axis label
plt.show()
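On the 8/9 division point quoted above: the posted script already uses
float literals, and NumPy's cumsum and division return floating-point
results, so the chart comes out the same under Python 2 and Python 3.
A minimal sketch of the integer-input case, assuming plain int counts
and the usual future import, just to show the percentages still come
out right:

from __future__ import division  # Python 2: make / do true division, as in Python 3

import numpy as np

defects = [32, 22, 15, 5, 2]        # same counts as above, but plain ints
the_sum = sum(defects)              # 76
the_cumsum = np.cumsum(defects)     # integer array: 32, 54, 69, 74, 76
percent = the_cumsum / the_sum * 100  # floats either way: roughly 42.1, 71.1, 90.8, 97.4, 100.0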
|
|
From: Paul T. <pau...@gm...> - 2012-09-28 04:47:07
|
On 9/26/12 10:15 AM, Michael Droettboom wrote:
> On 09/26/2012 09:33 AM, Benjamin Root wrote:
>>
>> On Wed, Sep 26, 2012 at 9:10 AM, Michael Droettboom
>> <md...@st... <mailto:md...@st...>> wrote:
>>
>> On 09/26/2012 12:28 AM, jos...@gm...
>> <mailto:jos...@gm...> wrote:
>> > On Wed, Sep 26, 2012 at 12:05 AM, Paul Tremblay
>> > <pau...@gm... <mailto:pau...@gm...>> wrote:
>> >> In R, there are many default data sets one can use to both
>> >> illustrate code and explore the scripting language. Instead of
>> >> having to fake data, one can pull from meaningful data sets,
>> >> created in the real world. For example, this one liner actually
>> >> produces a plot:
>> >>
>> >> plot(mtcars$hp~mtcars$mpg)
>> >>
>> >> where mtcars refers to a built-in data set taken from Motor
>> >> Trend Magazine. I don't believe matplotlib has anything
>> >> similar. I have started to download some of the R data sets and
>> >> store them as pickles for my own use. Does anyone else have any
>> >> interest in creating a repository for these data sets or
>> >> otherwise sharing them in some way?
>> > Vincent converted several R datasets back to csv, that can be
>> > easily loaded from the web with, for example, pandas.
>> > http://vincentarelbundock.github.com/Rdatasets/
>> > The collection is a bit random.
>> >
>> > statsmodels has some datasets that we use for examples and tests
>> > http://statsmodels.sourceforge.net/devel/datasets/index.html
>> > We were always a bit slow with adding datasets because we were
>> > too cautious about licensing issues. But R seems to get away
>> > with considering most datasets to be public domain. We keep
>> > adding datasets to statsmodels as we need them for new models.
>> >
>> > The machine learning packages like sklearn have packaged the
>> > typical machine learning datasets.
>> >
>> > If you are interested, you could join up with statsmodels or
>> > with Vincent to expand on what's available.
>> >
>> It seems to me like contributing to (rather than duplicating) the
>> work of one of these projects would be a great idea. It would also
>> be nice to add functionality in matplotlib to make it easier to
>> download these things as a one-off -- obviously not exactly the
>> same syntax as with R, but ideally with a single function call.
>>
>> Mike
>>
>> We did have such a thing. matplotlib.cbook.get_sample_data(). I
>> think we got rid of it for 1.2.0?
> It was removed because the server side was a moving target and would
> constantly break. It was based on pulling files out of the svn (and
> later git) repository, and sourceforge and github have had a habit
> of changing the urls used to do so. All of the data that was there
> was moved into the main repository and is now installed alongside
> matplotlib, so get_sample_data() still works.
>
> See this PR: https://github.com/matplotlib/matplotlib/pull/498
>
> I should have mentioned it earlier, that we do have a very small set
> of standard data sets included there -- but these other projects
> linked to above are much better and more extensive. If we can rely
> on them to have static urls over time, I think they are much better
> options than anything matplotlib has had in the past.
>
> Mike

Drawing on other posts, would it be conceivable to download both the R
sets and the statsmodels sets and include them in
site-packages/matplotlib/mpl-data/sample_data/?
I understand that pulling data sets not in this directory creates
problems because of moving URLs, but why even attempt a web pull when
the data can exist in a reliable place? I suppose one might raise
reasonable objections to my suggestion, but at any rate, it doesn't
seem I can add anything else to either project, since they both seem
complete. I see only one small though significant problem with the R
data sets: they leave out the header of the first column because of the
structure of R data frames, and Python needs this header.

Paul |
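As a side note on the R one-liner quoted in the thread,
plot(mtcars$hp~mtcars$mpg): something close to it can be had by pulling
the mtcars csv from the Rdatasets site linked above and plotting it
with matplotlib. The csv path below is an assumption about how that
site lays out its files, not something confirmed in the thread, and the
sketch needs pandas installed:

import pandas as pd
import matplotlib.pyplot as plt

# assumed layout of the Rdatasets site: csv/<package>/<dataset>.csv
url = "http://vincentarelbundock.github.com/Rdatasets/csv/datasets/mtcars.csv"

# index_col=0 absorbs the unnamed row-name column that R data frames
# produce, so the remaining columns line up with their headers
mtcars = pd.read_csv(url, index_col=0)

fig, ax = plt.subplots()
ax.scatter(mtcars["mpg"], mtcars["hp"])  # the rough equivalent of plot(mtcars$hp~mtcars$mpg)
ax.set_xlabel("mpg")
ax.set_ylabel("hp")
plt.show()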