
I have some event times in a list and I would like to plot an exponentially weighted moving average of them. I can do this using the following code.

import numpy as np
import matplotlib.pyplot as plt

print("Code running")
a = 0.01
l = [3.0,7.0,10.0,20.0,200.0]  # event times
y = np.zeros(1000)
for item in l:
    y[int(item)] = 1  # array indices must be integers
s = np.zeros(1000)
x = np.linspace(0,1000,1000)
for i in range(1000):
    s[i] = a*y[i-1]+(1-a)*s[i-1]
plt.plot(x, s)
plt.show()

However, this is clearly a horrible way to use Python. What's the right way to do this? Is it possible to do it without making all these extra sparse arrays?

The output should look like this.

(Image: plot of the expected output)

3 Answers

1

Pandas comes to mind for this task:

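import numpy as np
import pandas as pd

l = [3.0,7.0,10.0,20.0,200.0]
s = pd.Series(np.ones_like(l), index=l)
y = s.reindex(range(1000), fill_value=0)
# pd.ewma was removed from pandas; Series.ewm(...).mean() is its replacement
y.ewm(span=199).mean().plot()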

The span n = 199 is related to your parameter a = 0.01 by a = 2/(n+1), i.e. n = 2/a - 1 = 199. Result: (image: the resulting plot)
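To make the relation between the smoothing factor and the span concrete, here is a minimal sketch of the conversions. The helper names are hypothetical (they are not pandas functions); the formulas are the ones pandas documents for its span and com (center of mass) parameters.

```python
# Hypothetical helpers, not part of pandas: convert between the
# smoothing factor alpha and the "span" / "com" parametrizations.

def alpha_from_span(span):
    # alpha = 2 / (span + 1)
    return 2.0 / (span + 1.0)

def span_from_alpha(alpha):
    # inverse of the formula above: span = 2 / alpha - 1
    return 2.0 / alpha - 1.0

def alpha_from_com(com):
    # alpha = 1 / (1 + com)
    return 1.0 / (1.0 + com)

a = 0.01                    # the question's smoothing factor
span = span_from_alpha(a)   # approximately 199
```

Note that a span of 199 and a center of mass of 199 are not the same thing: the latter corresponds to a = 0.005. In current pandas you can sidestep the conversion by passing the factor directly, e.g. y.ewm(alpha=a, adjust=False).mean(), where adjust=False selects the plain recursion used in the question rather than the normalized weights.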


7 Comments

I have never used pandas. What do you do to actually see the plot?
Well it just pops up in the iPython notebook, nothing else to do. Otherwise see here stackoverflow.com/questions/16522380/….
Thank you for the edit but the graph still doesn't look anything like mine. Does it look the same when you test it?
@felix, see my edit. I don't understand your y variable, I thought you had daily values in l.
Ah no. l is the list of times when an event occurs or in other words when a "plus one" event occurs. Can pandas somehow use l without having to make y?
0

AFAIK there's not a very good way to do this with numpy or the scipy.sparse module -- the sparse matrices in scipy.sparse are designed to be 2D matrices, and to create one in the first place you'd basically need to use the code you've already written in your first loop (i.e., to set all of the nonzero locations in a sparse matrix), with the additional complexity of always having to specify two index values.

As if that's not bad enough, np.convolve doesn't work with sparse arrays, so you'd still need to write out the computation in your second loop to compute the moving average.

My recommendation, which probably isn't much help if you're looking for a fancy numpy version, is to fall back on Python's excellent support as a general-purpose language:

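import numpy as np
import matplotlib.pyplot as plt

a=0.01
l = set([3, 7, 10, 20, 200])  # event times, as integer indices
s = np.zeros(1000)
for i in range(len(s)):
    # at i == 0, s[i-1] is s[-1], which is still zero
    s[i] = a * int(i-1 in l) + (1-a) * s[i-1]
plt.plot(s)
plt.show()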

Here, I've stored the event index values in l, just as you did, but I used a set to make lookup times O(1). If len(l) isn't very large, you might even be better off with a plain list or tuple, though you'd need to measure it to be sure. Then you can avoid creating the y array and just rely on Iverson's convention to convert the Boolean value i-1 in l into an int. You might not even need the explicit cast, but I find it helpful to be explicit.
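If you want to drop the Python loop as well, the recursion can be unrolled: each event at time e contributes a * (1-a)**(t-1-e) to s[t] for every t > e. The following is a sketch under that observation, not a library routine; the function names are made up for illustration, and the loop version is included only to check the result. It builds a (number of events) x (number of steps) array, so it assumes the event list is short.

```python
import numpy as np

def ewma_from_events(event_times, n_steps, a):
    """Unrolled form of s[i] = a*y[i-1] + (1-a)*s[i-1] (hypothetical helper)."""
    t = np.arange(n_steps)
    events = np.asarray(event_times, dtype=int)
    # lag[k, i] = number of steps since event k was fed into the recursion
    lag = t[None, :] - 1 - events[:, None]
    # an event contributes only once it lies strictly in the past (lag >= 0)
    weights = np.where(lag >= 0, (1.0 - a) ** np.maximum(lag, 0), 0.0)
    return a * weights.sum(axis=0)

def ewma_loop(event_times, n_steps, a):
    """Reference implementation: the explicit loop, starting from s[0] = 0."""
    events = set(int(e) for e in event_times)
    s = np.zeros(n_steps)
    for i in range(1, n_steps):
        s[i] = a * float((i - 1) in events) + (1.0 - a) * s[i - 1]
    return s
```

Calling ewma_from_events([3, 7, 10, 20, 200], 1000, 0.01) should give the same curve as the loop, and the result can be passed straight to plt.plot.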


0

I think you're looking for something like this:

import numpy as np
import matplotlib.pyplot as plt
from scikits.timeseries.lib.moving_funcs import mov_average_expw

l = [ 3.0, 7.0, 10.0, 20.0, 200.0 ]
y = np.zeros(1000)
y[np.asarray(l, dtype=int)] = 1  # indices must be integers
emav = mov_average_expw(y, 199)
plt.plot(emav)
plt.show()

This makes use of mov_average_expw from scikits.timeseries. Check that method's documentation to see how I came up with the span parameter based on your code's a variable.
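If scikits.timeseries is not available in your environment, a similar result can be sketched with NumPy alone by convolving the dense event array with an exponential kernel. This is an illustration that reproduces the recursion from the question; it is not claimed to match mov_average_expw exactly, and the variable names are my own.

```python
import numpy as np

a = 0.01
n = 1000
events = [3, 7, 10, 20, 200]

y = np.zeros(n)
y[events] = 1.0  # dense 0/1 event indicator

# Exponential kernel: weight a*(1-a)**k for an input that is k steps old
kernel = a * (1.0 - a) ** np.arange(n)

# e[t] = a * sum_{j <= t} (1-a)**(t-j) * y[j]
e = np.convolve(y, kernel)[:n]

# The question's recursion uses y[i-1], so its output lags e by one step
s = np.concatenate(([0.0], e[:-1]))
```

The full-length kernel makes this O(n^2) in the worst case, so for long series the explicit loop or a recursive filter is likely the better choice.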

