I've got the following code that takes historical prices for a single asset and calculated forecasts, and computes how you would have fared if you had really invested your money according to the forecast. In financial parlance, it's a back-test.
The main problem is that it's very slow, and I'm not sure what the right strategy is for improving it. I need to run this thousands of times, so an order of magnitude speedup is required.
Where should I begin looking?
import numpy as np
import pandas as pd

class accountCurve():
    def __init__(self, forecasts, prices):
        self.curve = pd.DataFrame(columns=['Capital','Holding','Cash','Trade', 'Position'], dtype=float)
        forecasts.dropna(inplace=True)
        self.curve['Forecast'] = forecasts
        self.curve['Price'] = prices
        # Starting state: all capital in cash, no position.
        self.curve.loc[self.curve.index[0],['Capital', 'Holding', 'Cash', 'Trade', 'Position']] = [10000, 0, 10000, 0, 0]

        # Note: iteritems() was removed in pandas 2.0; use items() there.
        for date, forecast in forecasts.iteritems():
            x=self.curve.loc[date]
            previous = self.curve.shift(1).loc[date]
            # Carry cash and position forward from the previous row, if there is one.
            if previous.isnull()['Cash']==False:
                x['Cash'] = previous['Cash'] - previous['Trade'] * x['Price']
                x['Position'] = previous['Position'] + previous['Trade']
            x['Holding'] = x['Position'] * x['Price']
            x['Capital'] = x['Cash'] + x['Holding']
            x['Trade'] = np.fix(x['Capital']/x['Price'] * x['Forecast']/20) - x['Position']
Edit:
Datasets as requested:
Prices:
import quandl
corn = quandl.get('CHRIS/CME_C2')
prices = corn['Open']
Forecasts:
def ewmac(d):
    columns = pd.Series([2, 4, 8, 16, 32, 64])
    g = lambda x: d.ewm(span = x, min_periods = x*4).mean() - d.ewm(span = x*4, min_periods=x*4).mean()
    f = columns.apply(g).transpose()
    f = f*10/f.abs().mean()
    f.columns = columns
    return f.clip(-20,20)

forecasts=ewmac(prices)
Comments:
for date, forecast in forecasts.iteritems() loop? How to create a Minimal, Complete, and Verifiable example.
If x['Cash'] is nan/null, all the other things become nan too, which is to say they aren't modified from their defaults, so you could have skipped the iteration completely...
so use dropna more effectively outside the loop...
indeed you should loop over curve itself rather than forecasts.
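One way to act on the comments above is to align and clean the data once, outside the loop, and then iterate over plain NumPy arrays instead of looking up DataFrame rows. The sketch below is only an illustration under stated assumptions: `account_curve` and its `initial_capital` parameter are hypothetical names, it takes a single forecast Series (for example one column of the `ewmac` output), and it drops rows where either the forecast or the price is missing, which is slightly stricter than the original class.

```python
import numpy as np
import pandas as pd


def account_curve(forecast, prices, initial_capital=10000.0):
    """Back-test one forecast Series (hypothetical helper, not the original class)."""
    # Align forecast and price on a shared index and drop incomplete rows once.
    data = pd.concat({"Forecast": forecast, "Price": prices}, axis=1).dropna()
    f = data["Forecast"].to_numpy(dtype=float)
    p = data["Price"].to_numpy(dtype=float)
    n = len(data)

    cash = np.empty(n)
    position = np.empty(n)
    holding = np.empty(n)
    capital = np.empty(n)
    trade = np.empty(n)

    # State carried from the previous row; the starting values mirror the
    # original initial row (all cash, no position, no pending trade).
    prev_cash, prev_position, prev_trade = float(initial_capital), 0.0, 0.0
    for i in range(n):
        # Settle the trade decided on the previous row at the current price.
        cash[i] = prev_cash - prev_trade * p[i]
        position[i] = prev_position + prev_trade
        holding[i] = position[i] * p[i]
        capital[i] = cash[i] + holding[i]
        # Same sizing rule as the original: truncate toward zero with np.fix.
        trade[i] = np.fix(capital[i] / p[i] * f[i] / 20) - position[i]
        prev_cash, prev_position, prev_trade = cash[i], position[i], trade[i]

    return pd.DataFrame(
        {
            "Capital": capital,
            "Holding": holding,
            "Cash": cash,
            "Trade": trade,
            "Position": position,
            "Forecast": f,
            "Price": p,
        },
        index=data.index,
    )
```

The recurrence still needs a loop, because each row's trade depends on the previous row's capital, but this version avoids calling `self.curve.shift(1)` (which builds a shifted copy of the whole frame) and `.loc` on every iteration. Whether that is enough for the speedup you need is something to measure on your own data. Usage would look like `account_curve(forecasts[16], prices)`, assuming `forecasts` is the DataFrame returned by `ewmac`.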