I have been working on a stock trading algorithm using an LSTM model. The algorithm fetches real-time data, makes predictions, and decides whether to buy or sell stocks based on the predicted price. Here is the code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from keras.layers import LSTM, Dense
import time
import requests
# Define the stock symbol and your Alpha Vantage API key
symbol = 'AAPL'
api_key = 'MY_API'
# Define the initial capital
initial_capital = 10000
capital = initial_capital
# Brokerage fee per stock
brokerage_fee_per_stock = 0.02
# Minimum brokerage fee
min_brokerage_fee = 18
# SEC tax rate
sec_tax_rate = 0.0000278
# Create a dataframe to store the capital and actions over time
capital_df = pd.DataFrame(columns=['time', 'capital', 'action', 'price'])
capital_df.loc[0] = {'time': pd.Timestamp.now(), 'capital': capital, 'action': 'Initial', 'price': 0} # Initialize with initial capital
# Define lookback period
lookback = 60
# Define the LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(lookback, 1)))
model.add(LSTM(units=50))
model.add(Dense(1))
# Compile the LSTM model
model.compile(loss='mean_squared_error', optimizer='adam')
while True:
    # Fetch real-time data using Alpha Vantage API
    url = f'https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol={symbol}&interval=1min&apikey={api_key}'
    response = requests.get(url)
    data = response.json()
    current_price = float(data['Time Series (1min)'][list(data['Time Series (1min)'].keys())[0]]['4. close'])
    # Preprocess the data
    scaler = MinMaxScaler(feature_range=(0, 1))
    price_data = np.array([current_price for _ in range(lookback)]).reshape(-1, 1)
    scaled_data = scaler.fit_transform(price_data)
    # Prepare inputs for the LSTM model
    inputs = scaled_data.reshape(1, lookback, 1)
    # Fit the LSTM model
    model.fit(inputs, np.array([current_price]), epochs=1, batch_size=1, verbose=2)
    # Make predictions using the trained LSTM model
    predicted_price = model.predict(inputs)
    predicted_price = scaler.inverse_transform(predicted_price)[0][0]
    # Calculate the number of stocks to buy or sell
    num_stocks = capital // current_price
    # Calculate the total brokerage fee for buying and selling
    total_brokerage_fee = max(brokerage_fee_per_stock * num_stocks, min_brokerage_fee) * 2
    # Calculate the SEC tax
    sec_tax = sec_tax_rate * predicted_price * num_stocks
    # Update capital based on predicted price and record the action
    if predicted_price > current_price:
        capital += num_stocks * current_price - total_brokerage_fee - sec_tax
        action = 'Buy'
    else:
        capital -= num_stocks * current_price + total_brokerage_fee + sec_tax
        action = 'Sell'
    # Append the current capital and action to the dataframe
    new_row = {'time': pd.Timestamp.now(), 'capital': capital, 'action': action, 'price': current_price}
    capital_df = pd.concat([capital_df, pd.DataFrame([new_row])], ignore_index=True)
    # Plot the capital over time
    plt.figure(figsize=(10, 5))
    plt.plot(capital_df['time'], capital_df['capital'])
    plt.title('Capital Over Time')
    plt.xlabel('Time')
    plt.ylabel('Capital')
    plt.xticks(rotation=45)
    plt.gca().xaxis.set_major_locator(plt.MaxNLocator(10)) # Show 10 ticks
    plt.show()
    print(f'Current earning: {capital - initial_capital}, Action: {action} at price: {current_price}')
    # Wait for 1 minute
    time.sleep(60)
I have a few questions regarding this:
- Weird Plot: The plot of capital over time looks weird. It is literally just a straight line. I'm not sure why this is happening, but I suspect some statistical issue is involved. Could it be due to the way I'm updating and plotting the capital?
[Screenshots: the weird plot in the first, second, third, fourth, fifth, and sixth minute]
The weird thing is that the plot does not seem to follow the "investment rule". Intuitively, the plot should start from 10000 and fluctuate a lot, rather than form a single straight line.
(Update: I'm seeing that the capital doubles within the same minute, which seems very strange. I understand that this is a simplified simulation and doesn't take into account many factors present in real-world trading, but I'm still puzzled by this result.
Also, the x-axis of the plot, which represents time, shows the same label for all ticks. I believe this is because the loop is running so quickly that all actions within a single loop iteration are getting nearly the same timestamp, down to the minute.
Here are my questions:
- What could be causing the capital to double within the same minute?
- How can I modify the code so that the x-axis labels of the plot show the correct time for each action?
)
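To illustrate the x-axis question, here is a minimal sketch using only the standard library. The timestamps are made up for illustration. It shows that a label format that stops at the minute produces identical labels for points recorded within the same minute, while a format that includes seconds keeps them distinct. As far as I know, matplotlib's DateFormatter accepts the same strftime-style format codes, so the same idea should carry over to the plot.

```python
from datetime import datetime, timedelta

# Hypothetical timestamps: four rows recorded 12 seconds apart, all inside 10:15
start = datetime(2023, 7, 3, 10, 15, 2)
stamps = [start + timedelta(seconds=12 * i) for i in range(4)]

# Minute resolution: every label collapses to the same string
minute_labels = [t.strftime('%Y-%m-%d %H:%M') for t in stamps]

# Second resolution: each label stays distinct
second_labels = [t.strftime('%H:%M:%S') for t in stamps]

print(minute_labels)
print(second_labels)
```

If the rows really are a full minute apart, the minute format is enough; the seconds format only matters when several rows land in the same minute.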
- Exact Time and Action: I want my code to report the exact time and the exact action (buy/sell) that the LSTM model decides on. How can I achieve this?
- Speeding Up Code Execution: Each loop iteration needs to finish in under a minute so that the code can update minute by minute. However, it currently takes more than a minute. Are there any ways to optimize the code to make it run faster?
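For the speed question, one way to find the slow part is to time each stage of an iteration separately and then sleep only for whatever is left of the minute. The sketch below is a rough illustration, not a measurement of the real loop: time_stage and seconds_to_wait are hypothetical helper names, and the two lambdas are stand-ins for the API request and the model fit/predict steps.

```python
import time


def time_stage(label, func, timings):
    """Run func(), store its wall-clock duration under label, and return its result."""
    started = time.perf_counter()
    result = func()
    timings[label] = time.perf_counter() - started
    return result


def seconds_to_wait(elapsed, period=60.0):
    """Return how long to sleep so that one iteration lasts `period` seconds in total."""
    return max(period - elapsed, 0.0)


timings = {}
# Stand-ins for the real stages (assumptions, not the actual API call or model)
price = time_stage('fetch', lambda: 190.0, timings)
time_stage('fit_and_predict', lambda: sum(range(100_000)), timings)

slowest = max(timings, key=timings.get)
total = sum(timings.values())
print(f'slowest stage: {slowest}; remaining wait: {seconds_to_wait(total):.2f}s')
```

In the real loop, each stage (request, fit, predict, plot) would be wrapped the same way, and time.sleep(seconds_to_wait(total)) would replace the fixed time.sleep(60).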
Update: With @furas's help, here is my revised code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from keras.layers import LSTM, Dense
import time
import requests
# Define the stock symbol and your Alpha Vantage API key
symbol = 'AAPL'
api_key = 'MY_API'
# Define the initial capital
initial_capital = 10000
capital = initial_capital
# Brokerage fee per stock
brokerage_fee_per_stock = 0.02
# Minimum brokerage fee
min_brokerage_fee = 18
# SEC tax rate
sec_tax_rate = 0.0000278
# Create a dataframe to store the capital and actions over time
capital_df = pd.DataFrame(columns=['time', 'capital', 'action', 'price'])
capital_df.loc[0] = {'time': pd.Timestamp.now(), 'capital': capital, 'action': 'Initial', 'price': 0} # Initialize with initial capital
# Define lookback period
lookback = 60
# Define the LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(lookback, 1)))
model.add(LSTM(units=50))
model.add(Dense(1))
# Compile the LSTM model
model.compile(loss='mean_squared_error', optimizer='adam')
# Create the plot before the loop
plt.figure(figsize=(10, 5))
plt.title('Capital Over Time')
plt.xlabel('Time')
plt.ylabel('Capital')
plt.xticks(rotation=45)
plt.gca().xaxis.set_major_locator(plt.MaxNLocator(10)) # Show 10 ticks
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d %H:%M')) # Format the time up to minutes
# Initialize a flag for the first iteration and the action
first_iteration = True
action = 'Initial'
while True:
    start_time = time.time() # Start the timer
    # Fetch real-time data using Alpha Vantage API
    url = f'https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol={symbol}&interval=1min&apikey={api_key}'
    response = requests.get(url)
    data = response.json()
    current_price = float(data['Time Series (1min)'][list(data['Time Series (1min)'].keys())[0]]['4. close'])
    # Preprocess the data
    scaler = MinMaxScaler(feature_range=(0, 1))
    price_data = np.array([current_price for _ in range(lookback)]).reshape(-1, 1)
    scaled_data = scaler.fit_transform(price_data)
    # Prepare inputs for the LSTM model
    inputs = scaled_data.reshape(1, lookback, 1)
    # Fit the LSTM model
    model.fit(inputs, np.array([current_price]), epochs=1, batch_size=1, verbose=2)
    # Make predictions using the trained LSTM model
    predicted_price = model.predict(inputs)
    predicted_price = scaler.inverse_transform(predicted_price)[0][0]
    # Calculate the number of stocks to buy or sell
    num_stocks = capital // current_price
    # Calculate the total brokerage fee for buying and selling
    total_brokerage_fee = max(brokerage_fee_per_stock * num_stocks, min_brokerage_fee) * 2
    # Calculate the SEC tax
    sec_tax = sec_tax_rate * predicted_price * num_stocks
    # Update capital based on predicted price and record the action
    if first_iteration:
        first_iteration = False
    elif predicted_price > current_price:
        capital += num_stocks * current_price - total_brokerage_fee - sec_tax
        action = 'Buy'
    else:
        capital -= num_stocks * current_price + total_brokerage_fee + sec_tax
        action = 'Sell'
    # Append the current capital and action to the dataframe
    new_row = {'time': pd.Timestamp.now(), 'capital': capital, 'action': action, 'price': current_price}
    capital_df = pd.concat([capital_df, pd.DataFrame([new_row])], ignore_index=True)
    # Update the plot data without creating it all again
    plt.plot(capital_df['time'], capital_df['capital'])
    plt.gcf().autofmt_xdate() # Autoformat the time label for better display
    plt.draw()
    plt.pause(0.01) # Pause for the plot to update
    print(f'Current earning: {capital - initial_capital}, Action: {action} at price: {current_price}')
    # Calculate the elapsed time and wait for the remaining time to complete 60 seconds
    elapsed_time = time.time() - start_time
    time.sleep(max(60 - elapsed_time, 0))
Here are my questions now:
What could be causing the capital to double starting from the second minute?
How can I modify the code to prevent this from happening?
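To make the doubling concrete, the update rule from the loop can be replayed by hand on made-up numbers. The sketch below copies the fee, tax, and capital formulas from the code above into a hypothetical helper, apply_update. The prices 190.0, 191.0, and 189.0 are assumptions chosen only for illustration, not real quotes.

```python
BROKERAGE_FEE_PER_STOCK = 0.02
MIN_BROKERAGE_FEE = 18
SEC_TAX_RATE = 0.0000278


def apply_update(capital, current_price, predicted_price):
    """Replay one iteration of the capital update exactly as written in the loop."""
    num_stocks = capital // current_price
    total_brokerage_fee = max(BROKERAGE_FEE_PER_STOCK * num_stocks, MIN_BROKERAGE_FEE) * 2
    sec_tax = SEC_TAX_RATE * predicted_price * num_stocks
    if predicted_price > current_price:
        # 'Buy' branch: the full position value is ADDED to capital
        capital += num_stocks * current_price - total_brokerage_fee - sec_tax
        return capital, 'Buy'
    # 'Sell' branch: the full position value is SUBTRACTED from capital
    capital -= num_stocks * current_price + total_brokerage_fee + sec_tax
    return capital, 'Sell'


# Hypothetical prices: a prediction slightly above, then slightly below, the quote
after_buy, buy_action = apply_update(10000, 190.0, 191.0)
after_sell, sell_action = apply_update(10000, 190.0, 189.0)
print(buy_action, round(after_buy, 2))
print(sell_action, round(after_sell, 2))
```

With these numbers, 52 shares at 190.0 are worth 9880, so a single 'Buy' step pushes capital from 10000 to a little under 19844, and a single 'Sell' step drops it to under 100. In other words, the rule as written treats buying as receiving the purchase amount instead of exchanging cash for shares, which would be consistent with capital roughly doubling on every 'Buy'.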
I also tried another API, IEX Cloud, and confirmed that the issue is not caused by the API:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from keras.layers import LSTM, Dense
import time
import requests
# Define the stock symbol and your IEX Cloud API key
symbol = 'AAPL'
api_key = 'MY_API' # replace with your own API key
# Define the initial capital
initial_capital = 10000
capital = initial_capital
# Brokerage fee per stock
brokerage_fee_per_stock = 0.02
# Minimum brokerage fee
min_brokerage_fee = 18
# SEC tax rate
sec_tax_rate = 0.0000278
# Create a dataframe to store the capital and actions over time
capital_df = pd.DataFrame(columns=['time', 'capital', 'action', 'price'])
capital_df.loc[0] = {'time': pd.Timestamp.now(), 'capital': capital, 'action': 'Initial', 'price': 0} # Initialize with initial capital
# Define lookback period
lookback = 60
# Define the LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(lookback, 1)))
model.add(LSTM(units=50))
model.add(Dense(1))
# Compile the LSTM model
model.compile(loss='mean_squared_error', optimizer='adam')
# Create the plot before the loop
plt.figure(figsize=(10, 5))
plt.title('Capital Over Time')
plt.xlabel('Time')
plt.ylabel('Capital')
plt.xticks(rotation=45)
plt.gca().xaxis.set_major_locator(plt.MaxNLocator(10)) # Show 10 ticks
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d %H:%M')) # Format the time up to minutes
# Initialize a flag for the first iteration and the action
first_iteration = True
action = 'Initial'
while True:
    start_time = time.time() # Start the timer
    print(f'Start time: {start_time}') # Print the start time
    # Fetch real-time data using IEX Cloud API
    url = f'https://cloud.iexapis.com/stable/stock/{symbol}/quote?token={api_key}'
    response = requests.get(url)
    data = response.json()
    current_price = float(data['latestPrice'])
    # Preprocess the data
    scaler = MinMaxScaler(feature_range=(0, 1))
    price_data = np.array([current_price for _ in range(lookback)]).reshape(-1, 1)
    scaled_data = scaler.fit_transform(price_data)
    # Prepare inputs for the LSTM model
    inputs = scaled_data.reshape(1, lookback, 1)
    # Fit the LSTM model
    model.fit(inputs, np.array([current_price]), epochs=1, batch_size=1, verbose=2)
    # Make predictions using the trained LSTM model
    predicted_price = model.predict(inputs)
    predicted_price = scaler.inverse_transform(predicted_price)[0][0]
    # Calculate the number of stocks to buy or sell
    num_stocks = capital // current_price
    # Calculate the total brokerage fee for buying and selling
    total_brokerage_fee = max(brokerage_fee_per_stock * num_stocks, min_brokerage_fee) * 2
    # Calculate the SEC tax
    sec_tax = sec_tax_rate * predicted_price * num_stocks
    # Update capital based on predicted price and record the action
    if first_iteration:
        first_iteration = False
    elif predicted_price > current_price:
        capital += num_stocks * current_price - total_brokerage_fee - sec_tax
        action = 'Buy'
    else:
        capital -= num_stocks * current_price + total_brokerage_fee + sec_tax
        action = 'Sell'
    print(f'Action: {action}, Capital: {capital}') # Print the action and the updated capital
    # Append the current capital and action to the dataframe
    new_row = {'time': pd.Timestamp.now(), 'capital': capital, 'action': action, 'price': current_price}
    capital_df = pd.concat([capital_df, pd.DataFrame([new_row])], ignore_index=True)
    # Update the plot data without creating it all again
    plt.plot(capital_df['time'], capital_df['capital'])
    plt.gcf().autofmt_xdate() # Autoformat the time label for better display
    plt.draw()
    plt.pause(0.01) # Pause for the plot to update
    print(f'Current earning: {capital - initial_capital}, Action: {action} at price: {current_price}') # Print the current earning, action, and price
    # Calculate the elapsed time and wait for the remaining time to complete 60 seconds
    elapsed_time = time.time() - start_time
    print(f'Elapsed time: {elapsed_time}') # Print the elapsed time
    time.sleep(max(60 - elapsed_time, 0))
Any help would be greatly appreciated.
Thank you!
... data in the existing plot without creating it all again. ... time (or a profiler in the IDE) to see which part takes more time. You could also use it to get the full time it took and run wait(60-full_time)