I have the following function:

import math
import datetime as dt  # needed for dt.datetime below
import pandas as pd
import pandas_datareader as web
import numpy as np
import matplotlib.pyplot as plt
import os.path

from sklearn.preprocessing import MinMaxScaler

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.models import load_model

def predict_stock(stock_name, predict_days=30):
    start = dt.datetime(2021, 1, 1)
    end = dt.datetime.now()

    stock = web.DataReader(stock_name, data_source="yahoo", start=start, end=end)
    stock = stock.filter(["Adj Close"])
    stock_data = stock.values

    # splits the stock into training data and test data
    training_len = math.ceil(len(stock) - predict_days)

    scale = MinMaxScaler()
    scaled_data = scale.fit_transform(stock_data)
    train_data = scaled_data[:training_len]

    # sets train values
    x_train = []
    y_train = []

    # each input is a window of predict_days values; the target is the window that follows it
    for i in range(predict_days, len(train_data)):
        x_train.append(train_data[i - predict_days:i])
        y_train.append(train_data[i:i+predict_days])
    x_train = np.array(x_train)
    y_train = np.array(y_train)
    #y_train.reshape(y_train, x_train.shape)

predict_stock('ALB', 30)

x_train has shape (164, 30, 1), but y_train for some reason has shape (164,), even though both were generated in the same way.

How can I reshape y_train to (164,30,1)?

I tried the command:

 y_train.reshape(y_train, x_train.shape)

but this gives me the error:

TypeError: only integer scalar arrays can be converted to a scalar index

How can I reshape the array correctly?

  • It's totally normal that your y_train is (164,): it contains only the labels. Your x_train contains the features: 164 rows of 30 features each. Commented Nov 21, 2021 at 18:53
  • why are you using x_train as argument? Commented Nov 21, 2021 at 19:11
  • My aim is to give y_train the same shape as x_train Commented Nov 21, 2021 at 19:27
  • OK, that's what the x_train.shape argument was for. But why the whole x_train? Did you check the reshape docs to see what kinds of arguments it expected? Check the docs before running off to the web seeking help! Commented Nov 21, 2021 at 21:12
  • In reformatting the question I realized that you can't reshape y_train to match x_train. You can reshape it to (164,1), but you can't increase the total number of elements to match x_train. But if x_train has 164 "samples" and 30 "features", y_train shouldn't have that 30 dimension. It's just one value for each sample. Commented Nov 21, 2021 at 21:18

1 Answer


The basic flaw in your code is that you passed y_train as the first argument to reshape. Since reshape is a method of the y_train array, it takes only the target shape.
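As a minimal sketch of the correct call form, assuming a stand-in array in place of the real y_train (the values here are made up for illustration), only the target shape is passed to the method:

```python
import numpy as np

# Stand-in for y_train: 164 values (hypothetical data, not the real stock prices)
y_train = np.arange(164, dtype=float)

# reshape is called on the array, so the array itself is not an argument;
# only the target shape is passed
column = y_train.reshape(164, 1)

# -1 lets NumPy infer that dimension from the number of elements
inferred = y_train.reshape(-1, 1)

print(column.shape, inferred.shape)
```

The total number of elements must stay the same, which is why (164, 1) works here while (164, 30, 1) would not.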

But spotting this is not enough.

If you want to reshape y_train to the shape of x_train, then y_train must have the same number of elements as x_train. You can achieve this by calling e.g. np.repeat:

np.repeat(y_train, x_train.shape[1])

This "multiplies" the occurrences of each source element, but the result is still a 1D array.

The second step is to reshape.

So the whole code can be:

result = np.repeat(y_train, x_train.shape[1]).reshape(x_train.shape)

I intentionally saved the result in another array, in order to keep the source array for any comparison.
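To see what the repeat-then-reshape step produces, here is a small sketch with hypothetical stand-in arrays (4 samples and 3 steps instead of 164 and 30, with made-up values):

```python
import numpy as np

# Hypothetical stand-ins: x_train is (samples, steps, 1), y_train is (samples,)
x_train = np.zeros((4, 3, 1))
y_train = np.array([10.0, 20.0, 30.0, 40.0])

# Repeat each label once per step, then reshape to match x_train
repeated = np.repeat(y_train, x_train.shape[1])
result = repeated.reshape(x_train.shape)

print(repeated.shape)   # still 1D at this point
print(result.shape)
print(result[:, :, 0])  # each row holds one label repeated across the steps
```

Note that every sample's label is simply copied across the second axis, so the result carries no more information than the original 1D array.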

But also consider another approach, which probably matches machine learning practice better:

It is probably enough to convert y_train to a single-column shape. So try:

result2 = np.expand_dims(y_train, 1)

This time the shape of the result is (164, 1).
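As a quick check, assuming a stand-in 1D array of 164 made-up values, np.expand_dims gives the same result as two other common spellings:

```python
import numpy as np

# Stand-in for y_train: 164 values (hypothetical data)
y_train = np.arange(164, dtype=float)

result2 = np.expand_dims(y_train, 1)   # add a new axis at position 1
via_reshape = y_train.reshape(-1, 1)   # same result using reshape
via_newaxis = y_train[:, np.newaxis]   # same result using indexing

print(result2.shape, via_reshape.shape, via_newaxis.shape)
```

Which spelling to use is a matter of taste; none of them changes the number of elements.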
