
My input is a CSV file from which I made segments of about 400 samples each, with 3 features (x, y, z). First, I applied a 2D CNN using model.add(Conv2D(16, (2, 2), activation = 'relu', input_shape = x_train[0].shape)). It worked perfectly; however, with an LSTM the input raised errors. So I changed the input layer to model.add(LSTM(32, input_shape = (400,3), return_sequences=True)). That line worked, but then model.fit failed. Please find the code and error below:

x_train.shape, x_test.shape 

output of above code: ((836, 400, 3), (209, 400, 3))

x_train = x_train.reshape(836, 400, 3, 1)   
x_test = x_test.reshape(209, 400, 3, 1)

x_train[0].shape  #output of this line: (400, 3, 1)


model = Sequential()     
model.add(LSTM(32, input_shape = (400,3), return_sequences=True))

model.add(Dropout(0.5)) 
model.add(Dense(100, activation='relu')) 
model.add(Flatten())
# Then here we have the dense layers
model.add(Dense(64, activation= 'relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(3, activation='softmax'))
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs = 10, validation_data = (x_test, y_test), verbose=1) 

ERROR

    ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-109-3ffd974b58e0> in <module>
      1 #Record this model tranning into a history
      2 
----> 3 history = model.fit(x_train, y_train, epochs = 10, validation_data = (x_test, y_test), verbose=1)
      4 #Below here you can see xthe training, here at the very first step 75% traning accuracy and 84% validation accuracy, After 10
      5 #epoc you see 91% of traning accuracy and 87% validaton accuracy, (As a complement, with accelrometer data, this is very good

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    726         max_queue_size=max_queue_size,
    727         workers=workers,
--> 728         use_multiprocessing=use_multiprocessing)
    729 
    730   def evaluate(self,

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, **kwargs)
    222           validation_data=validation_data,
    223           validation_steps=validation_steps,
--> 224           distribution_strategy=strategy)
    225 
    226       total_samples = _get_total_number_of_samples(training_data_adapter)

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _process_training_inputs(model, x, y, batch_size, epochs, sample_weights, class_weights, steps_per_epoch, validation_split, validation_data, validation_steps, shuffle, distribution_strategy, max_queue_size, workers, use_multiprocessing)
    545         max_queue_size=max_queue_size,
    546         workers=workers,
--> 547         use_multiprocessing=use_multiprocessing)
    548     val_adapter = None
    549     if validation_data:

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _process_inputs(model, x, y, batch_size, epochs, sample_weights, class_weights, shuffle, steps, distribution_strategy, max_queue_size, workers, use_multiprocessing)
    592         batch_size=batch_size,
    593         check_steps=False,
--> 594         steps=steps)
    595   adapter = adapter_cls(
    596       x,

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split, shuffle, extract_tensors_from_dataset)
   2470           feed_input_shapes,
   2471           check_batch_axis=False,  # Don't enforce the batch size.
-> 2472           exception_prefix='input')
   2473 
   2474     # Get typespecs for the input data and sanitize it if necessary.

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    563                            ': expected ' + names[i] + ' to have ' +
    564                            str(len(shape)) + ' dimensions, but got array '
--> 565                            'with shape ' + str(data_shape))
    566         if not check_batch_axis:
    567           data_shape = data_shape[1:]

ValueError: Error when checking input: expected lstm_16_input to have 3 dimensions, but got array with shape (836, 400, 3, 1)

Any idea to solve this problem?


1 Answer


The input shape of an LSTM is batch_size X time_steps X input_size (when batch-first); i.e., the LSTM/recurrent network is unrolled time_steps times for each sample, and each unrolled step receives an input of size input_size.
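To make that shape contract concrete, here is a minimal sketch with randomly generated data (the names model_demo and dummy are just for illustration) that prints the output shape an LSTM produces for a 3-D input:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

model_demo = Sequential()
model_demo.add(LSTM(32, input_shape=(400, 3), return_sequences=True))

dummy = np.random.randn(8, 400, 3)     # batch of 8 sequences, 400 time steps, 3 features
print(model_demo.predict(dummy).shape)  # -> (8, 400, 32): one 32-dim vector per time step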

Let's look at your model architecture:

Layer (type)                 Output Shape              Param #   
=================================================================
lstm_3 (LSTM)                (None, 400, 32)           4608      
_________________________________________________________________
dropout_5 (Dropout)          (None, 400, 32)           0         
_________________________________________________________________
dense_7 (Dense)              (None, 400, 100)          3300      
_________________________________________________________________
flatten_3 (Flatten)          (None, 40000)             0         
_________________________________________________________________
dense_8 (Dense)              (None, 64)                2560064   
_________________________________________________________________
dropout_6 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_9 (Dense)              (None, 3)                 195       
=================================================================
Total params: 2,568,167
Trainable params: 2,568,167
Non-trainable params: 0

The input to the LSTM has shape batch_size X 400 X 3 and its output has shape batch_size X 400 X 32 (since return_sequences is True). So you have to pass your 836 training samples, each of length 400 with 3 features (x, y, z), to the LSTM. You can reshape your inputs by squeezing out the last dimension.

Code

import numpy as np
from keras.layers import Dropout, Flatten, Dense, LSTM
from keras.models import Sequential
from keras.optimizers import Adam

x_train = np.random.randn(836, 400, 3, 1).squeeze() # This will reshape to (836, 400, 3)
x_test  = np.random.randn(209, 400, 3, 1).squeeze() # This will reshape to (209, 400, 3)

y_train = np.random.randint(0, 3, size=(836))  # dummy integer labels for the 3 classes
y_test  = np.random.randint(0, 3, size=(209))

model = Sequential()     
model.add(LSTM(32, input_shape = (400,3), return_sequences=True))

model.add(Dropout(0.5)) 
model.add(Dense(100, activation='relu')) 
model.add(Flatten())
# Then here we have the dense layers
model.add(Dense(64, activation= 'relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(3, activation='softmax'))

model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs = 2, verbose=1, validation_data = (x_test, y_test)) 

Output

Train on 836 samples, validate on 209 samples
Epoch 1/2
836/836 [==============================] - 6s 7ms/step - loss: 1.1725 - accuracy: 0.3469 - val_loss: 1.0996 - val_accuracy: 0.3301
Epoch 2/2
836/836 [==============================] - 5s 6ms/step - loss: 1.0893 - accuracy: 0.3947 - val_loss: 1.1026 - val_accuracy: 0.2727
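As a design note, on the original arrays the same fix can be written with an explicit axis, which fails loudly if the dimension being squeezed is not of size 1 (a sketch assuming the x_train/x_test arrays from the question):

x_train = np.squeeze(x_train, axis=-1)  # (836, 400, 3, 1) -> (836, 400, 3)
x_test  = np.squeeze(x_test, axis=-1)   # (209, 400, 3, 1) -> (209, 400, 3)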

2 Comments

It is working perfectly, thank you so much! However, I am at a basic level with CNNs, so I could not completely understand the model architecture, specifically the parameter calculation. Would you please explain? Also, in the output shape, what does "None" indicate?
You basically start with well-known architectures like VGG, ResNet, etc., based on the problem's complexity. Hyperparameter tuning is mostly an empirical problem; try out different parameters. None (the first dimension) indicates the batch size.
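Regarding the parameter calculation asked about above, the counts in the model summary can be reproduced with the standard formulas for these layer types; a quick sketch:

# LSTM: 4 gates, each with (input_dim + units) weights plus one bias, per unit.
lstm_params = 4 * (3 + 32 + 1) * 32   # 4608
# Dense(100) applied to each time step: 32*100 weights + 100 biases.
dense_100 = 32 * 100 + 100            # 3300
# Flatten has no parameters; it yields 400*100 = 40000 values per sample.
dense_64 = 40000 * 64 + 64            # 2560064
dense_3 = 64 * 3 + 3                  # 195
print(lstm_params + dense_100 + dense_64 + dense_3)  # 2568167, matching the summary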
