
When I run the following code:

from keras import models
from keras import layers
from keras import optimizers
model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_shape = (4, 4, 512)))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer=optimizers.RMSprop(lr=2e-5),
                loss='binary_crossentropy',
                metrics=['acc'])
model.summary()
history = model.fit(train_features, train_labels,
                    epochs=30,
                    batch_size=20,
                    validation_data=(validation_features, validation_labels))

I get this error:

ValueError: Error when checking input: expected dense_40_input to have 2 dimensions, but got array with shape (2000, 4, 4, 512)

Here is the shape of training and validation data:

print(train_features.shape, train_labels.shape, validation_features.shape, validation_labels.shape)

Output:

(2000, 4, 4, 512) (2000,) (1000, 4, 4, 512) (1000,)

What's happening here? The shapes of my training and validation data match what I just specified. Even when I change to input_dim = 4*4*512 I still get an error.

Output of model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_42 (Dense)             (None, 4, 4, 256)         131328    
_________________________________________________________________
dropout_19 (Dropout)         (None, 4, 4, 256)         0         
_________________________________________________________________
dense_43 (Dense)             (None, 4, 4, 1)           257       
=================================================================
Total params: 131,585
Trainable params: 131,585
Non-trainable params: 0
_________________________________________________________________

My Keras version is 2.1.6.


2 Answers


As you can see in the model summary, the output shape of the last layer is (None, 4, 4, 1). Since you have a single label per sample, the output shape of the last layer should instead be (None, 1). So you must either reshape the training data before feeding it to the network, or flatten the output of the first Dense layer (or perhaps add a Reshape layer as the first layer).
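The 4-D output shape can be reproduced with plain NumPy (a sketch of the underlying dot product, not Keras itself): when a Dense layer receives input of rank greater than 2, its kernel is contracted against the last axis only, so the leading 4×4 axes pass through unchanged.

```python
import numpy as np

# One sample of the extracted features: shape (4, 4, 512)
x = np.random.rand(4, 4, 512)

# Stand-in for a Dense(256) kernel: maps the last axis 512 -> 256
w = np.random.rand(512, 256)

# np.dot contracts x's last axis with w's first axis, which mirrors
# how the Dense kernel is applied to inputs of rank > 2
y = np.dot(x, w)
print(y.shape)  # (4, 4, 256) -- matches the model summary above
```

This is exactly why the summary shows (None, 4, 4, 256) after the first Dense layer rather than (None, 256).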

Approach 1) Reshaping the training and validation data:

train_features = train_features.reshape((2000, -1))
validation_features = validation_features.reshape((1000, -1))

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=train_features.shape[-1]))
# ... the rest is the same
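For completeness, here is a self-contained sketch of the reshape step, using dummy NumPy arrays in place of the question's feature arrays (the real train_features/validation_features would come from the feature-extraction step):

```python
import numpy as np

# Dummy stand-ins for the extracted feature arrays in the question
train_features = np.zeros((2000, 4, 4, 512))
validation_features = np.zeros((1000, 4, 4, 512))

# Collapse everything after the sample axis: 4 * 4 * 512 = 8192.
# The -1 tells NumPy to infer that dimension automatically.
train_features = train_features.reshape((2000, -1))
validation_features = validation_features.reshape((1000, -1))

print(train_features.shape)       # (2000, 8192)
print(validation_features.shape)  # (1000, 8192)
```

With the features in this 2-D form, input_dim=train_features.shape[-1] resolves to 8192 and the first Dense layer receives the rank-2 input it expects.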

Approach 2) Adding a Flatten layer:

model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_shape = (4, 4, 512)))
model.add(layers.Flatten())
# ... the rest is the same
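To see why this fixes the output shape, a bit of shape arithmetic (plain Python, assuming the layer sizes above): Flatten collapses the (4, 4, 256) output of the first Dense layer into a 4*4*256 = 4096-dimensional vector, so the final Dense(1) then produces shape (None, 1), with 4096 weights plus one bias.

```python
# Shape arithmetic for approach 2 -- no Keras needed
dense_out = (4, 4, 256)        # output of Dense(256) applied to (4, 4, 512) input
flat = 4 * 4 * 256             # Flatten -> 4096 features per sample
final_params = flat * 1 + 1    # Dense(1): 4096 weights + 1 bias = 4097
print(flat, final_params)      # 4096 4097
```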

I recommend the first approach (unless you have good reasons for choosing the second), since according to the Dense layer documentation, input of rank greater than 2 (i.e. 3D, 4D, etc.) is flattened before the dot product is applied. Given that the second approach applies another flatten operation on top of that, it may be less efficient than feeding the layer a 2D tensor directly (though I have not confirmed this myself; it is just a guess!). It turns out the documentation is wrong here, however: the input of a Dense layer is not flattened; rather, the kernel is applied along the last axis.


As a side note: the error you got is a bit strange. I didn't get it when running your code on my machine; instead, I got an error complaining that the output shape of the last layer was not compatible with the shape of the labels (which is the issue addressed above).


9 Comments

Thank you! But why would adding a Flatten layer after the first layer help?
Also, why is converting a (2000, 4, 4, 512) feature array to (2000, -1) a good idea?
@ZhangXianhan Because you want to predict one value (i.e. one label) for each sample. Therefore the input of the last Dense layer should have the form (?, dim) for it to predict a single value; otherwise it would output a tensor of shape (4, 4, 1) for each training sample (as you can see in the model summary). That is not what you are looking for: you want it to output only one value (i.e. shape (1,)). Further, I am reshaping the training data from (2000, 4, 4, 512) to (2000, 4*4*512) (the -1 tells NumPy to infer that dimension automatically). That way we don't need to use Flatten at all.
Got you: in order to use the final sigmoid function, the input can't have multiple dimensions. Thank you for the kind help!
@ZhangXianhan No, it is not related to the sigmoid function (it can be applied to any tensor, since it is computed element-wise). It is related to the shape of the labels. You have labels of shape (2000, 1), so the output of the last layer should also be of shape (2000, 1), not (2000, 4, 4, 1) (ignore the 2000; you should really read it as the batch size, I just used it for demonstration). In other words, if your labels had shape (2000, 4, 4, 1), everything would line up and your initial network would work without problems.

Before using Dense(), you should call Flatten(). The error is clear: a Dense layer expects 2-dimensional input, but you fitted it with a 4-dimensional array.

model = models.Sequential()
model.add(layers.Flatten(input_shape=(4, 4, 512)))
model.add(layers.Dense(256, activation='relu'))

