PyTorch RuntimeError Invalid argument 2 of size

Question

I am experimenting with a neural network (PyTorch) and I get this error.

RuntimeError: invalid argument 2: size '[32 x 9216]' is invalid for input with 8192 elements at /pytorch/aten/src/TH/THStorage.cpp:84

My task is image classification with AlexNet, and I have traced the error to the size of the images supplied to the network. My question is: given the network architecture and its parameters, how does one determine the image size the network requires?

As the code below shows, I transform the training images before feeding them into the network. I noticed that the network only accepts images of size 224; any other size produces the error above. For instance, my instinct was to apply transforms.RandomResizedCrop with size 64, but apparently this is wrong. Is there a formula to determine the required size?

Code

import torch.nn as nn
from torchvision import transforms

# transformations applied to the training images
transform_train = transforms.Compose([
    transforms.RandomResizedCrop(64),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

class AlexNet(nn.Module):

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x

Rex Low · Accepted Answer · 2018-10-12 13:43:47Z

2

I have figured out the formula for getting the right input size.

Out = floor((W - F + 2P) / S) + 1

where

Out = Output size (height or width of the feature map)
W = Input size (height or width of the image)
F = Receptive field (filter size)
P = Padding
S = Stride

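Plugging in the hyperparameters of the first convolution layer (F = 11, S = 4, P = 2) and its expected output size of 55, then solving for W, the required image size is

W = (Out - 1) * S - 2P + F
  = (55 - 1) * 4 - 2(2) + 11
  = 223
  ≈ 224

Because the division is rounded down, an input of 224 also gives Out = 55.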
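
To see where the 8192 in the error message comes from, the same formula can be applied layer by layer. The following is only a minimal sketch, not part of the original code: the helper names `out_size` and `final_feature_size` and the `ALEXNET_FEATURES` table are hypothetical, the (kernel, stride, padding) values are copied from the `features` block in the question, and it assumes square inputs, PyTorch's default round-down behaviour for `Conv2d` and `MaxPool2d`, and the batch size of 32 shown in the error message.

```python
# Minimal sketch: trace the spatial size (height == width) through the
# `features` stack from the question using floor((W - F + 2P) / S) + 1.

def out_size(w, f, s=1, p=0):
    """Output size of one conv or max-pool layer along one dimension."""
    return (w - f + 2 * p) // s + 1

# (kernel size, stride, padding) for each conv / pool layer, in order.
# ReLU layers are omitted because they do not change the shape.
ALEXNET_FEATURES = [
    (11, 4, 2),  # Conv2d(3, 64)
    (3, 2, 0),   # MaxPool2d
    (5, 1, 2),   # Conv2d(64, 192)
    (3, 2, 0),   # MaxPool2d
    (3, 1, 1),   # Conv2d(192, 384)
    (3, 1, 1),   # Conv2d(384, 256)
    (3, 1, 1),   # Conv2d(256, 256)
    (3, 2, 0),   # MaxPool2d
]

def final_feature_size(image_size):
    """Side length of the feature map that reaches the classifier."""
    size = image_size
    for f, s, p in ALEXNET_FEATURES:
        size = out_size(size, f, s, p)
    return size

for image_size in (224, 64):
    side = final_feature_size(image_size)
    per_image = 256 * side * side  # 256 channels come out of the last conv
    print(image_size, "->", side, "x", side, "=", per_image, "features per image")
```

With a 224-pixel input the final feature map is 6 x 6, giving the 256 * 6 * 6 = 9216 values that the first `nn.Linear` expects. With a 64-pixel crop it shrinks to 1 x 1, so a batch of 32 holds only 32 * 256 = 8192 elements, which is the number reported in the error.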

edited Oct 12, 2018 at 13:43

answered Oct 12, 2018 at 2:43

Akhilesh Pandey · Accepted Answer · 2018-10-12 05:30:46Z

1

The actual formula for the output size after a convolution layer is:

out_size = floor((in_size + 2p - f) / s + 1)

where p is the padding, f is the filter size, and s is the stride.
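
One consequence of the floor is that several input sizes map to the same output size. As a rough illustration (the function name `conv_out_size` is hypothetical, and the values f=11, s=4, p=2 are taken from the first `Conv2d` in the question), the sketch below lists the input sizes for which that layer produces a 55-wide output:

```python
import math

def conv_out_size(in_size, f, s, p):
    """Output size of a convolution along one dimension (floor formula)."""
    return math.floor((in_size + 2 * p - f) / s + 1)

# Input sizes between 200 and 249 for which the first AlexNet
# convolution (f=11, s=4, p=2) yields an output size of 55.
valid = [w for w in range(200, 250) if conv_out_size(w, f=11, s=4, p=2) == 55]
print(valid)  # [223, 224, 225, 226]
```

This is why solving the formula backwards can give 223 while the network is normally fed 224: both fall inside the same range.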

answered Oct 12, 2018 at 5:30
