Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch

Issue: The Jupyter kernel dies during inference with a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies as soon as I run inference on a test image. Here are the key details:

Error Message:

The Kernel crashed while executing code in the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.

Jupyter Logs: [error] Disposing session as kernel process died ExitCode: undefined, Reason:
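
Since the Jupyter log gives no reason for the crash, one way to get more detail might be Python's standard `faulthandler` module, which dumps the Python traceback when the process receives a fatal signal such as a segmentation fault. The following is a minimal sketch, not a confirmed fix; the log file name is a hypothetical choice, and it writes to a real file because a notebook's `sys.stderr` may not expose a usable file descriptor.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path would do.
crash_log_path = os.path.join(tempfile.gettempdir(), "kernel_crash_traceback.log")

# Keep the file object alive for the whole session: faulthandler writes to
# its file descriptor at the moment the fatal signal arrives.
crash_log = open(crash_log_path, "w")

# Dump the traceback of every thread if the process hits SIGSEGV, SIGABRT, etc.
faulthandler.enable(file=crash_log, all_threads=True)

print(f"faulthandler enabled: {faulthandler.is_enabled()}, log: {crash_log_path}")
```

Running this in the first cell and then re-running the inference cell should leave a traceback in the log file if the kernel dies from a fatal signal. Running the same code as a plain script with `python -X faulthandler script.py` is another option that avoids the notebook entirely.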

Observations:

Model Training: The model was trained with Quantization-Aware Training (QAT) and saved successfully.
Model Loading: The quantized model is loaded without any issues.

Code:

import torch
import torch.nn as nn
from PIL import Image
from torchvision import models, transforms

class QuantizedResNet101Classifier(nn.Module):
    def __init__(self, num_classes, pretrained=True):
        super().__init__()
        
        # Quantization stubs
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        
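        # Load pretrained ResNet101 (the `weights` argument replaces the deprecated `pretrained` flag)
        weights = models.ResNet101_Weights.IMAGENET1K_V1 if pretrained else None
        self.backbone = models.resnet101(weights=weights)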
        
        # Freeze backbone layers
        for param in self.backbone.parameters():
            param.requires_grad = False
        
        # Replace final fully connected layer
        num_ftrs = self.backbone.fc.in_features
        self.backbone.fc = nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes)
        )
    
    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = self.dequant(x)
        return x

def prepare_model_for_qat(model):
    model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
    torch.quantization.prepare_qat(model, inplace=True)
    return model

# Load the model
model = QuantizedResNet101Classifier(num_classes=5)
model.eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared_model = torch.quantization.prepare(model)
quantized_model = torch.quantization.convert(prepared_model)
quantized_state_dict = torch.load('trained_quantized_model.pth')
quantized_model.load_state_dict(quantized_state_dict)

# Test the model
image = Image.open("../image.png").convert('RGB')
preprocess = transforms.Compose([
    transforms.Resize((320, 320)),
    transforms.ToTensor(),
])

input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0).to(device) 
quantized_model.eval()

with torch.no_grad():
    output = quantized_model(input_batch)

probabilities = torch.nn.functional.softmax(output, dim=1)[0]
predicted_index = probabilities.argmax().item()
index_to_label = {v: k for k, v in custom_dataset.class_labels.items()}
predicted_label = index_to_label[predicted_index]
print(f"Predicted label: {predicted_label}")
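
The test code above moves the input with `.to(device)`, and the environment includes a CUDA GPU. The `fbgemm` backend provides CPU kernels, so one thing that might be worth ruling out is device placement. The sketch below is an assumption-laden check, not a diagnosis: `TinyQuantNet` is a hypothetical stand-in for the real classifier, and it only shows eager-mode static quantization running end to end with both the model and the input on the CPU.

```python
import torch
import torch.nn as nn


class TinyQuantNet(nn.Module):
    """Hypothetical stand-in for the real classifier, small enough to run quickly."""

    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.conv = nn.Conv2d(3, 4, kernel_size=3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))


# Pick a quantized engine that this PyTorch build actually supports.
engines = torch.backends.quantized.supported_engines
engine = "fbgemm" if "fbgemm" in engines else "qnnpack"
torch.backends.quantized.engine = engine

model = TinyQuantNet().eval()
model.qconfig = torch.quantization.get_default_qconfig(engine)
prepared = torch.quantization.prepare(model)
prepared(torch.rand(1, 3, 32, 32))  # one calibration pass so the observers see data
quantized = torch.quantization.convert(prepared)

# Keep the input on the CPU, matching the quantized model.
cpu_input = torch.rand(1, 3, 32, 32).to("cpu")
with torch.no_grad():
    out = quantized(cpu_input)

print(engine, out.shape, out.device)
```

If this small model runs but the real one still kills the kernel with a CPU input, the device is probably not the cause and the problem lies elsewhere in the model or checkpoint.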

Environment:

PyTorch version: 2.5.1
PyTorch CUDA version: 12.4
Python version: 3.12.7
Hardware: NVIDIA GeForce GTX 1650
Image Size: Images used are 320x320 RGB.

Debugging Attempts:

Verified the loaded state dictionary matches the model architecture.
Checked that the model can run in evaluation mode without quantization.
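
For reference, the state dictionary check can be made explicit with a small helper along these lines. This is only a sketch: `diff_state_dict_keys` is a hypothetical name, and it compares keys and, where values expose a `.shape`, shapes. Entries without a `.shape` (such as packed parameters) are compared by key only.

```python
from types import SimpleNamespace


def diff_state_dict_keys(model_state, checkpoint_state):
    """Return (missing, unexpected, shape_mismatches) between two state dicts."""
    model_keys = set(model_state)
    ckpt_keys = set(checkpoint_state)
    missing = sorted(model_keys - ckpt_keys)      # expected by the model, absent from the file
    unexpected = sorted(ckpt_keys - model_keys)   # present in the file, unknown to the model
    shape_mismatches = []
    for key in sorted(model_keys & ckpt_keys):
        expected = getattr(model_state[key], "shape", None)
        found = getattr(checkpoint_state[key], "shape", None)
        if expected != found:
            shape_mismatches.append((key, expected, found))
    return missing, unexpected, shape_mismatches


# Toy data standing in for real tensors; only the .shape attribute matters here.
model_state = {
    "fc.weight": SimpleNamespace(shape=(5, 512)),
    "fc.scale": SimpleNamespace(shape=()),
}
checkpoint_state = {
    "fc.weight": SimpleNamespace(shape=(4, 512)),
    "fc.extra": SimpleNamespace(shape=()),
}

missing, unexpected, shape_mismatches = diff_state_dict_keys(model_state, checkpoint_state)
print(missing, unexpected, shape_mismatches)
```

With the real objects, the call would be `diff_state_dict_keys(quantized_model.state_dict(), quantized_state_dict)`; three empty lists would mean the keys and shapes line up.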

Questions:

What could cause the kernel to crash during inference with the quantized model?
Are there any specific debugging steps or configurations I should check for quantized inference?

Any insights or suggestions would be greatly appreciated!

asked Dec 13, 2024 at 5:28

Pavan Pandya
