
Issue: I am encountering a kernel crash during inference with a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies when I run inference on a test image. Here are the key details:

Error Message:

The Kernel crashed while executing code in the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details.

Jupyter Logs: [error] Disposing session as kernel process died ExitCode: undefined, Reason:
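
The log above gives no reason for the crash. One way to get more detail, offered as a minimal sketch rather than a confirmed fix, is to enable Python's standard-library `faulthandler` before running the inference cell, so that a fatal signal such as a segmentation fault dumps the Python traceback to a file. The helper name `enable_crash_log` and the file name `crash_traceback.log` are placeholders chosen for illustration.

```python
import faulthandler


def enable_crash_log(path="crash_traceback.log"):
    """Dump Python tracebacks to `path` if the process receives a fatal signal.

    Returns the open file object; it must stay open while faulthandler is
    enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical usage: call once at the top of the notebook, before inference.
crash_log = enable_crash_log()
```

If the kernel then dies from a fault inside native code, the file should show which Python line was executing at that moment. It will not help if the process is killed from outside (for example by the operating system's out-of-memory killer), since no handler runs in that case.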

Observations:

  • Model Training: The model was trained with Quantization-Aware Training (QAT) and saved successfully.
  • Model Loading: The quantized model is loaded without any issues.

Code:

import torch
import torch.nn as nn
from PIL import Image
from torchvision import models, transforms

class QuantizedResNet101Classifier(nn.Module):
    def __init__(self, num_classes, pretrained=True):
        super().__init__()
        
        # Quantization stubs
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        
        # Load pretrained ResNet101
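        # The `weights` argument replaces the deprecated `pretrained` flag
        weights = models.ResNet101_Weights.IMAGENET1K_V1 if pretrained else None
        self.backbone = models.resnet101(weights=weights)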
        
        # Freeze backbone layers
        for param in self.backbone.parameters():
            param.requires_grad = False
        
        # Replace final fully connected layer
        num_ftrs = self.backbone.fc.in_features
        self.backbone.fc = nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes)
        )
    
    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = self.dequant(x)
        return x

def prepare_model_for_qat(model):
    model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
    torch.quantization.prepare_qat(model, inplace=True)
    return model

# Load the model
model = QuantizedResNet101Classifier(num_classes=5)
model.eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared_model = torch.quantization.prepare(model)
quantized_model = torch.quantization.convert(prepared_model)
quantized_state_dict = torch.load('trained_quantized_model.pth')
quantized_model.load_state_dict(quantized_state_dict)

# Test the model
image = Image.open("../image.png").convert('RGB')
preprocess = transforms.Compose([
    transforms.Resize((320, 320)),
    transforms.ToTensor(),
])

input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0).to(device)  # `device` is set earlier in the notebook (not shown)
quantized_model.eval()

with torch.no_grad():
    output = quantized_model(input_batch)

probabilities = torch.nn.functional.softmax(output, dim=1)[0]
predicted_index = probabilities.argmax().item()
index_to_label = {v: k for k, v in custom_dataset.class_labels.items()}
predicted_label = index_to_label[predicted_index]
print(f"Predicted label: {predicted_label}")

Environment:

  • PyTorch version: 2.5.1
  • PyTorch CUDA version: 12.4
  • Python version: 3.12.7
  • Hardware: NVIDIA GeForce GTX 1650
  • Image Size: Images used are 320x320 RGB.

Debugging Attempts:

  • Verified the loaded state dictionary matches the model architecture.
  • Checked that the model can run in evaluation mode without quantization.
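
A check like the first one above could be done along these lines. This is only a sketch of one way to compare two state dicts, not necessarily how the check was performed; the helper name `compare_state_dicts` and the stand-in data are hypothetical, and the stand-ins are used so the example runs without PyTorch.

```python
from types import SimpleNamespace


def compare_state_dicts(expected, loaded):
    """Return keys missing from `loaded`, unexpected keys, and shape mismatches.

    Both arguments are mappings from parameter name to value. Values without
    a `shape` attribute (quantized state dicts can hold non-tensor entries)
    are compared by key only.
    """
    missing = sorted(set(expected) - set(loaded))
    unexpected = sorted(set(loaded) - set(expected))
    mismatched = []
    for key in sorted(set(expected) & set(loaded)):
        want = getattr(expected[key], "shape", None)
        got = getattr(loaded[key], "shape", None)
        if want is not None and got is not None and tuple(want) != tuple(got):
            mismatched.append((key, tuple(want), tuple(got)))
    return missing, unexpected, mismatched


# Stand-in objects with a `shape` attribute (illustrative data only).
expected = {"fc.0.weight": SimpleNamespace(shape=(512, 2048)),
            "fc.3.weight": SimpleNamespace(shape=(5, 512))}
loaded = {"fc.0.weight": SimpleNamespace(shape=(512, 2048)),
          "fc.3.weight": SimpleNamespace(shape=(10, 512)),
          "fc.3.bias": SimpleNamespace(shape=(10,))}

missing, unexpected, mismatched = compare_state_dicts(expected, loaded)
print(missing, unexpected, mismatched)
```

With the variables from the code above, the call would be `compare_state_dicts(quantized_model.state_dict(), quantized_state_dict)`; three empty lists would mean the keys and tensor shapes agree.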

Questions:

  1. What could cause the kernel to crash during inference with the quantized model?
  2. Are there any specific debugging steps or configurations I should check for quantized inference?

Any insights or suggestions would be greatly appreciated!
