Error while converting quantized Torch model to ONNX

Asked 2 months ago

Modified 2 months ago

I’m applying quantization-aware training (QAT) to a YOLOv8n model with the following configuration:

QConfig(
    activation=FakeQuantize.with_args(
        observer=MovingAverageMinMaxObserver,
        quant_min=0,
        quant_max=255,
        dtype=torch.quint8,
        qscheme=torch.per_tensor_affine,
        averaging_constant=0.005,
        reduce_range=False
    ),
    weight=FakeQuantize.with_args(
        observer=PerChannelMinMaxObserver,
        quant_min=-127,
        quant_max=127,
        dtype=torch.qint8,
        qscheme=torch.per_channel_symmetric,
        ch_axis=0
    )
)
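
To make the numbers in this configuration concrete, here is a minimal pure-Python sketch of the arithmetic that a per-tensor affine activation range (0 to 255) and a symmetric weight range (-127 to 127) imply. It is an illustration under stated assumptions, not PyTorch's actual implementation: the function names `affine_qparams`, `symmetric_scale`, `fake_quantize`, and `moving_average_update` are hypothetical, and the per-channel case is reduced to a single channel.

```python
def affine_qparams(x_min, x_max, quant_min=0, quant_max=255):
    """Scale and zero point for an affine (asymmetric) range. Hypothetical helper."""
    # Extend the observed range so that 0.0 is always representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (quant_max - quant_min)
    if scale == 0.0:
        scale = 1.0  # degenerate range; any positive scale works
    zero_point = quant_min - round(x_min / scale)
    zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point


def symmetric_scale(abs_max, quant_min=-127, quant_max=127):
    """Scale for a symmetric range of one channel; the zero point is 0."""
    return abs_max / ((quant_max - quant_min) / 2)


def fake_quantize(x, scale, zero_point, quant_min, quant_max):
    """Quantize to the integer grid, clamp, then map back to float."""
    q = round(x / scale) + zero_point
    q = max(quant_min, min(quant_max, q))
    return (q - zero_point) * scale


def moving_average_update(running, batch_value, averaging_constant=0.005):
    """One moving-average step for an observed min or max."""
    return running + averaging_constant * (batch_value - running)


# Activations observed in [-1.0, 3.0], mapped onto 0..255.
act_scale, act_zp = affine_qparams(-1.0, 3.0)
# One weight channel whose largest magnitude is 1.27, mapped onto -127..127.
w_scale = symmetric_scale(1.27)
```

With `averaging_constant=0.005`, each new batch moves the tracked activation range by only 0.5% of the difference, so the range settles slowly during QAT.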

With backend set to qnnpack:

torch.backends.quantized.engine = "qnnpack"

The target device only supports the ONNX format, so I have to convert the model to ONNX after QAT.

To do so, I am using the following procedure:

    def save(self, quantized_onnx_path: str):
        import torch.ao.quantization as quant

        self.quantized_model.eval()
        torch.backends.quantized.engine = 'qnnpack'
        # Freeze observer statistics before converting to a quantized model
        self.quantized_model.apply(torch.ao.quantization.disable_observer)
        model_to_export = quant.convert(self.quantized_model.cpu(), inplace=False)

        dummy_input = torch.randn(1, 3, 25, 256).cpu()
        
        torch.onnx.export(
            model_to_export,
            dummy_input,
            quantized_onnx_path,
            opset_version=13, 
            input_names=['images'],
            output_names=['output'],
            dynamic_axes={
                'images': {0: 'batch_size'},
                'output': {0: 'batch_size'}
            }
        )

But I keep getting this error during export:

NotImplementedError: Could not run 'quantized::conv2d.new' with arguments from the 'CPU' backend.

This could be because the operator doesn't exist for this backend...

According to the official PyTorch documentation, this can occur when the input is not quantized, but even after wrapping the model in the suggested class:

class QuantWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.model = model
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)      
        x = self.model(x)       
        x = self.dequant(x) 
        return x

model_to_export = QuantWrapper(model_to_export)

the error remains the same.

How can I solve this?

asked Sep 5 at 14:39

Matteo
