How can I implement the PyTorch Linear layer forward calculation precisely?

Question

I am trying to implement the F.linear function from scratch in PyTorch, but my version returns slightly different values from the library function. In my case, the difference is about 1e-5 to 1e-9. An error of 1e-5 is too large when implementing neural networks for vision tasks.

How can I reproduce F.linear without these errors? The code below is what I have tried so far; it computes the error. I would be really grateful if you could help me solve this problem.

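import torch
import torch.nn.functional as F

input = torch.randn(64, 7, 7, 120).to('cuda:0')
weight = torch.randn(360, 120).to('cuda:0')
pytorch_output = F.linear(input, weight)

B, H, W, _ = input.shape
BHW = B * H * W
Cout, _ = weight.shape

# Flatten the batch and spatial dimensions: (B, H, W, Cin) -> (BHW, Cin)
input = input.flatten(end_dim=-2)
my_output = torch.empty(BHW, Cout).to('cuda:0')
for i in range(BHW):
    # Broadcast one input row against every weight row: (Cout, Cin)
    u = input[i, :] * weight
    # Sum over Cin to get the Cout outputs for this row
    my_output[i, :] = u.transpose(0, 1).sum(0)
my_output = my_output.reshape(B, H, W, -1)
diff = my_output - pytorch_output
diff.max()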

Try using optimized matrix operations from torch, numpy, etc. (e.g. numpy.dot, torch.matmul).
– K0mp0t, Commented Apr 17, 2023 at 11:15
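
The comment's suggestion can be sketched roughly as follows. This is a minimal illustration, not a statement about how F.linear is implemented internally: it assumes a CPU tensor and a smaller batch than in the question so that it runs without a GPU, and the tolerance values are arbitrary choices for float32. A single matrix product replaces the Python loop, but bitwise equality with F.linear is still not guaranteed.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Assumption: smaller shapes than in the question, on CPU, so the sketch runs anywhere.
x = torch.randn(4, 7, 7, 120)
w = torch.randn(360, 120)

reference = F.linear(x, w)   # shape (4, 7, 7, 360)
via_matmul = x @ w.T         # contracts the last dimension of x with the rows of w

# Compare with a tolerance instead of expecting exact equality in float32.
max_abs_diff = (via_matmul - reference).abs().max().item()
close = torch.allclose(via_matmul, reference, rtol=1e-5, atol=1e-4)
print(max_abs_diff, close)
```

Comparing with torch.allclose and an explicit tolerance is usually more informative than inspecting the raw maximum difference, because the acceptable error depends on the dtype and on the magnitude of the outputs.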

TanjiroLL · Accepted Answer · 2023-04-17 11:25:57Z

The difference is due to float32 precision. To see this, try double precision as follows:

import torch
from torch.nn import functional as F
dtype = torch.double
input = torch.randn(64,7,7, 120, device='cuda:0', dtype=dtype)
weight= torch.randn(360,120, device='cuda:0', dtype=dtype)
pytorch_output =  F.linear(input, weight)

B,H,W,_ = input.shape
BHW =  B*H*W
Cout,_ = weight.shape

input = input.flatten(end_dim = -2)
my_output = torch.empty(BHW,Cout, device='cuda:0', dtype=dtype)
for i in range(BHW):
    u = input[i,:]*weight
    my_output[i,:] = u.transpose(0,1).sum(0)
my_output = my_output.reshape(B,H,W,-1)
diff = my_output - pytorch_output
diff.max(), torch.allclose(my_output, pytorch_output)

A difference of this order can be ignored in practice; I am not sure which CV application would suffer from it. For more detail, see https://discuss.pytorch.org/t/numerical-difference-in-matrix-multiplication-and-summation/28359
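
To illustrate why float32 results depend on the order of accumulation, here is a small pure-Python sketch. It is not PyTorch's kernel; the helpers `to_f32` and `dot_f32` are hypothetical names introduced only for this example, and they emulate float32 by rounding through the `struct` module after every operation. The same three products are summed in two different orders.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot_f32(xs, ws):
    """Dot product that rounds to float32 after every multiply and every add."""
    acc = 0.0
    for x, w in zip(xs, ws):
        acc = to_f32(acc + to_f32(to_f32(x) * to_f32(w)))
    return acc

ws = [1.0, 1.0, 1.0]

# Same terms, different accumulation order.
a = dot_f32([1e8, -1e8, 1.0], ws)  # (1e8 - 1e8) + 1
b = dot_f32([1e8, 1.0, -1e8], ws)  # (1e8 + 1) - 1e8; 1e8 + 1 rounds back to 1e8 in float32

# In double precision the second ordering is still exact.
c = (1e8 + 1.0) - 1e8

print(a, b, c)  # 1.0 0.0 1.0
```

A row-by-row loop and a library GEMM generally accumulate the 120 products in different orders, so small float32 discrepancies of the kind seen in the question are expected; in double precision the rounding error per step is far smaller, which is why the difference shrinks.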
