Same weights and implementation, but different results in Keras and PyTorch

Question

I have an encoder and a decoder model (monodepth2). I am trying to convert them from PyTorch to Keras using Onnx2Keras. So far:

The encoder (ResNet-18) converts successfully.
I built the decoder myself in Keras (with TF 2.3) and copied the weights (NumPy arrays, including weight and bias) for each layer from PyTorch to Keras, without any modification.

But it turns out that both the Onnx2Keras-converted encoder and the self-built decoder fail to reproduce the original results. The cross-comparison pictures are below, but first I'll introduce the decoder code.

First, the core layer. All the conv2d layers (Conv3x3, ConvBlock) are based on it; they differ only in their dimensions or in an added activation:

# Conv3x3 (plain conv2d with neither BN nor activation).
# There is also a ConvBlock, which is just "Conv3x3 + ELU activation", so I don't list it here.
def TF_Conv3x3(input_channel, filter_num, pad_mode='reflect', activate_type=None):

    # The padding is actually 'reflect', but I implement it with tf.pad() outside this layer,
    # so the convolution itself uses 'valid'.
    padding = 'valid'

    # For TF_ConvBlock, activate_type == 'elu'
    conv = tf.keras.layers.Conv2D(filters=filter_num, kernel_size=3, activation=activate_type,
                                  strides=1, padding=padding)
    return conv
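
As a quick sanity check on the padding step, here is a minimal NumPy sketch of what a reflect pad with those paddings does to an NHWC tensor. The array `x` is made-up stand-in data, and the sketch assumes that `np.pad(..., mode="reflect")` mirrors without repeating the edge value, as `tf.pad(..., mode='REFLECT')` does.

```python
import numpy as np

# Hypothetical NHWC feature map: batch 1, 3x3 spatial, 1 channel.
x = np.arange(9, dtype=np.float32).reshape(1, 3, 3, 1)

# Same paddings as in the tf.pad() call: pad only height and width by 1.
paddings = [[0, 0], [1, 1], [1, 1], [0, 0]]
padded = np.pad(x, paddings, mode="reflect")

print(padded.shape)        # (1, 5, 5, 1)
print(padded[0, :, :, 0])  # the original 3x3 block surrounded by mirrored values
```

Because each spatial side grows by 1, a following 3x3 convolution with 'valid' padding returns the original height and width.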

Then the structure. Note that the definition is EXACTLY the same as in the original code, so I think the problem must lie in some detail of the implementation.

def DepthDecoder_keras(num_ch_enc=np.array([64, 64, 128, 256, 512]), channel_first=False,
                       scales=range(4), num_output_channels=1):
    num_ch_dec = np.array([16, 32, 64, 128, 256])
    convs = OrderedDict()
    for i in range(4, -1, -1):
        # upconv_0
        num_ch_in = num_ch_enc[-1] if i == 4 else num_ch_dec[i + 1]
        num_ch_out = num_ch_dec[i]

        # convs[("upconv", i, 0)] = ConvBlock(num_ch_in, num_ch_out)
        convs[("upconv", i, 0)] = TF_ConvBlock(num_ch_in, num_ch_out, pad_mode='reflect')


        # upconv_1
        num_ch_in = num_ch_dec[i]
        if i > 0:
            num_ch_in += num_ch_enc[i - 1]
        num_ch_out = num_ch_dec[i]
        convs[("upconv", i, 1)] = TF_ConvBlock(num_ch_in, num_ch_out, pad_mode='reflect')  # Just Conv3x3 with ELU-activation

    for s in scales:
        convs[("dispconv", s)] = TF_Conv3x3(num_ch_dec[s], num_output_channels, pad_mode='reflect')

    """
    Input_layer dims: (64, 96, 320), (64, 48, 160),  (128, 24, 80), (256, 12, 40), (512, 6, 20)
    """
    x0 = tf.keras.layers.Input(shape=(96, 320, 64))
    # then define the rest of the input layers (x1 ... x4) in the same way
    input_features = [x0, x1, x2, x3, x4]

    """
    # connect layers
    """
    outputs = []
    ch = 1 if channel_first else 3
    x = input_features[-1]
    for i in range(4, -1, -1):
        x = tf.pad(x, paddings=[[0, 0], [1, 1], [1, 1], [0, 0]], mode='REFLECT')
        x = convs[("upconv", i, 0)](x)
        x = [tf.keras.layers.UpSampling2D()(x)]
        if i > 0:
            x += [input_features[i - 1]]
        x = tf.concat(x, ch)
        x = tf.pad(x, paddings=[[0, 0], [1, 1], [1, 1], [0, 0]], mode='REFLECT')
        x = convs[("upconv", i, 1)](x)
    x = TF_ReflectPad2D_1()(x)
    x = convs[("dispconv", 0)](x)
    disp0 = tf.math.sigmoid(x)

    """
    build keras Model ([input0, ...], [output0, ...])
    """
    # decoder = tf.keras.Model(input_features, outputs)
    decoder = tf.keras.Model(input_features, disp0)

    return decoder

The cross-comparison is as follows... I would really appreciate it if anyone could offer some insights. Thanks!!!

Original results:

Original Encoder + Self-built Decoder:

ONNX-converted Enc + Original Dec (the texture is good, but the contrast is too low; the car is very close, so it should appear in a very bright color):

ONNX-converted Enc + Self-built Dec:
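
Besides comparing the pictures by eye, the outputs can also be compared numerically. The following is only a sketch: `nchw_to_nhwc` and `max_abs_diff` are hypothetical helper names, and the arrays are random stand-ins for the real PyTorch (channel-first) and Keras (channel-last) outputs.

```python
import numpy as np

def nchw_to_nhwc(a):
    """Move the channel axis of a (N, C, H, W) array to the end."""
    return np.transpose(a, (0, 2, 3, 1))

def max_abs_diff(pytorch_out_nchw, keras_out_nhwc):
    """Largest element-wise difference between the two framework outputs."""
    diff = nchw_to_nhwc(pytorch_out_nchw) - keras_out_nhwc
    return float(np.max(np.abs(diff)))

# Stand-in data; in practice these would be the two models' outputs
# for the same input image.
rng = np.random.default_rng(0)
ref = rng.standard_normal((1, 1, 6, 20)).astype(np.float32)
same = nchw_to_nhwc(ref).copy()

print(max_abs_diff(ref, same))  # 0.0
```

Running such a check after each layer, rather than only on the final output, would point to the first layer where the two models diverge.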

dexter2406 · Accepted Answer · 2021-03-22 12:06:39Z

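Solved!

It turns out there was indeed no problem with the implementation (at least no significant one). The problem was in how I copied the weights.

The original PyTorch conv weights have shape (out_channels, in_channels, 3, 3), but the TF model requires (3, 3, in_channels, out_channels), so I permuted them with [3,2,1,0], overlooking that the two (3, 3) kernel axes also have their own order. That permutation swaps kernel height and width, so every 3x3 kernel ends up transposed, and because both axes have size 3 no shape error is raised.

So it should be weights.permute([2,3,1,0]), and all is well!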
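
To illustrate why the wrong permutation goes unnoticed, here is a small NumPy sketch. The array `w_torch` is made-up stand-in data in the PyTorch layout, and `np.transpose` is used in place of `permute` on the assumption that the weights are held as NumPy arrays.

```python
import numpy as np

out_ch, in_ch = 4, 2
# Stand-in for a PyTorch conv weight: (out_channels, in_channels, kH, kW).
w_torch = np.arange(out_ch * in_ch * 3 * 3, dtype=np.float32).reshape(out_ch, in_ch, 3, 3)

w_wrong = np.transpose(w_torch, (3, 2, 1, 0))  # (kW, kH, in, out): kernel axes swapped
w_right = np.transpose(w_torch, (2, 3, 1, 0))  # (kH, kW, in, out): Keras Conv2D layout

# Both results have the same shape, so Keras accepts either one.
print(w_wrong.shape == w_right.shape)    # True
# The values differ, though.
print(np.array_equal(w_wrong, w_right))  # False
# The wrong result is the right one with kernel height and width swapped.
print(np.array_equal(w_wrong, np.transpose(w_right, (1, 0, 2, 3))))  # True
```

For a Keras Conv2D layer with a bias, the correctly permuted kernel would then be loaded with something like `layer.set_weights([w_right, bias])`, where `layer` and `bias` stand for the target layer and the unchanged PyTorch bias vector.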
