3

I am trying to use the crc32_combine function from zlib in Python. Although various other zlib functions are available, this one isn't part of the "batteries included" standard library. I've tried two approaches: a port from the C code to Python and calling zlib from Python with ctypes. Both give me different results, although not the result I'm expecting. I'm presenting the ctypes code since I think this executes faster and has a smaller chance for additional programmer errors.

The algorithm can combine two CRC32 hashes when the length of the data of the second hash is provided. crc32_combine is defined as follows:

crc32(crc32(0, seq1, len1), seq2, len2) == crc32_combine(
    crc32(0, seq1, len1), crc32(0, seq2, len2), len2)

This is the output:

Expected CRC: 45E57586
Combined CRC: 567EE4E4

The second line is always different when ran with Python 3.5.1 on win32. Not with Python 2, but the result is never what I expect either. Put the zlib1.dll in the same directory as the script to try it out.

import zlib

def crc32_combine_ctypes(crc1, crc2, len2):
    import ctypes
    from ctypes import util

    lib = util.find_library('zlib1')
    _zlib = ctypes.CDLL(lib)
    assert _zlib._name, "Can't find zlib"

    _zlib.crc32_combine.argtypes = [
        ctypes.c_ulong, ctypes.c_ulong, ctypes.c_ulong]
    _zlib.crc32_combine.restype = ctypes.c_ulong

    return _zlib.crc32_combine(crc1, crc2, len2)

testfile = "zlib1.dll"

with open(testfile, "rb") as tf:
    data = tf.read()

print("Expected CRC: %0.8X" % (zlib.crc32(data) & 0xFFFFFFFF))

cut = len(data) // 2 - 100
p1 = data[0:cut]
p2 = data[cut:]

crc1 = zlib.crc32(p1)
crc2 = zlib.crc32(p2)
len1 = len(p1)
len2 = len(p2)

combined = crc32_combine_ctypes(crc1, crc2, len2)
print("Combined CRC: %0.8X" % (combined & 0xFFFFFFFF))

What am I doing wrong?

3
  • 2
    There's something wrong with this 32-bit build of zlib1.dll. On my own 64-bit build (of just this function), the combined result matches the expected result. To build the DLL, I downloaded the source from your link, copied out the definitions of crc32_combine, gf2_matrix_times, and gf2_matrix_square from crc32.c, and built it as a 64-bit DLL. Commented Feb 8, 2016 at 0:24
  • This 32-bit build also works as expected. Commented Feb 8, 2016 at 0:49
  • The bad DLL was the problem! Somehow I didn't follow/see the other links on the zlib home page and picked out the bad one :) Commented Feb 9, 2016 at 21:09

1 Answer 1

8

eryksun had the right idea: I used a bad and old DLL. The last zlib version with a 32 bit dll included: https://sourceforge.net/projects/libpng/files/zlib/1.2.8/

My port to pure Python code is a couple hundred times slower than calling the library with ctypes. (numbers from using timeit with 1k iterations and 50m as length parameter)

31.729 (function provided below)
 0.120 (just the _zlib.crc32_combine() call: no library loading included)

The pure Python code:

def crc32_combine(crc1, crc2, len2):
    """Explanation algorithm: https://stackoverflow.com/a/23126768/654160
    crc32(crc32(0, seq1, len1), seq2, len2) == crc32_combine(
        crc32(0, seq1, len1), crc32(0, seq2, len2), len2)"""
    # degenerate case (also disallow negative lengths)
    if len2 <= 0:
        return crc1

    # put operator for one zero bit in odd
    # CRC-32 polynomial, 1, 2, 4, 8, ..., 1073741824
    odd = [0xedb88320] + [1 << i for i in range(0, 31)]
    even = [0] * 32

    def matrix_times(matrix, vector):
        number_sum = 0
        matrix_index = 0
        while vector != 0:
            if vector & 1:
                number_sum ^= matrix[matrix_index]
            vector = vector >> 1 & 0x7FFFFFFF
            matrix_index += 1
        return number_sum

    # put operator for two zero bits in even - gf2_matrix_square(even, odd)
    even[:] = [matrix_times(odd, odd[n]) for n in range(0, 32)]

    # put operator for four zero bits in odd
    odd[:] = [matrix_times(even, even[n]) for n in range(0, 32)]

    # apply len2 zeros to crc1 (first square will put the operator for one
    # zero byte, eight zero bits, in even)
    while len2 != 0:
        # apply zeros operator for this bit of len2
        even[:] = [matrix_times(odd, odd[n]) for n in range(0, 32)]
        if len2 & 1:
            crc1 = matrix_times(even, crc1)
        len2 >>= 1

        # if no more bits set, then done
        if len2 == 0:
            break

        # another iteration of the loop with odd and even swapped
        odd[:] = [matrix_times(even, even[n]) for n in range(0, 32)]
        if len2 & 1:
            crc1 = matrix_times(odd, crc1)
        len2 >>= 1

        # if no more bits set, then done
    # return combined crc
    crc1 ^= crc2
    return crc1
Sign up to request clarification or add additional context in comments.

2 Comments

To calculate the CRC, (simple CRC calculation) is this enough? CRC ^= 0x741B8CD7
I mean for showing the pseudo-code of simple CRC calculation

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.