I want to write a hash function returning a hash from 3 integers a, b, c. I want to be able to choose the number of bits with which each integer is encoded and concatenate them. For instance:

a=60  (8 bits) -> 00111100
b=113 (8 bits) -> 01110001
c=5   (4 bits) -> 0101

should give

00111100011100010101

i.e. 20 bits.

Given a, b and c as integers (60, 113 and 5) and the number of bits allowed for each (8, 8 and 4), how can I get the hash, store it in a python object of the total size (20 bits), and write/load it to a file?

2 Comments

  • Anything you store to a file must be a multiple of 8 bits. If it isn't, you need to wait until you collect more bits, or pad it with some dummy data. Commented Apr 24, 2015 at 15:05
  • You can use this answer to a related question to read and write bits to a file. Commented Apr 24, 2015 at 15:29
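
The padding rule from the first comment can be checked with a little arithmetic. This is a minimal sketch; `padded_size` is a hypothetical helper name, not something from the question or the answers.

```python
def padded_size(nbits):
    """Return (bytes needed, padding bits) to store nbits in whole bytes."""
    nbytes = (nbits + 7) // 8          # round up to a whole number of bytes
    return nbytes, nbytes * 8 - nbits  # the leftover bits are padding

print(padded_size(20))      # (3, 4): one 20-bit hash needs 3 bytes, 4 bits wasted
print(padded_size(2 * 20))  # (5, 0): two hashes written back to back fit exactly
print(padded_size(3 * 20))  # (8, 4): three hashes need 4 padding bits again
```

So the padding is paid once per file rather than once per value, as long as the values are written back to back.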

2 Answers

I think this does what you want. It uses the referenced bitio module from another answer of mine to write/read the bits to/from a file.

Operating systems generally require file sizes to be a whole number of bytes, so this creates a 24-bit (3-byte) file to store a single 20-bit value. The 4 padding bits make up 16.7% of that file. The overhead shrinks, and disappears entirely whenever the total bit count is a multiple of 8, if you write several values one immediately after another and don't call flush() until after the last.

import bitio  # see https://stackoverflow.com/a/10691412/355230

# hash function configuration
BW = 8, 8, 4  # bit widths of each integer
HW = sum(BW)  # total bit width of hash

def myhash(a, b, c):
    return (((((a & (2**BW[0]-1)) << BW[1]) |
                b & (2**BW[1]-1)) << BW[2]) |
                c & (2**BW[2]-1))

hashed = myhash(60, 113, 5)
print('{:0{}b}'.format(hashed, HW))  # --> 00111100011100010101

with open('test.bits', 'wb') as outf:
    bw = bitio.BitWriter(outf)
    bw.writebits(hashed, HW)
    bw.flush()

with open('test.bits', 'rb') as inf:
    br = bitio.BitReader(inf)
    val = br.readbits(HW)

print('{:0{}b}'.format(val, HW))  # --> 00111100011100010101
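
If pulling in the bitio module is not an option, a similar round trip can be sketched with the standard library only. This is an illustration under stated assumptions, not the code from this answer: `pack`, `unpack`, `BIT_WIDTHS`, `TOTAL_BITS` and `NUM_BYTES` are hypothetical names, and the byte layout (zero padding bits at the front, as produced by `int.to_bytes`) may differ from what bitio writes. It also shows how to split the hash back into a, b and c.

```python
import io

BIT_WIDTHS = (8, 8, 4)              # widths taken from the question's example
TOTAL_BITS = sum(BIT_WIDTHS)        # 20
NUM_BYTES = (TOTAL_BITS + 7) // 8   # 3 bytes are enough for 20 bits

def pack(values, widths=BIT_WIDTHS):
    """Concatenate the fields, first value in the most significant bits."""
    result = 0
    for value, width in zip(values, widths):
        result = (result << width) | (value & ((1 << width) - 1))
    return result

def unpack(packed, widths=BIT_WIDTHS):
    """Split a packed integer back into its fields."""
    values = []
    for width in reversed(widths):   # peel fields off the low end
        values.append(packed & ((1 << width) - 1))
        packed >>= width
    return tuple(reversed(values))

hashed = pack((60, 113, 5))
print('{:0{}b}'.format(hashed, TOTAL_BITS))  # 00111100011100010101

# Round trip through a binary stream; a file opened with 'wb'/'rb' works the same way.
stream = io.BytesIO()
stream.write(hashed.to_bytes(NUM_BYTES, 'big'))  # 4 zero bits are padded at the front
stream.seek(0)
loaded = int.from_bytes(stream.read(NUM_BYTES), 'big')
print(unpack(loaded))  # (60, 113, 5)
```

Masking each value with `(1 << width) - 1` silently truncates inputs that do not fit their field, which matches the behaviour of `myhash` above.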

2 Comments

Thanks, that's useful. The file written is 3 bytes, the smallest number of bytes that can contain my 20 bits, which is what I need. However, myhash and readbits in your example return an int (24 bytes), which is huge for a 20-bit object. Is it possible to allocate only 3 bytes for this object, so that I can save a lot of memory? I asked another question here: stackoverflow.com/q/29894071/326849.
It's possible to only allocate 20 bits (2½ bytes) for these values. ;-) I've posted a way to do it as an answer to your other question.
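
One way to get close to 20 bits per value in memory is to keep many hashes packed back to back in a single bytes object. The following is a rough sketch of that idea and not necessarily the approach taken in the linked answer; `pack_many` and `unpack_many` are hypothetical names.

```python
def pack_many(values, width):
    """Pack fixed-width values back to back, first value in the highest bits."""
    mask = (1 << width) - 1
    acc = 0
    for value in values:
        acc = (acc << width) | (value & mask)
    total_bits = width * len(values)
    pad = -total_bits % 8              # zero bits appended to fill the last byte
    return (acc << pad).to_bytes((total_bits + pad) // 8, 'big')

def unpack_many(data, width, count):
    """Recover `count` values of `width` bits from the packed bytes."""
    mask = (1 << width) - 1
    acc = int.from_bytes(data, 'big')
    total_bits = len(data) * 8
    return [(acc >> (total_bits - (i + 1) * width)) & mask for i in range(count)]

packed = pack_many([0x3C715, 0x00001], 20)
print(len(packed))                                        # 5 bytes for two 20-bit values
print(unpack_many(packed, 20, 2) == [0x3C715, 0x00001])   # True
```

Building one large integer is simple but slows down as the collection grows, so this suits small batches; the count of stored values has to be tracked separately because trailing padding bits are indistinguishable from data.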

Here's a class that will write an arbitrary number of bits to a file-like object. Call flush when done.

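class bitwriter():
    """Write values of arbitrary bit width to a binary file-like object, MSB first."""
    def __init__(self, f):
        self.f = f        # file-like object opened in binary mode
        self.bits = 0     # pending bits not yet written
        self.count = 0    # number of pending bits
    def write(self, value, bitcount):
        mask = (1 << bitcount) - 1
        self.bits = (self.bits << bitcount) | (value & mask)
        self.count += bitcount
        # emit every complete byte, most significant bits first
        while self.count >= 8:
            byte = self.bits >> (self.count - 8)
            self.f.write(bytearray([byte]))
            self.count -= 8
            self.bits &= (1 << self.count) - 1
    def flush(self):
        # pad the final partial byte with zero bits on the right
        if self.count != 0:
            byte = self.bits << (8 - self.count)
            self.f.write(bytearray([byte]))
        self.bits = self.count = 0
        self.f.flush()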
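
The answer only covers writing. A matching reader could look like the sketch below; `bitreader` is a hypothetical name, and it assumes the layout the writer is meant to produce: most significant bits first, with the last byte zero-padded on the right.

```python
import io

class bitreader():
    """Read values of arbitrary bit width from a binary file-like object, MSB first."""
    def __init__(self, f):
        self.f = f        # file-like object opened in binary mode
        self.bits = 0     # bits read from the file but not yet consumed
        self.count = 0    # number of unconsumed bits
    def read(self, bitcount):
        # pull in whole bytes until enough bits are buffered
        while self.count < bitcount:
            chunk = self.f.read(1)
            if not chunk:
                raise EOFError("not enough bits left in the stream")
            self.bits = (self.bits << 8) | chunk[0]
            self.count += 8
        self.count -= bitcount
        value = self.bits >> self.count        # take the top `bitcount` bits
        self.bits &= (1 << self.count) - 1     # keep the remainder for later
        return value

# 0x3c 0x71 0x50 is the question's 20-bit example followed by 4 zero padding bits.
reader = bitreader(io.BytesIO(b"\x3c\x71\x50"))
fields = [reader.read(8), reader.read(8), reader.read(4)]
print(fields)  # [60, 113, 5]
```

Indexing `chunk[0]` yields an integer only on Python 3; on Python 2 it would need `ord(chunk)` instead.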
