We are counting photons and time-tagging them with an FPGA counter. We get about 500 MB of data per minute. The data is a stream of 32-bit signed integers stored in little-endian byte order (I originally described it as a hex string; see Update 1). Currently I am doing this:

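import collections

import numpy as np


def getall(file):
    # Map the file as little-endian 32-bit signed integers without reading it all at once.
    data1 = np.memmap(file, dtype='<i4', mode='r')

    raw_counts = []
    for i in data1:
        # Per-element string conversion: binary digits, then drop the first 5 characters.
        binary = bin(i)[2:].zfill(8)
        decimal = int(binary[5:], 2)

        # Keep the value only when the first character of the padded string is '1'.
        if binary[:1] == '1':
            raw_counts.append(decimal)

    counter = collections.Counter(raw_counts)
    return counter, counter.keys(), counter.values()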

I thought this part (binary = bin(i)[2:].zfill(8); decimal = int(binary[5:], 2)) was slowing down the process. (Edit: it is not. I found that out by profiling my program.) Is there any way to speed it up? I only need the bits from binary[5:], not all 32 bits, so I assumed that reducing each 32-bit word to its last 27 bits was taking most of the time. Thanks.

*Update 1

J.F. Sebastian pointed out that the input is not a hex string.

*Update 2

Here is the final code, in case anyone needs it. I ended up using np.unique instead of collections.Counter for the counting itself. At the end, I convert the result back to a collections.Counter because I want to accumulate the counts.

#http://stackoverflow.com/questions/10741346/numpy-most-efficient-frequency-counts-for-unique-values-in-an-array
def myc(x):
    unique, counts = np.unique(x, return_counts=True)
    return np.asarray((unique, counts)).T


def getallfast(file):
    data1 = np.memmap(file, dtype='<i4', mode='r')
    # Select the entries, then keep the 27 low bits. See J.F.Sebastian's comment.
    data2 = data1[np.nonzero(~data1 & (31 << 1))] & 0x7ffffff
    counter = myc(data2)
    raw_counts = dict(zip(counter[:, 0], counter[:, 1]))
    counter = collections.Counter(raw_counts)

    return counter, counter.keys(), counter.values()

However, the version below is the fastest one for me. Applying data1[np.nonzero((~data1 & (31 <<1)))] & 0x7ffffff to the whole array is slower than counting first and converting only the unique values afterwards with binary = bin(counter[i,0])[2:].zfill(8):

def myc(x):
    unique, counts = np.unique(x, return_counts=True)
    return np.asarray((unique, counts)).T

def getallfast(file):
    data1 = np.memmap(file, dtype='<i4', mode='r')
    # Count the raw words first, then convert only the unique values.
    counter = myc(data1)
    xnew = []
    ynew = []
    raw_counts = dict()
    for i in range(len(counter)):
        binary = bin(counter[i, 0])[2:].zfill(8)
        decimal = int(binary[5:], 2)
        xnew.append(decimal)
        ynew.append(counter[i, 1])
        raw_counts[decimal] = counter[i, 1]

    counter = collections.Counter(raw_counts)
    return counter, xnew, ynew
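
The accumulation mentioned in Update 2 could be sketched as follows. This is a minimal illustration rather than the code actually used: the helper names count_low_bits and accumulate and the two sample arrays are hypothetical, and the sketch masks every word with 0x7ffffff without applying any selection step.

```python
import collections

import numpy as np


def count_low_bits(words, mask=0x7ffffff):
    """Return a Counter of the masked values in a 1-D integer array."""
    unique, counts = np.unique(np.asarray(words) & mask, return_counts=True)
    # tolist() converts NumPy scalars to plain Python ints for the Counter.
    return collections.Counter(dict(zip(unique.tolist(), counts.tolist())))


def accumulate(chunks):
    """Sum the per-chunk Counters into one running total."""
    total = collections.Counter()
    for chunk in chunks:
        # Counter.update() adds counts for existing keys instead of replacing them.
        total.update(count_low_bits(chunk))
    return total


# Hypothetical sample data standing in for two recorded files.
first = np.array([1, 2, 2, 3], dtype='<i4')
second = np.array([2, 3, 3, 3], dtype='<i4')
total = accumulate([first, second])
```

Because Counter.update() adds to existing counts, the same loop could be fed one memmap per file to build a running histogram across a whole measurement.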
  • Have you profiled it? Commented Oct 9, 2015 at 18:16
  • Actually, from what I have found, converting it to a string is quite performant, more so than other methods (at least when taking multiple slices). Commented Oct 9, 2015 at 18:25
  • Your code implies that the input is not a "hex string": it contains 32-bit signed integers stored in little-endian byte order. To get the 27 least-significant bits, you could use bitwise operations: i & 0x7ffffff (to do it efficiently, use vectorized numpy operations). If you are doing everything right, then your task should be I/O bound (limited by the speed of the hard disk where the input files are stored). Counter() is slow on Python 2. Commented Oct 9, 2015 at 21:14
  • Here's an example of vectorized bitwise numpy operations Commented Oct 9, 2015 at 21:22
  • @J.F.Sebastian You are right. My input is 32-bit signed integers stored in little-endian byte order. I will look into vectorized numpy. Thanks. Commented Oct 9, 2015 at 22:35
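
The vectorized bitwise approach suggested in the comments might look roughly like the sketch below. It rests on an assumption that the thread does not confirm: that bit 31 of each word is the flag marking a valid count and the 27 low bits carry the value. The function name low27_counts and the sample words are hypothetical.

```python
import numpy as np


def low27_counts(words):
    """Count the 27 low bits of every word whose bit 31 is set (assumed flag bit)."""
    # Reinterpret as unsigned so bit 31 is an ordinary bit rather than the sign.
    u = np.asarray(words).astype('<u4')
    flagged = (u >> 31) == 1          # assumption: bit 31 marks a valid count
    values = u[flagged] & 0x7ffffff   # keep the 27 least-significant bits
    return np.unique(values, return_counts=True)


# Hypothetical words: two flagged entries with value 5, one flagged entry with
# value 9, and one unflagged entry that should be ignored.
sample = np.array([0x80000005, 0x80000005, 0x80000009, 0x00000005], dtype='<u4')
unique, counts = low27_counts(sample)
```

The whole array is processed by a handful of NumPy calls instead of a Python-level loop, which is the point of the comment. Opening the memmap with dtype='<u4' would make the astype conversion unnecessary.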

1 Answer

I guess you could try one of these two approaches.

You could take the low five bits with a bitwise AND: fivebits = my_int & 0x1f

If you want the five bits at the other end, shift instead: fivebits = my_int >> (32-5)

But really, in my experience converting it to a string is quite fast. I thought that was a bottleneck many years ago; after profiling it, I found it wasn't.
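
As a small illustration of the two operations above on plain Python integers, here is a sketch with a made-up 32-bit word. The extra & 0x1f after the shift is an addition not in the answer: Python's >> is an arithmetic shift, so without the mask a negative (signed) input would give a negative result.

```python
# Hypothetical 32-bit word: top five bits 10110, low five bits 00101.
word = (0b10110 << 27) | 0b00101

low_five = word & 0x1f                 # mask keeps the five least-significant bits
top_five = (word >> (32 - 5)) & 0x1f   # shift down, then mask to five bits
low_27 = word & 0x7ffffff              # the 27 low bits, as suggested in the comments

# The string approach gives the same value when the string is padded to 32 bits.
as_string = bin(word)[2:].zfill(32)
from_string = int(as_string[5:], 2)

# The same bit pattern read as a signed 32-bit integer (a negative number).
signed = word - (1 << 32)
signed_top_five = (signed >> (32 - 5)) & 0x1f
signed_low_27 = signed & 0x7ffffff
```

The masks give the same answers for the signed and unsigned readings of the word, which matters here because the file is read as signed integers.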

1 Comment

It looks like the OP wants the low (32-5) bits, i.e., my_int & 0x7ffffff
