I'm seeing strange behaviour that I don't understand:

If I open my file, I find my bytes, but only one pattern at a time:

f = open('d:\BB.ki', "rb")
f10 = re.findall( b'\x03\x00\x00\x10''(.*?)''\xF7\x00\xF0', f.read() )
print f10
['1BBBAAAABBBBAAAABBBBAAAABBBBAAAA\x00']

f = open('d:\BB.ki', "rb")
f11 = re.findall( b'\x03\x00\x00\x11''(.*?)''\xF7\x00\xF0', f.read() )
print f11
['2AAABBBBAAAABBBBAAAA\x00']

If I open the file once and try to get several byte sequences, I only get the first one (f11 is empty):

f = open('d:\BB.ki', "rb")
f10 = re.findall( b'\x03\x00\x00\x10''(.*?)''\xF7\x00\xF0', f.read() )
f11 = re.findall( b'\x03\x00\x00\x11''(.*?)''\xF7\x00\xF0', f.read() )
print f10,f11
['1BBBAAAABBBBAAAABBBBAAAABBBBAAAA\x00'] **[]**

Should I use a loop, or something similar?

Thanks

  • In addition to the answers below, you can always do f.seek(0) to reset the file stream pointer to the beginning of the file, and then the second read() will work :) Commented Jul 5, 2012 at 15:38
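A minimal sketch of the `f.seek(0)` approach from the comment above, written for Python 3 (where the pattern has to be a single bytes literal). The temporary file and its contents are made up for illustration; only `read()` and `seek(0)` are the point.

```python
import os
import re
import tempfile

# Hypothetical sample data standing in for the real file.
data = b'\x03\x00\x00\x10AAAA\xF7\x00\xF0\x03\x00\x00\x11BBBB\xF7\x00\xF0'

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

with open(path, "rb") as f:
    f10 = re.findall(b'\x03\x00\x00\x10(.*?)\xF7\x00\xF0', f.read())
    exhausted = f.read()   # the stream is at end of file, so this is b''
    f.seek(0)              # rewind to the beginning of the file
    f11 = re.findall(b'\x03\x00\x00\x11(.*?)\xF7\x00\xF0', f.read())

os.remove(path)
print(f10, f11, exhausted)
```

Rewinding reads the file from disk a second time, so for anything but small files it is probably cheaper to read once and keep the bytes.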

2 Answers

After you call f.read() there are no more bytes available to be read so a second call to f.read() will return an empty string. Store the result of f.read() instead of reading twice:

s = f.read()
f10 = re.findall( b'\x03\x00\x00\x10''(.*?)''\xF7\x00\xF0', s)
f11 = re.findall( b'\x03\x00\x00\x11''(.*?)''\xF7\x00\xF0', s) 
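The same idea as a self-contained sketch in Python 3 syntax. The `io.BytesIO` object and the sample bytes are assumptions standing in for the real file; a file opened with `"rb"` behaves the same way with respect to `read()`.

```python
import io
import re

# Hypothetical in-memory stand-in for open(..., "rb").
f = io.BytesIO(
    b'\x03\x00\x00\x10AAAA\xF7\x00\xF0'
    b'\x03\x00\x00\x11BBBB\xF7\x00\xF0'
)

s = f.read()        # whole content; the stream is now at its end
second = f.read()   # nothing left to read, so this is empty

# Both searches run against the stored bytes, not against the stream.
f10 = re.findall(b'\x03\x00\x00\x10(.*?)\xF7\x00\xF0', s)
f11 = re.findall(b'\x03\x00\x00\x11(.*?)\xF7\x00\xF0', s)
print(second, f10, f11)
```

If the captured payload can contain newline bytes, passing `re.DOTALL` as the `flags` argument lets `.` match them as well.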

You may also want to scan the data just a single time, finding both expressions:

matches = re.findall( b'\x03\x00\x00[\x10\x11]''(.*?)''\xF7\x00\xF0', s)

If your file contains the bytes '\x03\x00\x00\x10\x03\x00\x00\x11_\xF7\x00\xF0' the method you proposed will find two overlapping matches (\x03\x00\x00\x11_ and _), whereas the single scan approach finds only a single match.
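The difference can be checked directly on the byte string from the paragraph above. This is a small sketch in Python 3 syntax; the variable names are only illustrative.

```python
import re

s = b'\x03\x00\x00\x10\x03\x00\x00\x11_\xF7\x00\xF0'

# Two separate scans: each one starts from the beginning of s,
# so the second match lies inside the first.
f10 = re.findall(b'\x03\x00\x00\x10(.*?)\xF7\x00\xF0', s)
f11 = re.findall(b'\x03\x00\x00\x11(.*?)\xF7\x00\xF0', s)

# One scan: findall returns non-overlapping matches, so the inner
# header is consumed as part of the first match's payload.
single = re.findall(b'\x03\x00\x00[\x10\x11](.*?)\xF7\x00\xF0', s)

print(f10, f11, single)
```

Which behaviour is wanted depends on whether a header sequence can legitimately appear inside a payload.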

1 Comment

Great! The first solution is what I needed, thanks :)

f.read() consumes the entire file, so only f10 will find anything; the second read() returns nothing.

Try this, maybe:

f10, f11 = [], []
# Scan line by line and accumulate; this assumes no match spans a newline byte
for line in open('d:\BB.ki', "rb"):
    f10 += re.findall( b'\x03\x00\x00\x10''(.*?)''\xF7\x00\xF0', line )
    f11 += re.findall( b'\x03\x00\x00\x11''(.*?)''\xF7\x00\xF0', line )
