
I am writing a Python program that parses zip files (currently only entries compressed with DEFLATE, i.e. zlib) and verifies the correctness of their headers and data. One of the things I'm trying to achieve is calculating the uncompressed size of a DEFLATE-compressed file inside a zip archive without actually decompressing it and, obviously, without relying on the uncompressed size field found in the file record's headers. The goal is to ensure that none of the zip record's fields have been tampered with (in this case, the uncompressed size field).

I've gone through the ZIP specification (https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT) over and over, but I'm drawing a blank: I don't see any way to do this without completely parsing the Huffman trees and calculating the corresponding stream size, which is what I want to avoid. I would appreciate any idea or direction on how to do this.

To clarify, I'm not looking for a library/module to do this for me, but rather for a direction on how it can be done.

Many thanks.

  • Read it more carefully. Just about every structure that describes a file has its uncompressed size. Commented Apr 6, 2015 at 22:04
  • @Blrfl Of course, but I'm intentionally not relying on that field - I want to calculate it myself and compare the result to the given uncompressed size (this can be an indicator of an invalid zip archive). Commented Apr 6, 2015 at 22:11
  • This might help: stackoverflow.com/questions/10908877/… I doubt you can do this without uncompressing the data to some place. Commented Apr 6, 2015 at 22:22
  • In other words, the person doing the compression must claim the values of checksums before and after compression, and you must verify those facts. Furthermore, if you do not trust the correctness of the decompressor code (i.e. it might harbor some defects that lead to arbitrary code execution), you simply can't do anything other than refuse any compressed data you don't trust. Commented Apr 6, 2015 at 22:37
  • If your question is indeed security-related, please (1) first read a lot of articles to get an overall sense of it, and (2) ask at security.stackexchange. Commented Apr 6, 2015 at 22:38
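
As the comments suggest, the size probably cannot be obtained without running the DEFLATE decoder in some form, but the output does not have to be stored anywhere. The following is a minimal sketch, not a definitive method: it assumes the compressed bytes of one entry are already in memory as raw DEFLATE data (no zlib header), feeds them to Python's `zlib` in small pieces, and only counts the inflated bytes and updates a CRC-32 before discarding them. The function name `inflated_size_and_crc` and its parameters `chunk_size` and `max_size` are hypothetical names chosen for illustration.

```python
import zlib


def inflated_size_and_crc(raw, chunk_size=4096, max_size=1 << 30):
    """Return (uncompressed size, CRC-32) of a raw DEFLATE stream.

    The inflated bytes are counted and checksummed, then dropped,
    so the full output is never held in memory at once.
    """
    # Negative wbits selects raw DEFLATE (no zlib header or trailer),
    # which is how compressed entries are stored inside zip archives.
    decoder = zlib.decompressobj(-zlib.MAX_WBITS)
    total = 0
    crc = 0

    def account(piece):
        nonlocal total, crc
        total += len(piece)
        crc = zlib.crc32(piece, crc)
        # Hypothetical safety cap against decompression bombs.
        if total > max_size:
            raise ValueError("inflated data exceeds max_size")

    # Feeding small input chunks keeps each output piece bounded.
    for start in range(0, len(raw), chunk_size):
        if decoder.eof:
            break
        account(decoder.decompress(raw[start:start + chunk_size]))

    # Collect whatever output is still pending inside the decoder.
    account(decoder.flush())

    if not decoder.eof:
        raise ValueError("DEFLATE stream is truncated")
    return total, crc
```

The two returned values can then be compared against the uncompressed size and CRC-32 fields claimed in the zip record; a mismatch in either would indicate a corrupted or tampered entry.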
