Just FWIW, gzip can be accessed randomly, if an index file has previously been created...
I've developed a command-line tool that can quickly and (almost) randomly access a gzip file when an index is provided (if it is not provided, it is created automatically):
https://github.com/circulosmeos/gztool
gztool can be used to access chunks of the original gzip file, provided those chunks are retrieved starting at the byte points the index records (minus 1 byte to be safe, because gzip is a stream of bits, not bytes), or anywhere after them.
For example, if an index point starts at compressed byte 1508611 of the gzip file (gztool -ll index.gzi shows this data) and we want 1 MB of compressed data after that:
$ curl -r 1508610-2508611 https://example.com/db/backups/db.sql.gz > chunk.gz
- Note that chunk.gz occupies only the chunk size on disk!
- Also note that it is not a valid gzip file, as it is incomplete.
- Also take into account that we retrieve from the desired index point position minus 1 byte.
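The byte-range arithmetic above can be sketched in shell, using the numbers from the example (INDEX_POINT and CHUNK_SIZE are illustrative variable names for this sketch, not gztool parameters):

```shell
# Compressed byte where the chosen index point starts (from gztool -ll):
INDEX_POINT=1508611
# Roughly how many compressed bytes we want after it:
CHUNK_SIZE=1000000

# Start one byte earlier, since gzip is a bit stream and the point
# may not fall exactly on a byte boundary:
RANGE_START=$((INDEX_POINT - 1))
RANGE_END=$((RANGE_START + CHUNK_SIZE + 1))

# This reproduces the range used in the curl command above:
echo "curl -r ${RANGE_START}-${RANGE_END} https://example.com/db/backups/db.sql.gz"
```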
Now the complete index must also be retrieved (it only needs to be created once beforehand: for example with gztool -i *.gz to create indexes for all your already-gzipped files, or gztool -c * to both compress and create the index). Note that indexes are ~0.3% of the gzip size (or much smaller if gztool compresses the data itself).
$ curl https://example.com/db/backups/db.sql.gzi -o chunk.gzi
And now the extraction can be done with gztool. The uncompressed byte corresponding to compressed byte 1508610 (or a byte past it) must be known; the index can show this info with gztool -ll. See examples here. Let's suppose it is byte 9009009. Or the uncompressed byte we want is just past the first index point contained in chunk.gz. Let's suppose again that this byte is also 9009009 in this case.
$ gztool -n 1508610 -b 9009009 chunk.gz > extracted_chunk.sql
gztool will stop extracting data when the chunk.gz file ends.
It may be tricky, but it works without changing the compression method or the already-compressed files. Indexes, however, do need to be created for them.
NOTES:
Another way to do the extraction without the -n parameter is to pad the gzip file with sparse zeroes. For this example, that is done with a dd command before the first curl that retrieves the chunk.gz file:
$ dd if=/dev/zero of=chunk.gz seek=1508609 bs=1 count=0
$ curl -r 1508610-2508611 https://example.com/db/backups/db.sql.gz >> chunk.gz
$ curl https://example.com/db/backups/db.sql.gzi -o chunk.gzi
This way, the first 1508609 bytes of the file are zeroes, but they don't occupy space on disk. Without seek in the dd command, the zeroes would all be written to disk, which would also be valid for gzip, but this way we don't waste unnecessary disk space. The gztool command then doesn't need the -n parameter. The zeroed data is not needed because, as the index exists, gztool uses it to jump to the index point just before uncompressed byte 9009009, so all previous data is simply ignored:
$ gztool -b 9009009 chunk.gz > extracted_chunk.sql
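As a quick, self-contained illustration of the sparse-file trick (no gztool or network needed; the temporary file and the sizes checked are throwaway values matching the example):

```shell
# Create an empty temporary file, then use dd with count=0 to set its
# size to 1508609 bytes without writing any data: the zeroes are sparse.
TMP=$(mktemp)
dd if=/dev/zero of="$TMP" seek=1508609 bs=1 count=0 2>/dev/null

# Apparent size in bytes vs. 512-byte blocks actually allocated on disk:
APPARENT=$(stat -c '%s' "$TMP")
BLOCKS=$(stat -c '%b' "$TMP")
echo "apparent=${APPARENT} bytes, allocated=${BLOCKS} blocks"

rm -f "$TMP"
```

The apparent size is 1508609 bytes, yet (on filesystems supporting sparse files) almost no blocks are allocated; appending the downloaded chunk with `>>` then places it at the same offset it had in the original gzip file.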