
I'm using np.savez_compressed() to compress and save a single large 4D NumPy array, but it uses only one CPU core. Is there an alternative that can use many cores? Preferably something simple, without having to code the array splitting and the compression of its pieces across multiple processes myself.

  • numpy.org/doc/stable/reference/generated/… says the .npz format is a ZIPed archive, which implies the compression algorithm is DEFLATE, aka LZ77, the same one gzip / zlib implement. It is possible to split the data into blocks you compress separately with only a small loss of compression ratio, and still get a valid stream that gunzip can decompress (that's what pigz does; zlib.net/pigz), so in theory it should be possible without breaking file-format compatibility (see the first sketch after these comments). Commented Oct 10, 2023 at 20:28
  • But if you're looking for speed, something based on zstd (en.wikipedia.org/wiki/Zstd) compresses about 10x faster than zlib (per core) with a similar compression ratio for most data, and the standard implementation was designed with threading in mind, chunking data so separate threads can work on separate chunks. On an 8-core machine, a good implementation using it might optimistically be 80x faster than the current np.savez_compressed, rather than just 8x (see the zstd sketch after these comments). Commented Oct 10, 2023 at 20:34
  • (I have no idea if anyone's already written anything faster, but yes it should be very possible to do better than using a single thread to generate a ZIP file, with similar compression ratio.) Commented Oct 10, 2023 at 20:47
  • @PeterCordes Hmm, I had already tried zstd in their previous question (here), and with random data of 98% 1-bytes and 2% 2-bytes (similar to what they described), zstd was disappointing. At the default level (3) its compression was ~20% worse, and at a level high enough (13) to match the compression, it was about equally fast with two threads. Commented Oct 11, 2023 at 5:24
  • @PeterCordes Yes, I've used zstd in the past for real-world data and it was very good. If their data is still roughly 98% 1-bytes and 2% 2-bytes, maybe a simple custom preprocessing step in NumPy (like representing each streak of 1-bytes as a single byte giving its length) would be both very fast and compress very well (see the run-length sketch after these comments). I'd need some code from them to generate realistic data, though. Commented Oct 11, 2023 at 6:01
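
A minimal sketch of the block-wise DEFLATE idea from the first comment. It is not pigz itself: each chunk is written as its own complete gzip member, and a concatenation of gzip members is still something gunzip and Python's gzip module can decompress. The chunk size, file name and helper names are placeholders, and note that this stores only the raw bytes, not the array's shape or dtype:

```python
import gzip
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def _gzip_member(chunk):
    # Each chunk becomes a complete gzip member; concatenated members
    # are still a valid stream for gunzip / gzip.decompress.
    return gzip.compress(chunk, compresslevel=6)

def save_gzip_parallel(path, arr, chunk_bytes=16 * 2**20, workers=None):
    # Sketch only: tobytes() copies the whole array into memory first.
    data = np.ascontiguousarray(arr).tobytes()
    chunks = [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]
    with ProcessPoolExecutor(max_workers=workers) as pool, open(path, "wb") as f:
        for member in pool.map(_gzip_member, chunks):
            f.write(member)

if __name__ == "__main__":  # guard needed for ProcessPoolExecutor's spawn start method
    arr = np.ones((64, 64, 64, 64), dtype=np.uint8)
    save_gzip_parallel("array.raw.gz", arr)
```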
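A sketch of the zstd route from the second comment, assuming the third-party `zstandard` package (pip install zstandard). `threads=-1` asks the library for one worker per logical CPU, and going through `np.lib.format.write_array` / `read_array` keeps the array's shape and dtype inside the compressed stream; the function names and compression level are placeholders:

```python
import numpy as np
import zstandard  # pip install zstandard

def save_zstd(path, arr, level=3, threads=-1):
    # Multi-threaded zstd compression of a standard .npy byte stream.
    cctx = zstandard.ZstdCompressor(level=level, threads=threads)
    with open(path, "wb") as f, cctx.stream_writer(f) as compressor:
        np.lib.format.write_array(compressor, arr)

def load_zstd(path):
    dctx = zstandard.ZstdDecompressor()
    with open(path, "rb") as f, dctx.stream_reader(f) as decompressor:
        return np.lib.format.read_array(decompressor)
```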
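Finally, a sketch of the run-length preprocessing idea from the last comment, in plain NumPy. The test data is only a guess at the distribution described earlier (98% 1-bytes, 2% 2-bytes); for real use the (values, lengths) pair would still be compressed and the original shape stored alongside:

```python
import numpy as np

def rle_encode(a):
    # Vectorized run-length encoding: long streaks of identical bytes
    # collapse into one value plus one run length.
    a = np.ascontiguousarray(a).ravel()
    starts = np.concatenate(([0], np.flatnonzero(a[1:] != a[:-1]) + 1))
    lengths = np.diff(np.concatenate((starts, [a.size]))).astype(np.uint32)
    return a[starts], lengths

def rle_decode(values, lengths):
    return np.repeat(values, lengths)

# Hypothetical test data matching the distribution described in the comments.
rng = np.random.default_rng(0)
data = rng.choice(np.array([1, 2], dtype=np.uint8), size=10_000_000, p=[0.98, 0.02])
values, lengths = rle_encode(data)
assert np.array_equal(rle_decode(values, lengths), data)
```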
