I know they say premature optimization is the root of all evil... but it's about that time.
I have a slow, but working procedure that performs the following operations:
- Read chunk (sequential) from file.input
- Transform chunk
- Write (append) transformed chunk to file.output
file.input and file.output end up being in the same size ballpark (10-100+ GB). A chunk is typically about 10K. The transform step is just a conversion between proprietary formats. For discussion's sake, we can consider it to be computationally on par with a real-time compression algorithm.
These steps are currently done in a single thread.
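To make the structure concrete, the loop looks roughly like this (simplified sketch; file names and `Transform` are placeholders for the real code, and the actual conversion happens through P/Invoke):

```csharp
using System.IO;

class Converter
{
    const int ChunkSize = 10 * 1024; // chunks are ~10K

    static void Main()
    {
        // Single-threaded: read a chunk, transform it, append it, repeat.
        using var input = File.OpenRead("file.input");
        using var output = new FileStream("file.output", FileMode.Append, FileAccess.Write);

        var chunk = new byte[ChunkSize];
        int bytesRead;
        while ((bytesRead = input.Read(chunk, 0, chunk.Length)) > 0)
        {
            byte[] converted = Transform(chunk, bytesRead);
            output.Write(converted, 0, converted.Length);
        }
    }

    // Stand-in for the proprietary-format conversion done via P/Invoke;
    // computationally it's roughly on par with real-time compression.
    static byte[] Transform(byte[] data, int length) => data[..length];
}
```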
My question: how do I make this perform better?
I realize this will never be "fast" given the sheer volume of data being processed, but I have to believe there are some relatively simple, standard techniques to make it faster.

I've tried adding buffering to the reading step (1): reading in much larger blocks than the chunk size and then serving chunks from that buffer. This helped. However, I'm a bit stuck on whether there's anything that can be done for the transform step (2) and the appending step (3).
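Roughly, the read-side buffering I added looks like this (simplified; whether it's a manual buffer or a `BufferedStream`, the idea is the same, and the block size here is just illustrative):

```csharp
using System.IO;

const int BlockSize = 4 * 1024 * 1024; // read from disk in 4 MB blocks (illustrative)
const int ChunkSize = 10 * 1024;       // the loop still consumes ~10K chunks

// Large reads against the file, sequential-scan hint for the OS cache.
using var raw = new FileStream("file.input", FileMode.Open, FileAccess.Read,
                               FileShare.Read, BlockSize, FileOptions.SequentialScan);
using var input = new BufferedStream(raw, BlockSize);

var chunk = new byte[ChunkSize];
int bytesRead;
while ((bytesRead = input.Read(chunk, 0, chunk.Length)) > 0)
{
    // transform + append, same as before
}
```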
According to Resource Monitor, my CPU usage fluctuates between 30% and 45%, and Disk I/O has some sustained periods of low usage.
I'm using C# with a bunch of P/Invoke interop to native libraries.