Efficiently streaming large documents as they are constructed is a common problem. SXSSFWorkbook provides a good framework using a sliding window model. As a practical matter, finding the optimal approach will require profiling in a particular context.
For reference, I have updated the cited example below to allow adjusting the window size. A variation of this example, run from the command line, can be used to assess the output. The result appears to work seamlessly on a network-mounted volume, although this may depend on the details of this vendor's offering. The experimental DeferredSXSSFWorkbook may be relevant going forward; @PJ Fanning elaborates here.
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.util.CellReference;
    import org.apache.poi.xssf.streaming.SXSSFWorkbook;
    import org.junit.Assert;

    /**
     * @see https://stackoverflow.com/q/78456843/230513
     */
    public class StreamTest {

        private static final int N = 1000;
        private static final int W = 10;

        public static void main(String[] args) throws IOException {
            // keep W rows in memory, N - W rows will be flushed to disk
            var wb = new SXSSFWorkbook(W);
            var sh = wb.createSheet();
            for (int rownum = 0; rownum < N; rownum++) {
                Row row = sh.createRow(rownum);
                for (int cellnum = 0; cellnum < 10; cellnum++) {
                    Cell cell = row.createCell(cellnum);
                    String address = new CellReference(cell).formatAsString();
                    cell.setCellValue(address);
                }
            }
            // Verify that W rows before N - W are flushed and inaccessible
            for (int rownum = N - (2 * W); rownum < N - W; rownum++) {
                Assert.assertNull(sh.getRow(rownum));
            }
            // The last W rows are still in memory
            for (int rownum = N - W; rownum < N; rownum++) {
                Assert.assertNotNull(sh.getRow(rownum));
            }
            var file = new File("sxssf.xlsx");
            try (FileOutputStream out = new FileOutputStream(file)) {
                wb.write(out);
            } finally {
                // dispose of temporary files backing this workbook on disk
                wb.dispose();
            }
        }
    }
SXSSFWorkbook manages an XSSFWorkbook, so I'd expect the resulting xlsx file to be a compressed zip file containing XML text.
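One way to check that expectation is to open the output as an ordinary zip archive and list its entries. The following is a minimal sketch using only `java.util.zip` from the JDK; the class name `ZipEntryLister` and the default file name `sxssf.xlsx` are my own choices for illustration, not part of POI.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of Apache POI.
class ZipEntryLister {

    /** Returns the entry names of the given zip archive, in the order the archive lists them. */
    static List<String> listEntries(Path archive) throws IOException {
        List<String> names = new ArrayList<>();
        // ZipFile throws a ZipException if the file is not a valid zip archive
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            var entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the file written by the example above unless a path is given
        Path archive = Path.of(args.length > 0 ? args[0] : "sxssf.xlsx");
        listEntries(archive).forEach(System.out::println);
    }
}
```

Run against a workbook written by the example, I would expect the listing to include XML parts such as `[Content_Types].xml` and `xl/workbook.xml`, with the sheet data typically under `xl/worksheets/`. If the file were not a zip archive, opening it would fail with an exception.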