A legacy PHP system reads a huge log file (~5 GB) directly into a variable in memory and does some processing on it.
EDIT: Regarding suggestions that reading 5 GB into memory is highly discouraged: please trust that this has to stay the same due to legacy design constraints we cannot change.
Now I need to pass the data to another service that accepts at most 1000 lines per call.
I have tried the following two approaches and both work.
1- Split the whole string at the newline character into an array, use array_chunk to split that array into sub-arrays of 1000 lines, then implode each sub-array back into a string:
$logFileStr; // a variable that already contains the 5 GB file as a string

$logLines   = explode(PHP_EOL, $logFileStr); // one element per line
$lineGroups = array_chunk($logLines, 1000);  // sub-arrays of 1000 lines

foreach ($lineGroups as $lineGroup) {
    $linesChunk = implode(PHP_EOL, $lineGroup);
    $archiveService->store($linesChunk);
}
Pros: it is fast, since everything happens in memory.
Cons: a lot of redundant splitting and re-joining work, and it needs a lot of extra memory.
2- First write the contents of the string variable to a local temp file, then use exec() to split that file:
split -l 1000 localfile
That produces a large number of files of 1000 lines each. Then I can simply loop over those files and process each one's contents as a single string.
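For reference, this is roughly what approach 2 looks like in PHP. It is only a sketch: it assumes the external split utility is available on the host and that sys_get_temp_dir() is writable, and $tmpDir, $srcFile and the chunk_ prefix are illustrative names, not part of the existing code.

// Sketch of approach 2: dump the string to disk, shell out to split,
// then read the generated chunk files back one by one.
$tmpDir = sys_get_temp_dir() . '/log_split_' . uniqid();
mkdir($tmpDir);

$srcFile = $tmpDir . '/localfile';
file_put_contents($srcFile, $logFileStr);      // write the 5 GB string to disk

exec(sprintf(
    'split -l 1000 %s %s',
    escapeshellarg($srcFile),
    escapeshellarg($tmpDir . '/chunk_')        // prefix for the output files
));

foreach (glob($tmpDir . '/chunk_*') as $chunkFile) {
    $archiveService->store(file_get_contents($chunkFile));
    unlink($chunkFile);
}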
Pros: it is simpler and easier to maintain.
Cons: disk I/O gets involved, which is slow, and there is a lot of write/read overhead.
My question is: since I already have the whole string in a variable in memory, how can I read chunks of 1000 lines each from that variable in an iterable way, so that I avoid both the write-to-disk overhead and the overhead of building a new array and re-merging it?
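For what it is worth, something like the generator sketch below is roughly what I am imagining (lineChunks() is a made-up helper, not existing code), but I am not sure whether this is correct or whether there is a better or more idiomatic way to do it:

// Hypothetical sketch: a generator that walks the existing string with
// strpos() and yields chunks of up to 1000 lines, without building a
// second full array of lines.
function lineChunks(string $data, int $linesPerChunk = 1000): Generator
{
    $offset = 0;
    $length = strlen($data);

    while ($offset < $length) {
        $end = $offset;

        // Advance $end past up to $linesPerChunk newlines.
        for ($i = 0; $i < $linesPerChunk; $i++) {
            $pos = strpos($data, PHP_EOL, $end);
            if ($pos === false) {
                $end = $length;                // last (partial) chunk
                break;
            }
            $end = $pos + strlen(PHP_EOL);
        }

        // Note: unlike the implode() version, each chunk keeps its trailing newline.
        yield substr($data, $offset, $end - $offset);
        $offset = $end;
    }
}

foreach (lineChunks($logFileStr) as $linesChunk) {
    $archiveService->store($linesChunk);
}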