I have a seemingly harmless while loop that goes through the result set of a MySQL query and compares each id returned from MySQL to the ids in a very large multidimensional array:

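// mysqli query here; $result is assumed to hold the returned mysqli_result
while ($row = $result->fetch_assoc())
{
    // in_array() searches the whole 'dimensionOne' list for this id
    if (!in_array($row['id'], $multiDArray['dimensionOne']))
    {
        // do something
    }
}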

When the script first executes, it runs through the results at about 2-5k rows per second. Sometimes more, rarely less. The result set brings back 7 million rows, and the script peaks at 2.8GB of memory.

In terms of big data, this is not a lot.

The problem is that around the 600k mark the loop starts to slow down, and by 800k it is processing only a few records a second.

In terms of server load and memory use, there are no issues.

This is behaviour I have noticed before in other scripts dealing with large data sets.

Is array seek time progressively slower as the internal pointer moves deeper?
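
For context on the lookup itself: in_array() performs a linear search, so its cost depends on where in the list the value sits (or whether it is present at all), not on an internal pointer. The sketch below is only an illustration under assumptions: $ids is a made-up flat list standing in for $multiDArray['dimensionOne'], and timeInArray() is a hypothetical helper. It times a lookup near the front against one near the end, then shows a key-based check via array_flip() and isset(), which may be worth comparing if the values are unique integers or strings.

```php
<?php
// Hypothetical stand-in for $multiDArray['dimensionOne']: a flat list of ids.
$ids = range(1, 200000);

// Time repeated in_array() calls for one needle.
// in_array() scans from the start of the list, so a value near the end
// (or a missing value) costs more than a value near the front.
function timeInArray(array $haystack, int $needle, int $repeats = 50): float
{
    $start = microtime(true);
    for ($i = 0; $i < $repeats; $i++) {
        in_array($needle, $haystack);
    }
    return microtime(true) - $start;
}

$early = timeInArray($ids, 10);      // found after a few comparisons
$late  = timeInArray($ids, 199990);  // found only near the end of the scan

// array_flip() turns the values into keys, so membership becomes a
// hash lookup whose cost does not depend on position.
$lookup  = array_flip($ids);
$found   = isset($lookup[199990]);
$missing = isset($lookup[200001]);

printf("in_array early: %.6fs, late: %.6fs\n", $early, $late);
```

If the ids in the result set arrive in roughly the same order as the list, each successive in_array() call would scan further before matching, which could look like a gradual slowdown. Whether that applies here depends on the actual data.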

  • Does this occur with an empty loop (i.e., are you allocating any memory within the loop where you do something)? Commented Mar 5, 2013 at 16:59
  • It gets slower because you are using so much memory. You need to clean up as much as possible at the end of each iteration. You may even need to pull results back in batches. Commented Mar 5, 2013 at 17:00
  • @datasage I would have suspected memory if usage peaked somewhere in the middle of the loop, but the memory peak happens well before the while loop executes. No new (non-trivial) memory is allocated during the loop. Commented Mar 5, 2013 at 17:05
  • @will No non-trivial memory allocation is made during the loop. Commented Mar 5, 2013 at 17:05
  • Are you sure there is no problem with server memory use when this function is running? The symptoms very much point to swapping (at the 600K mark) and then possibly maxing out your swap partition at the 800K mark. Commented Mar 5, 2013 at 17:28

1 Answer

That really depends on what happens inside the loop. I know you are convinced it's not a memory issue, but it looks like one. Programs usually get very slow when the system tries to get extra RAM by using swap. Using the hard drive is obviously very slow, and that's what you might be experiencing. It's very easy to benchmark.

In one terminal, run:

vmstat 3 100

Run your script and observe the vmstat output. Look at the IO and swap columns. If swapping really is not the cause, then profile the execution with Xdebug. That might be tricky, because with this many iterations the profiler itself will generate a lot of IO.
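
Alongside vmstat, it may help to have the script itself report throughput and memory at intervals, so the point where it slows down can be lined up with the swap figures. This is a minimal sketch, not a definitive implementation: makeProgressLogger() is a hypothetical helper, and the 1000-row interval is an arbitrary choice. It relies only on microtime(), memory_get_usage() and memory_get_peak_usage().

```php
<?php
// Hypothetical helper: returns a closure to call once per processed row.
// Every $every rows it passes a status line to $sink with the row count,
// rows per second since the last report, and current/peak memory.
function makeProgressLogger(int $every, callable $sink): callable
{
    $count = 0;
    $last = microtime(true);

    return function () use (&$count, &$last, $every, $sink): void {
        $count++;
        if ($count % $every !== 0) {
            return;
        }
        $now = microtime(true);
        $elapsed = max($now - $last, 1e-9); // guard against division by zero
        $sink(sprintf(
            "%d rows, %.0f rows/s, %.1f MB in use, %.1f MB peak",
            $count,
            $every / $elapsed,
            memory_get_usage(true) / 1048576,
            memory_get_peak_usage(true) / 1048576
        ));
        $last = $now;
    };
}

// Stand-in loop; in the real script, $tick() would be called once per
// fetched row inside the while loop.
$lines = [];
$tick = makeProgressLogger(1000, function (string $line) use (&$lines): void {
    $lines[] = $line;
});
for ($i = 0; $i < 3500; $i++) {
    $tick();
}
echo implode("\n", $lines), "\n";
```

If rows per second drops while the reported memory stays flat and vmstat shows swap activity, that points toward the system rather than the loop body. If nothing is swapping, the loop body is the more likely place to look.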

1 Comment

I will experiment and let you know.
