I basically have two vectors: one holding a large number of elements, and a second holding a small number of probes used to sample the elements' data. I stumbled upon the question of which order to nest the two loops. Naturally I thought having the outer loop over the larger vector would be beneficial:
Implementation 1:
for (auto& elem : elements) {
    for (auto& probe : probes) {
        probe.insertParticleData(elem);
    }
}
However, it seems that the second implementation takes only half the time.
Implementation 2:
for (auto& probe : probes) {
    for (auto& elem : elements) {
        probe.insertParticleData(elem);
    }
}
What could be the reason for that?
Edit:
Timings were generated by the following code:
clock_t t_begin_ps = std::clock();
... // timed code
clock_t t_end_ps = std::clock();
double elapsed_secs_ps = double(t_end_ps - t_begin_ps) / CLOCKS_PER_SEC;
On inserting an element's data I basically do two things: test whether the distance to the probe is below a limit, and then compute an average:
bool probe::insertParticleData(const elem& pP) {
    if (!isInside(pP.position())) { return false; }
    ... // compute alpha and beta
    avg_vel = alpha*avg_vel + beta*pP.getVel();
    return true;
}
To give an idea of the memory usage: I have approx. 10k elements, which are objects with 30 double data members each. For the test I used 10 probes containing 15 doubles each.