C++ should I loop and set an array or set all by hand for performance

Question

As a Java developer I've always been told to avoid for loops, but being new to C++ I'm wondering whether, with less overhead, using a for loop won't matter performance-wise.

Example:

I'm making a matrix (4*4) class and I want to set all elements to 0. Should I use:

for(int i = 0; i < 16; i++)
    elements[i] = 0;

or just set them all by hand

elements[0] = 0; elements[1] = 0; elements[2] = 0; etc.

With only 16 elements it isn't much for me to write out. I just learned as a Java developer never to use for loops if you don't have to. But C++ is native, so perhaps it will be faster?

If you want to make sure, inline your code. Though usually such an optimisation is done by the compiler. This one is called loop unrolling.
– BeyelerStudios, Commented Aug 17, 2015 at 16:12
"I've just learned as a java developer that never use for loops if you don't have to." Who taught you to avoid for loops? Loops are an important part of programming, as they can cut down on code.
– NathanOliver, Commented Aug 17, 2015 at 16:14
OK, that is some really bad advice you've received there. You should use a loop whenever looping is what you intend to do. Code for clarity first, optimize later. You should not avoid using loops and write horrible repeated code before you've even profiled to see if you have a bottleneck due to the loop overhead (increment and comparison). Your specific example looks like an initialization. Are you doing this every frame or just once? If it's just once you shouldn't even bother about optimizing it.
– odyss-jii, Commented Aug 17, 2015 at 16:15

Cheers and hth. - Alf · Accepted Answer · 2015-08-17 16:27:56Z

A raw array is not assignable (as a whole), but a struct with a raw array in it is assignable. And std::array is such a struct. So you can just assign a value-initialized, all-zero instance, like this:

#include <array>

class Mat4
{
private:
    std::array<double, 16>  data_;
public:
    void clear() { data_ = {}; }        // ← how to clear.

    auto item( int const row, int const col ) const
        -> double
    { return data_[4*row + col]; }

    auto item( int const row, int const col )
        -> double&
    { return data_[4*row + col]; }

    Mat4() : data_() {}
};

#include <iostream>
using namespace std;

void display( Mat4 const& m )
{
    for( int row = 0; row < 4; ++row )
    {
        for( int col = 0; col < 4; ++ col )
        {
            cout << m.item( row, col ) << ' ';
        }
        cout << endl;
    }
}

auto main() -> int
{
    Mat4 m;
    for( int i = 0; i < 4; ++i )
        m.item( i, i ) = 1;
    display( m );
    m.clear();
    cout << endl;
    display( m );
}

There is no need to do loops, or std::fill, or (gulp) memset, or extraordinarily verbose assignments to each array item.

Just assign a zeroed array, like above, and let the compiler deal with how to do that efficiently.
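
The point about raw arrays versus structs can be sketched on its own. This is a minimal illustration, not part of the answer's class: `RawMat4`, `clear_raw` and `clear_std` are hypothetical names made up for the example.

```cpp
#include <array>

// Hypothetical wrapper (an assumption for illustration): a raw array
// as the only member of a struct.
struct RawMat4
{
    double data[16];
};

// Zero the raw array by assigning a value-initialized temporary.
inline void clear_raw( RawMat4& m )
{
    m = RawMat4{};          // OK: a struct is assignable even with an array member
}

// The same idea with std::array, as in Mat4::clear() above.
inline void clear_std( std::array<double, 16>& a )
{
    a = {};                 // assigns a value-initialized (all-zero) array
}

// double raw[16];
// raw = {};                // would not compile: a raw array is not assignable
```

Either function leaves all 16 elements at 0.0; how the stores are actually emitted is left to the compiler.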

Amit · Accepted Answer · 2015-08-17 16:15:28Z

You can't rely on the optimizer to unroll a loop for you.

Having said that, this is one of the most bizarre situations I've seen regarding performance optimization motivation.

If you unroll by hand, you'll be "paying" in code size, which will result in more memory usage and potentially more cache misses in your CPU.

And as always, don't fix what isn't broken! Only optimize after you find you have a problem!
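
To find out whether there is a problem at all, a rough timing helper can be enough before reaching for a real profiler. The sketch below is an assumption for illustration, not something from this answer: `elapsed_ns` is a hypothetical name, and it only uses `std::chrono::steady_clock` from the standard library.

```cpp
#include <chrono>

// Hypothetical helper (not from the answers): run `work` `repeats` times
// and return the total elapsed time in nanoseconds.
template< class Work >
long long elapsed_ns( Work work, int repeats )
{
    using clock = std::chrono::steady_clock;    // monotonic clock
    auto const start = clock::now();
    for( int i = 0; i < repeats; ++i )
        work();
    auto const stop = clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>( stop - start ).count();
}
```

It could be called once with a lambda that zeroes the array in a loop and once with a lambda that assigns the elements by hand. Treat the numbers as rough: an optimizer is free to remove work whose result is never used, so a tiny test like this can easily measure nothing.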

Fantastic Mr Fox · Accepted Answer · 2015-08-17 16:14:10Z

Using a static array like that, in all likelihood both methods are the same. Take a look at optimizations like loop unrolling.

https://gcc.gnu.org/onlinedocs/gcc-3.3.5/gcc/Optimize-Options.html

2 Comments

user2132977 Over a year ago

Well, I guessed the compiler would unroll if that would be better, but I mean theoretically. So are you saying that yes, unrolled is better?

ex-bart Over a year ago

Unrolled is not always better. If you unroll a loop your code size increases. So compilers will sometimes even "re-roll" loops that you have explicitly unrolled, if they notice that they can. In short: trust your compiler to do reasonable optimizations, until your profiler says otherwise.

Lawrence Aiello · Accepted Answer · 2015-08-17 16:14:59Z

Compilers will usually unroll the loop anyway (but this depends of course), so you should loop it. This will also make your code more readable and thus maintainable.
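
As a sketch of what "loop it" can look like for the matrix in the question: the element type `float` and the names `zero_with_loop` and `zero_with_fill` are assumptions, since the question does not give them.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Zero all 16 elements with an ordinary loop, as in the question.
inline void zero_with_loop( float (&elements)[16] )
{
    for( std::size_t i = 0; i < 16; ++i )
        elements[i] = 0.0f;
}

// Same effect with std::fill, which states the intent in one line.
inline void zero_with_fill( float (&elements)[16] )
{
    std::fill( std::begin( elements ), std::end( elements ), 0.0f );
}
```

Both leave the array in the same state. Whether the generated code is unrolled is up to the compiler and its options.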

Chris Beck · Accepted Answer · 2015-08-17 16:31:19Z

You can't really make an assumption like "unrolling the loop is always better". The unrolled code will be very slightly faster to actually run through, but then your actual list of chip instructions will be longer. Loading the program will be slower then, among other things. Think of it this way: if all of your functions take 16 times as much space in the binary, then you will get more cache misses just because of that during normal execution of the program. If you start compiling for a device with very little RAM, or, say, you start compiling the code to JavaScript via Emscripten and your client is trying to JIT-compile it in their web browser, in both cases you are surely better off if that loop is not unrolled, and if functions are kept compact rather than inlined basically whenever possible.

If you really want to effectively "force" the compiler to unroll the loop, without making assumptions about what compiler options are passed, you could change it so that the loop body is a function template and the loop variable is a template parameter. But there's no reason that you should want that; you should let the compiler decide whether to inline the code or not. If you do this in production code I expect that your colleagues would hit you :p

template<int i> void loop_body(int * array);
template<int i> void loop_body(int * array) { array[i-1] = 0; loop_body<i-1>(array); }  // zero element i-1, then recurse
template<> void loop_body<0>(int * array) {}  // base case: ends the recursion

// use it like
loop_body<16>(elements);
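
For reference, here is one compilable arrangement of the same idea. It is only a sketch: `zero16` is a hypothetical wrapper added for the example, and nothing here guarantees that the compiler inlines the 16 instantiated calls.

```cpp
// Primary template: zero element i-1, then recurse on the remaining i-1 elements.
template< int i >
void loop_body( int* array )
{
    array[i - 1] = 0;
    loop_body<i - 1>( array );
}

// Explicit specialization for 0 ends the recursion. It has to be declared
// before the first call that would otherwise instantiate loop_body<0>.
template<>
inline void loop_body<0>( int* ) {}

// Hypothetical wrapper (an assumption for illustration): zero a 16-element array.
inline void zero16( int (&elements)[16] )
{
    loop_body<16>( elements );
}
```

Calling `zero16(elements)` instantiates `loop_body<16>` down to `loop_body<1>`, each of which writes one element.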
