0

This sounds like too simple a question not to have been answered somewhere already, but I looked around and couldn't find a simple answer. Take the following example:

class vec
{
public:   // members must be public for sum_x and sum to access them
   double x;
   double y;
};

inline void sum_x(vec & result, vec & a, vec & b)
{
   result.x = a.x + b.x;
}

inline void sum(vec & result, vec & a, vec & b)
{
   sum_x(result, a, b);
   result.y = a.y + b.y;
}

What happens when I compile a call to sum? Will both sum and sum_x be inlined, so that the call translates to just the instructions that add the two components?

This looks like a trivial example, but I am working with a vector class that has the dimensionality defined in a template, so iterating over operations on vectors looks a bit like this.

  • Try it and find out? And please stop writing tags in titles; as it happens, I removed them from a few of your older questions just yesterday. Commented May 19, 2015 at 10:43
  • This has nothing to do with your question.. but i would declare a and b as const Commented May 19, 2015 at 10:49
  • @LightnessRacesinOrbit I am very bad at interpreting assembly! I should make up my mind and study it once and for all. Sorry about the tags, thought it would make my titles more easily understandable. Won't happen any more! Commented May 19, 2015 at 10:56
  • @Quest yeah, definitely, they are const in my actual code ;) Commented May 19, 2015 at 10:56
  • @MatteoMonti: Me too! Commented May 19, 2015 at 11:07

4 Answers

2

inline is just a hint to the compiler. Whether the compiler actually inlines the function is a different question. GCC has an always_inline attribute to force this:

 __attribute__((always_inline));

With always_inline you should get what you described (code generated as if it were written in one function).

However, with all the optimizations and transformations that compilers apply, you can only be sure by checking the generated code (assembly).
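As a minimal sketch of where the attribute goes, assuming GCC or Clang (both accept this syntax): it is attached to the function declaration, next to inline. The vec type and the two functions are taken from the question, with a and b made const; add is a hypothetical helper added only so the functions are easy to call.

```cpp
// Assumption: GCC or Clang; other compilers use a different spelling.
struct vec
{
   double x;
   double y;
};

// always_inline asks for this function to be inlined regardless of the
// optimization level; GCC reports an error if it cannot do so.
__attribute__((always_inline)) inline void sum_x(vec & result, const vec & a, const vec & b)
{
   result.x = a.x + b.x;
}

__attribute__((always_inline)) inline void sum(vec & result, const vec & a, const vec & b)
{
   sum_x(result, a, b);   // nested call, also marked always_inline
   result.y = a.y + b.y;
}

// Hypothetical helper (not in the question) that wraps sum for convenience.
inline vec add(const vec & a, const vec & b)
{
   vec r{0.0, 0.0};
   sum(r, a, b);
   return r;
}
```

Even with the attribute in place, the generated assembly is still the only real confirmation of what the compiler did.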

1

Yes, inlining may be applied recursively.

The entire set of operations that you're performing here can be inlined at the call site.

Note that this has very little to do with your use of the inline keyword, which (other than its effect on the ODR — which can be very noticeable) is just a hint and nowadays mostly ignored for purposes of actually inlining. The functions will be inlined because your clever compiler can see that they are good candidates for it.

The only way you can actually tell whether it's doing this is to inspect the resulting assembly yourself.
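For the templated case mentioned in the question, a small sketch like the following can be compiled and inspected. The names vecn, sum_component and sum3 are hypothetical stand-ins, not the asker's actual class. The non-template function sum3 is there only so that the compiler has to emit a symbol whose assembly can be read.

```cpp
#include <cstddef>

// Hypothetical stand-in for a vector whose dimensionality is a template parameter.
template <std::size_t N>
struct vecn
{
   double v[N];
};

// Adds a single component; plays the role of sum_x in the question.
template <std::size_t N>
void sum_component(vecn<N> & result, const vecn<N> & a, const vecn<N> & b, std::size_t i)
{
   result.v[i] = a.v[i] + b.v[i];
}

// Loops over all components, calling the per-component function each time.
template <std::size_t N>
void sum(vecn<N> & result, const vecn<N> & a, const vecn<N> & b)
{
   for (std::size_t i = 0; i < N; ++i)
      sum_component(result, a, b, i);
}

// Non-template entry point to look for in the assembly output.
// One way to inspect it, assuming GCC: g++ -O2 -S -o - file.cpp
void sum3(vecn<3> & result, const vecn<3> & a, const vecn<3> & b)
{
   sum(result, a, b);
}
```

If the nested calls were inlined, the body of sum3 in the output should contain no call instructions, only loads, additions and stores. Whether that happens depends on the compiler and the optimization level.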

0

It depends. inline is just a hint to the compiler that it might want to think about inlining that function. It's entirely possible for a compiler to inline both calls, but that's up to the implementation.

As an example, here's some prettified assembly output from GCC for this simple program, with and without inlining:

#include <iostream>

// vec, sum_x and sum are defined as in the question
int main()
{
  vec a;
  vec b;
  std::cin >> a.x;
  std::cin >> a.y;

  sum(b, a, a);
  std::cout << b.x << b.y;
  return 0;
}

With inlining:

main:
    subq    $40, %rsp
    leaq    16(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    leaq    24(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    movsd   24(%rsp), %xmm0
    movapd  %xmm0, %xmm1
    addsd   %xmm0, %xmm1
    movsd   %xmm1, 8(%rsp)
    movsd   16(%rsp), %xmm0
    addsd   %xmm0, %xmm0
    movl    std::cout, %edi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movsd   8(%rsp), %xmm0
    movq    %rax, %rdi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movl    $0, %eax
    addq    $40, %rsp
    ret
    subq    $8, %rsp
    movl    std::__ioinit, %edi
    call    std::ios_base::Init::Init()
    movl    $__dso_handle, %edx
    movl    std::__ioinit, %esi
    movl    std::ios_base::Init::~Init(), %edi
    call    __cxa_atexit
    addq    $8, %rsp
    ret

Without:

sum_x(vec&, vec&, vec&):
    movsd   (%rsi), %xmm0
    addsd   (%rdx), %xmm0
    movsd   %xmm0, (%rdi)
    ret
sum(vec&, vec&, vec&):
    movsd   (%rsi), %xmm0
    addsd   (%rdx), %xmm0
    movsd   %xmm0, (%rdi)
    movsd   8(%rsi), %xmm0
    addsd   8(%rdx), %xmm0
    movsd   %xmm0, 8(%rdi)
    ret
main:
    pushq   %rbx
    subq    $48, %rsp
    leaq    32(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    leaq    40(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    leaq    32(%rsp), %rdx
    movq    %rdx, %rsi
    leaq    16(%rsp), %rdi
    call    sum(vec&, vec&, vec&)
    movq    24(%rsp), %rbx
    movsd   16(%rsp), %xmm0
    movl    std::cout, %edi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movq    %rbx, 8(%rsp)
    movsd   8(%rsp), %xmm0
    movq    %rax, %rdi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movl    $0, %eax
    addq    $48, %rsp
    popq    %rbx
    ret
    subq    $8, %rsp
    movl    std::__ioinit, %edi
    call    std::ios_base::Init::Init()
    movl    $__dso_handle, %edx
    movl    std::__ioinit, %esi
    movl    std::ios_base::Init::~Init(), %edi
    call    __cxa_atexit
    addq    $8, %rsp
    ret

As you can see, GCC inlined both functions when asked to.

If your assembly is a bit rusty, simply note that sum is present and called in the second version, but not in the first.

0

As mentioned, the inline keyword is just a hint. However, compilers do an amazing job here (even without your hints), and they do inline recursively.

If you're really interested in this stuff, I recommend learning a bit about compiler design. I've been studying it recently and it blew my mind what complex beasts our production-quality compilers are today.

Inlining is one of the things compilers tend to do extremely well. That came about by necessity: in C++ we often write accessor functions (methods) that do nothing more than return the value of a single variable. C++'s popularity hinged in large part on the idea that we can write this kind of code, using concepts like information hiding, without being forced into software that is slower than its C-like equivalent. So even in the 90s you often found optimizers doing a really good job at inlining, recursively too.
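To make the accessor case concrete, here is a small sketch. The particle class and total_mass function are made up for illustration. The accessor is defined inside the class body, which makes it implicitly inline in C++, and an optimizing compiler would typically reduce each call to a plain load of the member.

```cpp
// Hypothetical class used only to illustrate a trivial accessor.
class particle
{
public:
   explicit particle(double mass) : mass_(mass) {}

   // Does nothing more than return one member variable.
   double mass() const { return mass_; }

private:
   double mass_;   // hidden from callers, reachable only through mass()
};

// Calls the accessor twice; with optimizations enabled these calls are
// good candidates for inlining, leaving just two loads and an addition.
double total_mass(const particle & a, const particle & b)
{
   return a.mass() + b.mass();
}
```

The information hiding costs nothing at run time only if the calls really are inlined, which is again something to verify in the generated assembly.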

This next part is somewhat speculative, since I'm assuming that what I've been reading and studying about compiler design applies to the production-quality compilers we use today. Who knows exactly what kind of advanced tricks they're all applying?

... but I believe compilers typically inline code before they get down to the machine-code level. This is because one of the keys to an optimizer is efficient instruction selection and register allocation. To do that, it needs to know all the memory (variables) the code is going to work with inside a procedure, in a somewhat abstract form where specific registers haven't been chosen yet but are ready to be assigned. So inlining is usually done at this intermediate representation stage, before the compiler reaches the realm of specific machine instructions and registers, so that it can gather all that information before it does its magical optimizations. It might even apply some heuristics here to 'try' inlining or unrolling away branches of code before actually doing it.

A lot of linkers can even inline code (this is usually called link-time optimization), and I'm not sure exactly how that works. I think that when they can do that, the object code is still in an intermediate representation form, still somewhat abstracted away from specific machine-level instructions and registers. The linker can then move that code between object files and inline it, deferring code generation and optimization until afterwards.
