Inline function call from inline function?

Question

This sounds like too simple a question not to have been answered somewhere already, but I looked around and couldn't find a straightforward answer. Take the following example:

class vec
{
public:
   double x;
   double y;
};

inline void sum_x(vec & result, vec & a, vec & b)
{
   result.x = a.x + b.x;
}

inline void sum(vec & result, vec & a, vec & b)
{
   sum_x(result, a, b);
   result.y = a.y + b.y;
}

What happens when I call sum and compile? Will both sum and sum_x be inlined, so that the call reduces to just the instructions that add the two components?

This looks like a trivial example, but I am working with a vector class that has the dimensionality defined in a template, so iterating over operations on vectors looks a bit like this.

Try it and find out? And please stop writing tags in titles; as it happens, I removed them from a few of your older questions just yesterday.
– Lightness Races in Orbit, Commented May 19, 2015 at 10:43
This has nothing to do with your question, but I would declare a and b as const.
– Quest, Commented May 19, 2015 at 10:49
@LightnessRacesinOrbit I am very bad at interpreting assembly! I should make up my mind and study it once and for all. Sorry about the tags, I thought they would make my titles easier to understand. Won't happen any more!
– Matteo Monti, Commented May 19, 2015 at 10:56
@Quest yeah, definitely, they are const in my actual code ;)
– Matteo Monti, Commented May 19, 2015 at 10:56

laurisvr · Accepted Answer · 2015-05-19 11:12:12Z

2

inline is just a hint to the compiler. Whether the compiler actually inlines the function or not is a different question. For gcc there is an always inline attribute to force this.

 __attribute__((always_inline));

With always_inline you should achieve what you described (code generated as if it were written in one function).

However, with all the optimizations and transformations applied by compilers, you can only be sure if you check the generated code (assembly).
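
For where the attribute goes, here is a minimal sketch, assuming GCC or Clang (the attribute is a compiler extension, not standard C++). It reuses the vec example from the question; using a struct and const references for the inputs is my own adjustment, not part of the original code.

```cpp
// Sketch only: __attribute__((always_inline)) is a GCC/Clang extension.
struct vec
{
   double x;
   double y;
};

// The attribute is placed on the function declaration; 'inline' is kept
// as in the question.
__attribute__((always_inline)) inline void sum_x(vec & result, const vec & a, const vec & b)
{
   result.x = a.x + b.x;
}

__attribute__((always_inline)) inline void sum(vec & result, const vec & a, const vec & b)
{
   sum_x(result, a, b);   // nested call, also marked always_inline
   result.y = a.y + b.y;
}
```

Calling sum(r, a, b) behaves exactly as before; the attribute only affects how the compiler generates code, so the assembly is still the place to confirm it.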

edited May 19, 2015 at 11:12

laurisvr


answered May 19, 2015 at 10:46

ted


Lightness Races in Orbit · Accepted Answer · 2015-05-19 10:45:13Z

1

Yes, inlining may be applied recursively.

The entire set of operations that you're performing here can be inlined at the call site.

Note that this has very little to do with your use of the inline keyword, which (other than its effect on the ODR — which can be very noticeable) is just a hint and nowadays mostly ignored for purposes of actually inlining. The functions will be inlined because your clever compiler can see that they are good candidates for it.

The only way you can actually tell whether it's doing this is to inspect the resulting assembly yourself.
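
If you want to try that inspection, a small test file along these lines may help. This is a sketch under assumptions: the file name inline_check.cpp and the wrapper add_into are hypothetical, and the command in the comment assumes a GCC or Clang toolchain.

```cpp
// inline_check.cpp (hypothetical file name)
//
// Emit assembly instead of an object file, with optimizations on:
//   g++ -O2 -S inline_check.cpp -o inline_check.s
// Then look in inline_check.s for 'call' instructions inside add_into.

struct vec
{
   double x;
   double y;
};

// No 'inline' keyword: the optimizer is left to decide on its own.
static void sum_x(vec & result, const vec & a, const vec & b)
{
   result.x = a.x + b.x;
}

static void sum(vec & result, const vec & a, const vec & b)
{
   sum_x(result, a, b);
   result.y = a.y + b.y;
}

// Externally visible caller, so the compiler has to emit code for it.
void add_into(vec & out, const vec & a, const vec & b)
{
   sum(out, a, b);
}
```

If both helpers were inlined, add_into should contain only the loads, adds and stores, with no call to sum or sum_x. The result depends on the compiler and flags, so treat it as something to check rather than a guarantee.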

answered May 19, 2015 at 10:45

Lightness Races in Orbit


TartanLlama · Accepted Answer · 2015-05-19 11:16:06Z

It depends. inline is just a hint to the compiler that it might want to think about inlining that function. It's entirely possible for a compiler to inline both calls, but that's up to the implementation.

As an example, here's some prettified assembly output from GCC with and without those inlines of this simple program:

#include <iostream>

// vec, sum_x and sum are defined as in the question

int main()
{
  vec a;
  vec b;
  std::cin >> a.x;
  std::cin >> a.y;

  sum(b, a, a);
  std::cout << b.x << b.y;
  return 0;
}

With inlining:

main:
    subq    $40, %rsp
    leaq    16(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    leaq    24(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    movsd   24(%rsp), %xmm0
    movapd  %xmm0, %xmm1
    addsd   %xmm0, %xmm1
    movsd   %xmm1, 8(%rsp)
    movsd   16(%rsp), %xmm0
    addsd   %xmm0, %xmm0
    movl    std::cout, %edi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movsd   8(%rsp), %xmm0
    movq    %rax, %rdi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movl    $0, %eax
    addq    $40, %rsp
    ret
    subq    $8, %rsp
    movl    std::__ioinit, %edi
    call    std::ios_base::Init::Init()
    movl    $__dso_handle, %edx
    movl    std::__ioinit, %esi
    movl    std::ios_base::Init::~Init(), %edi
    call    __cxa_atexit
    addq    $8, %rsp
    ret

Without:

sum_x(vec&, vec&, vec&):
    movsd   (%rsi), %xmm0
    addsd   (%rdx), %xmm0
    movsd   %xmm0, (%rdi)
    ret
sum(vec&, vec&, vec&):
    movsd   (%rsi), %xmm0
    addsd   (%rdx), %xmm0
    movsd   %xmm0, (%rdi)
    movsd   8(%rsi), %xmm0
    addsd   8(%rdx), %xmm0
    movsd   %xmm0, 8(%rdi)
    ret
main:
    pushq   %rbx
    subq    $48, %rsp
    leaq    32(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    leaq    40(%rsp), %rsi
    movl    std::cin, %edi
    call    std::basic_istream<char, std::char_traits<char> >& std::basic_istream<char, std::char_traits<char> >::_M_extract<double>(double&)
    leaq    32(%rsp), %rdx
    movq    %rdx, %rsi
    leaq    16(%rsp), %rdi
    call    sum(vec&, vec&, vec&)
    movq    24(%rsp), %rbx
    movsd   16(%rsp), %xmm0
    movl    std::cout, %edi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movq    %rbx, 8(%rsp)
    movsd   8(%rsp), %xmm0
    movq    %rax, %rdi
    call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)
    movl    $0, %eax
    addq    $48, %rsp
    popq    %rbx
    ret
    subq    $8, %rsp
    movl    std::__ioinit, %edi
    call    std::ios_base::Init::Init()
    movl    $__dso_handle, %edx
    movl    std::__ioinit, %esi
    movl    std::ios_base::Init::~Init(), %edi
    call    __cxa_atexit
    addq    $8, %rsp
    ret

As you can see, GCC inlined both functions when asked to.

If your assembly is a bit rusty, simply note that sum is present and called in the second version, but not in the first.
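
If you want a version where a call is meant to stay visible, for comparison while reading the assembly, one option is the noinline attribute. This is a sketch assuming GCC or Clang; the names sum_x_outlined and sum_outlined are hypothetical and not part of the program above.

```cpp
struct vec
{
   double x;
   double y;
};

// noinline asks GCC/Clang not to inline this function into its callers.
__attribute__((noinline)) void sum_x_outlined(vec & result, const vec & a, const vec & b)
{
   result.x = a.x + b.x;
}

void sum_outlined(vec & result, const vec & a, const vec & b)
{
   sum_x_outlined(result, a, b);   // should show up as a 'call' in the assembly
   result.y = a.y + b.y;
}
```

The computed values are the same as with the inlined version; only the generated code should differ.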

score 0 · Accepted Answer · 2015-05-19 11:33:05Z

As mentioned, the inline keyword is just a hint. However, compilers do an amazing job here (even without your hints), and they do inline recursively.
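
The question mentions a vector class whose dimensionality is a template parameter. A hypothetical sketch of that shape is below; the names vec_n and sum_component are my own, not the asker's. The nesting is the same as in the two-component example: sum calls a small helper once per component, and both are candidates for inlining.

```cpp
#include <cstddef>

// Hypothetical N-dimensional vector; N is fixed at compile time.
template <std::size_t N>
struct vec_n
{
   double c[N];
};

// Adds a single component; plays the role of sum_x in the question.
template <std::size_t N>
inline void sum_component(vec_n<N> & result, const vec_n<N> & a, const vec_n<N> & b, std::size_t i)
{
   result.c[i] = a.c[i] + b.c[i];
}

// Adds all components by calling the helper in a loop.
template <std::size_t N>
inline void sum(vec_n<N> & result, const vec_n<N> & a, const vec_n<N> & b)
{
   for (std::size_t i = 0; i < N; ++i)
      sum_component(result, a, b, i);
}
```

Because N is a compile-time constant, an optimizing compiler may also unroll the loop, but as with inlining that is its decision and worth checking in the assembly.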

If you're really interested in this stuff, I recommend learning a bit about compiler design. I've been studying it recently and it blew my mind what complex beasts our production-quality compilers are today.

About inlining, this is one of the things that compilers tend to do an extremely good job at. It was so by necessity, since if you look at how we write code in C++, we often write accessor functions (methods) just to do nothing more than return the value of a single variable. C++'s popularity hinged in large part on the idea that we can write this kind of code utilizing concepts like information hiding without being forced into creating software that is slower than its C-like equivalent, so you often found optimizers as early as the 90s doing a really good job at inlining (and recursively).

This next part is somewhat speculative, since I'm assuming that what I've been reading and studying about compiler design applies to the production-quality compilers we use today. Who knows exactly what kind of advanced tricks they're all applying?

... but I believe compilers typically inline code before they reach the machine-code level. This is because one of the keys to an optimizer is efficient instruction selection and register allocation. To do that, it needs to know all the memory (variables) the code is going to work with inside a procedure, in a somewhat abstract form where specific registers haven't been chosen yet but are ready to be assigned. So inlining is usually done at this intermediate representation stage, before the assembly realm of specific machine instructions and registers, so that the compiler can gather all that information before it does its magical optimizations. It might even apply some heuristics here to 'try' inlining or unrolling branches of code before actually committing to it.

A lot of linkers can even inline code (link-time optimization), and I'm not sure exactly how that works. I think that when they can do that, the object code is still in an intermediate representation form, somewhat abstracted away from specific machine-level instructions and registers. The linker can then move that code between object files and inline it, deferring code generation and optimization until afterwards.
