
If a C file is compiled with a C compiler and has behavior that is defined in C but not in C++, can you link it with a C++ file without invoking undefined behavior?

in blah.c (the file compiled as C)

struct x {
    int blah;
    char buf[];
};

extern char * get_buf(struct x * base);
extern struct x * make_struct(int blah, int size);

blah_if.h

extern "C"
{
    struct x;

    char * get_buf(struct x * base);
    struct x * make_struct(int blah, int size);
}

some_random.cpp (Compiled with a C++ compiler)

#include <cstring>   // for std::strcpy
#include "blah_if.h"

...

x * data=make_struct(7, 12);
std::strcpy(get_buf(data), "hello");

Is the use of C's flexible array member, which is defined behavior in a file compiled with a C compiler, still defined behavior when that code is called from a file compiled as C++ and linked with the object produced by the C compiler?
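For concreteness, here is a minimal sketch of what the C side might contain. The question only declares make_struct and get_buf, so the function bodies below are assumptions, and free_struct is a hypothetical helper added so that the allocation can be released; none of this is the asker's actual code.

```c
/* blah.c: compiled as C (C99 or later).
   The bodies below are assumed for illustration; the question only
   shows the declarations of make_struct and get_buf. */
#include <stddef.h>
#include <stdlib.h>

struct x {
    int blah;
    char buf[];   /* flexible array member: valid C99, not standard C++ */
};

struct x *make_struct(int blah, int size)
{
    struct x *p;

    if (size < 0)
        return NULL;

    /* One allocation holds the fixed part plus `size` bytes for buf. */
    p = malloc(sizeof *p + (size_t)size);
    if (p != NULL)
        p->blah = blah;
    return p;
}

char *get_buf(struct x *base)
{
    /* The array-to-pointer conversion happens here, in C code only. */
    return base->buf;
}

/* Hypothetical helper, not part of the question's interface. */
void free_struct(struct x *base)
{
    free(base);
}
```

With this layout the C++ caller never sees the definition of struct x. It only passes around a pointer that the C code produced and asks the C code for a char * into the buffer.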

Note that because a C compiler is used and struct x is opaque, this is different than:

Does extern C with C++ avoid undefined behavior that is legal in C but not C++?

  • As long as well defined source code went into the relevant compiler you're fine linking the object code. Commented Aug 7, 2015 at 16:45
  • @GlennTeitelbaum You should have improved your original question in that direction, instead of posting a new one. Commented Aug 7, 2015 at 16:46
  • @πάνταῥεῖ Can you and Drew Dormann please discuss? He said to leave that question as extern "C"; you are saying to edit that question to be about linkage. There are two questions, extern "C" and linkage, and Drew felt I asked the former and should leave it. Commented Aug 7, 2015 at 16:48
  • I for one think it should be two separate questions as he is asking about two techniques. Commented Aug 7, 2015 at 16:49
  • @πάνταῥεῖ I did suggest the action that Glenn took. If you're interested, see my comments on the linked question. Commented Aug 7, 2015 at 16:51

2 Answers


The behavior is implementation-defined.

[dcl.link] Linkage from C++ to objects defined in other languages and to objects defined in C++ from other languages is implementation-defined and language-dependent.

It continues:

Only where the object layout strategies of two language implementations are similar enough can such linkage be achieved.

That sentence in the standard really should be an annotation, since it doesn't specify what counts as "similar enough".


5 Comments

I think this is the correct answer, and at least the safest, but I can't imagine how the subroutine as compiled by the C compiler could break just because the entry was from C++ (with extern "C" to use C calling conventions etc.)
Odd that the language supports extern "C" for linkage, but doesn't count C as a defined linkage.
I'm not sure this quote from the spec is relevant. I see linkage from C++ to functions written in C. The C++ spec requires this to work. I don't see any objects being linked.
@arx Do you have the part of the spec that requires linkage from C++ to functions written in C to work? If so, that would be a good answer.
Even if functions are exempt from the quoted section of [dcl.link], I don't see any requirement that struct x* in C and struct x* in C++ be identically-represented.

As Raymond has already said, this is implementation-defined at the formal, language level.

But it's important to remember what your compiled code is. It's not C++ code any more, nor is it C code. The rules about the behaviour of code in those languages apply to code written in those languages. They are taken into consideration during the parsing and translation process. But, once your code has been translated into assembly, or machine code, or whatever else you translated it to, those rules no longer apply.

So it's effectively meaningless to ask whether compiled C code has UB. If you had a well-defined C program, and compiled it, that's that. You're out of the realm of being able to discuss whether the compiled program is well-defined or not. It's a meaningless distinction, unless you have somehow managed to generate a program that is dictated to have UB by the specification for your assembly or machine language dialect.

The upshot of all this is that the premise of your question is unsound. You can't "avoid undefined behaviour" when linking to the compiled result of a C program, because the very notion of "undefined behaviour" does not exist there. But, as long as the original source code was well-defined when you translated it, you will be fine.
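One way to keep each source file well-defined in its own language, sketched here as an assumption rather than as the asker's actual header, is to guard the extern "C" wrapper with the standard __cplusplus macro. The C compiler then never sees C++-only syntax, and the C++ compiler never sees the flexible array member, because struct x stays an incomplete type outside blah.c. The include-guard name BLAH_IF_H is made up for this example.

```c
/* blah_if.h: hypothetical variant of the question's header that both a C
   and a C++ translation unit can include. */
#ifndef BLAH_IF_H
#define BLAH_IF_H

#ifdef __cplusplus
extern "C" {            /* seen only by a C++ compiler */
#endif

struct x;               /* opaque: the layout stays private to blah.c */

char *get_buf(struct x *base);
struct x *make_struct(int blah, int size);

#ifdef __cplusplus
}                       /* closes extern "C" */
#endif

#endif /* BLAH_IF_H */
```

Whether the two object files then interoperate is still a matter for the implementations involved, as the first answer notes. The header only ensures that nothing ill-formed or undefined is fed to either compiler.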

4 Comments

Eloquently put. Did you mean to imply that there can be no problems once you have reached the target language? Linking between different calling conventions probably qualifies as a variety of undefined behaviour.
This describes one model of compilation, but not the only one. Another model is link-time code generation - under this model, source code gets compiled to an intermediate language, and it is at link time that actual assembly code gets generated. That intermediate language could have UB.
@RaymondChen: Where is this model defined, and is it within the scope of either C or C++?
@LightnessRacesinOrbit The standard does not address when code generation occurs. It is an implementation detail. Both gcc and Visual Studio support it. It sometimes goes by the name "whole program optimization".
