1

Suppose I'm working on a library; say it's in C for simplicity. This library exposes and defines two functions:

  • f(), which calls an auxiliary function aux(), declared but not defined by my library.
  • g() is implemented with no calls to aux(), nor to f(), nor the use of any global data which refers to aux() etc. - that is, it is independent of aux().

The above is known to library users as well.

Now, a user writes a program of theirs which calls g(), and tries to link against my library. Annoyingly, they get:

lib/mylibso.1.0: undefined reference to `aux'

"Damn it!" cries the user, "why do I need to provide a reference to aux()? I don't want to provide you with that, I only use the aux-independent part of the library!"

What can the user do, and what can/should I do, as a library author, to facilitate their being able to link their program? (That is, other than splitting the library in two or making the compilation of f() conditional at compile-time.)

Notes:

  • If it matters, assume a recent GNU/Linux environment (but portable solutions/suggestions are preferable).
  • f() does not receive aux() as an argument, nor perform any complex voodoo to derive aux().
  • I am hoping for a solution involving the toolchain only (hence the title of this question...)
  • When you say an auxiliary function, is the mechanism something like qsort() which requires a callback function? Or was f() reliant on an aux() function which doesn't appear in any compilation unit? Commented Nov 12 at 19:09
  • It looks like you require users of f to have to know what library to link with to get the definition of aux or else I suppose that user would get the same linking error? If you don't want to split the f and g library in two, add a dependency to the library containing aux so that linking with the f and g library automatically links with the aux library. target_link_libraries(f_and_g_lib PUBLIC aux_lib) in cmake lingo. Commented Nov 12 at 19:23
  • @WeatherVane: If you have an answer for one of those two possibilities - it would be a good answer. Commented Nov 12 at 19:51
  • They are fundamentally different (prev comment was edited) because if you aren't calling f() with an argument providing the user's auxiliary function, then no callback function is needed. Is this an X-Y question? Commented Nov 12 at 19:54
  • @WeatherVane: Ah, I see what you mean. I'll clarify that point. But it could still be kind-of-like-qsort, except that the function is made available via symbol rather than via argument. Commented Nov 12 at 19:55

6 Answers

5

what can the user do, and what can/should I do, as a library author, to facilitate their being able to link their program? (That is, other than splitting the library in two or making the compilation of f() conditional at compile-time.)

If you don't want your shared library to depend on aux() then splitting out any aux()-dependent functions is exactly what you should do, protestations notwithstanding. Dependencies for such libraries should be viewed as properties of the library, not of individual functions within.

You do, however, have the alternative of building static libraries instead of shared ones. These are effectively just containers for object files, so if your f() and g() are defined in different source files, each compiled to its own object file, which is added to libmylib.a, then statically linking libmylib to get only a definition of g() will not incorporate f() or its dependency on aux() into your program.

On Linux, you also have the option of providing a weak, stub definition of aux() in your library, so that the symbol is always defined, but it will be superseded by the real aux() if that is linked in. You can make the stub fail loudly at runtime if you like, to help library users recognize when they need to link the additional library that provides the working definition of aux().

Or you can tell library users that, at least as long as they are using the GNU toolchain, they can link with -Wl,--allow-shlib-undefined to make the linker ignore the unresolved symbol. They will want lazy function resolution too (-z lazy), but this is the default. Of course, this will also cause the linker to ignore other unresolved symbols that perhaps it should not. And this is something library users have to do; you as library provider cannot do it for them.


2 Comments

Is --allow-shlib-undefined limited to shared objects?
It applies only to building executables that link shared libraries, and I think it pertains only to the unresolved external symbols of the shared libraries that are linked, not to those of the executable itself. Executables that link shared libraries are necessarily shared objects themselves, so in that sense, yes, --allow-shlib-undefined is limited to shared objects. It is not needed for building shared libraries that are not executables.
4

You can compile the library that contains f and g with -ffunction-sections -fdata-sections to put each function in its own ELF section:

gcc -ffunction-sections -fdata-sections -c fg.c
ar rcs libfg.a fg.o

When linking, use -Wl,--gc-sections to garbage collect unused sections:

gcc -o prog main.c -Wl,--gc-sections -L. -lfg

4 Comments

This doesn't work for me for a shared library such as seems to be the OP's focus. It does work for me for a static library, such as is shown here, but so does simply putting f() and g() into separate source files.
@JohnBollinger Yes, this only works if OP is willing to switch to a static library. You can put f and g in separate source files instead of compiling with -ffunction-sections -fdata-sections to get them in separate ELF sections for sure. The main thing is to link with --gc-sections, which is needed in both cases.
No, --gc-sections is not necessary in the static lib case, as long as the definitions of f and g are in different files within the lib. Unless instructed otherwise, the linker chooses only those objects from the lib that are necessary to resolve dependencies.
@JohnBollinger You are correct. I must have been unlucky when thinking :-)
3

tl;dr: Use a 'weak symbol' implementation

Some executable/object formats include support for weak symbols: symbol definitions with low priority, so that they don't clash with a definition provided elsewhere. See also this StackOverflow question:

What are weak functions and what are their uses? I am using an STM32F429 microcontroller

You wrote that a relevant platform for you is GNU/Linux; well, you're in luck, because executables on Linux use the ELF binary format; and ELF supports weak symbols. So, you could provide a dummy definition of the aux symbol within your library, that would be marked "weak". Either you modify your ELF to mark a definition as weak, using some relevant ELF-format tools (not trivial but should be possible), or use a compiler which itself supports setting this attribute on function definitions. And you are again in luck if you use GCC, since it lets you write:

#include <stdio.h>
#include <stdlib.h>

__attribute__((weak)) int aux(int some_argument) // or whatever the signature is
{
  // fputs("Perhaps some kind of error message here\n", stderr);
  exit(EXIT_FAILURE);
}

Now, if your user doesn't provide an aux() implementation and doesn't call f() - the linker will plug the weak aux() into the implementation of f(), but that won't hurt anybody. And if, on the other hand, your user does provide an implementation of aux() - that (non-weak, or strong) definition of the symbol will be preferred over the weak one.
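To make the arrangement concrete, here is a minimal sketch of what the library's translation unit might look like with such a weak fallback. The signatures of f(), g() and aux(), and the -1 sentinel returned by the stub, are assumptions made for illustration only; the stub returns an error instead of exiting so that the behaviour is easy to observe.

```c
#include <stdio.h>

/* Weak fallback: intended to be used only when no strong aux() is linked in.
 * The signature and the -1 sentinel are assumptions for illustration. */
__attribute__((weak)) int aux(int some_argument)
{
    (void)some_argument;
    fputs("mylib: aux() was not provided\n", stderr);
    return -1;
}

/* f() depends on aux(); with only the weak stub present it reports failure. */
int f(int x)
{
    return aux(x);
}

/* g() is independent of aux(). */
int g(int x)
{
    return 2 * x;
}
```

With this in place, a program that only calls g() should link without supplying aux(), and a call to f() without a real aux() fails visibly rather than at link time.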

This approach has the benefit of not burdening the user with special linker switches, not requiring them to adapt to some plugin framework, and not sacrificing the potential optimization or code-size benefits of compiling f() and g() into the same section.

The downside is that weakness-of-symbols is not expressible in standard C; and only a subset of platforms and toolchains support this.


1

The most portable and flexible way is to change the code to use a callback function.

In order to do this, change aux from a declared function with no definition, into a function pointer of the expected format, either initialized to NULL or to a default behavior function. Provide a separate setter function for it, so that the user may pass on a pointer to their defined function. Function pointers have strong type safety and so the user may only pass a function of the correct format.

Pseudo code example:

// header:
typedef void aux_t (void);

void set_aux (aux_t* a);

void f (void);
// .c file:

static void default_aux (void)
{ 
  // ...
}

static aux_t* aux = default_aux;

void set_aux (aux_t* a)
{
  aux = a;
}

void f (void)
{
  // f does not care that aux is a function pointer nor where it points
  aux();
}
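The NULL-initialized variant mentioned above could be sketched as follows, together with the caller's side. This is only an illustration: the int return values, the body of g(), and the callback my_aux() with its counter are hypothetical and not part of the original interface.

```c
#include <stddef.h>

typedef void aux_t(void);

/* Library side: no callback installed by default (the NULL variant). */
static aux_t *aux = NULL;

void set_aux(aux_t *a)
{
    aux = a;
}

/* Returns 0 on success, -1 if no callback has been installed (assumed convention). */
int f(void)
{
    if (aux == NULL)
        return -1;
    aux();
    return 0;
}

/* g() never touches aux, so it works whether or not set_aux() was called. */
int g(void)
{
    return 42;
}

/* User side: a hypothetical callback that counts its invocations. */
static int my_aux_calls = 0;

void my_aux(void)
{
    ++my_aux_calls;
}
```

A user who needs f() calls set_aux(my_aux) once before using it; a user who only needs g() never has to mention aux at all, and nothing is left unresolved at link time.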

Other ways:

Using a "weak linkage" default definition is another way to do it, but it is non-portable.

You could also consider splitting your library up in several files, where one of the files is optional to use.


0

As you provide no source code, it is difficult to point you to a solution. I suggest the following to find the culprit:

  1. Provide a fake version for the aux() function, so the linker can continue linking your code.

  2. Run it in the debugger. Put a breakpoint on the aux symbol.

  3. Run the code, when the debugger stops at aux() show a trace of the function calls to that point using the debugger command where.

This should help you find the path to the aux function without any imprecision.

If the debugger shows that the aux() function is not called at all, there is another way to check why the linker decided to include the aux() function:

  • Generate a map file. To do that, pass

    -Map=your_program.map

    to the linker command, or

    -Wl,-Map=your_program.map

    if you invoke the linker through the compiler.

    The linker then generates a list of the library dependencies it satisfied, indicating the module that triggered each dependency to be included. This should show what pulled in the dependency.

Remember, the linker never generates a dependency error if there's no reference to be resolved. The linker will never know anything about your aux() function if it is actually never mentioned in the main program. So it must be mentioned somewhere.


By the way, if you want to ship two unrelated sets of functions (as your question suggests: f() -> aux(), and an unrelated g() that depends on nothing), never put them in the same compilation unit, which also means not linking them into a single .so shared object; use f.c and g.c. You will then require aux() only when f.c has been selected into your program.

The linker cannot subdivide a compilation unit to include only some of its functions (all the more so in a C, not C++, project). You need to structure your library into compilation units with enough granularity to allow what you want. If you put two functions in one compilation unit, either both are included or neither is. This is most probably where the unexpected dependency appears. The language is flexible enough to let you subdivide the library into a set of separate .o objects. If you are building a shared library, it is linked as a single shared object (not a library archive, from which the linker can select only the compilation units referenced by the main program), and the linker does not know which functions the programs using the library will need, so shared objects are loaded at runtime in full, as they are shared between all programs that use the library.

The linker doesn't know anything about the internal relationships between the functions in a compilation unit (a relationship need not be a call to a symbol; it can be a reference to a global variable shared by both functions), which makes it impossible to decide where to split the compilation unit without knowing how the source was compiled.

A solution would be to put f() in a separate compilation unit, so that you need the compilation unit providing aux() only if you have also included f(), and to put both in a .a library archive. The linker then selects from the .a archive only the objects (.o) that are required to resolve references.

It's not clear what you are trying to solve, because IMHO you misunderstand how a shared library is used. You cannot link it so that it includes only the functions your particular program needs; you link all of it, because the library cannot know how many programs will use it. (The text and read-only data of a shared library are shared between all processes that load it, so there is no penalty for mapping it whole, and some other program using the library will probably use the function you are trying to evict.)

3 Comments

"The linker will never know anything about your aux() function if it is actually never mentioned in the main program." - I believe that is not true. The linker knows about aux() because it's mentioned as an unresolved symbol in the library, which my program is linked with.
And that’s one reason for it to be mentioned. Only shared objects are linked properly if some symbols are missing. This is controlled by the linker -shared flag. But I think this is not desired in this case. Read my answer, which is complete in this sense.
You can have a static variable (invisible elsewhere) that is initialized with a pointer to aux, inside an inner block of g(). That will also be an undefined reference. If you put undefined references in a compilation unit, all of them need to be resolved by the linker. Split the compilation unit appropriately into two separate compilation units and you will see how a dependency on g() stops being a dependency on aux().
0

tl;dr: Don't do it, and tell the users to handle it.

Have you considered "Masterly Inactivity"?

Probably the simplest, most portable way to handle your situation is to just leave things as they are, and simply tell your users they must define an aux() function to use the library. Tell users, in the library documentation, that if they are interested in g() but not in f(), they can and should define a trivial aux() function, and provide the source code for it, to make it extra-simple for them to do it.

Yes, this means a slight inconvenience for the user, but assuming your note regarding aux() is prominent enough, they will notice it and follow it.
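Such a trivial aux() might look like the sketch below, which the documentation could offer for copy-pasting. The signature is borrowed from another answer here and is an assumption; it must match whatever the library header actually declares. The -1 return value is likewise hypothetical.

```c
#include <stdio.h>

/* Placeholder for programs that never need a working aux().
 * Assumed signature: replace it with the one declared in the library header. */
int aux(int some_argument)
{
    (void)some_argument;
    fputs("aux() is a placeholder and should never be called\n", stderr);
    return -1; /* hypothetical error value */
}
```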

