How do I get a function name in the symbol table to point to a different function?

Question

On macOS Ventura, obtaining a handle to the dynamic loader using dlopen(NULL, 0) returns a handle containing the entire executable's symbol table. Using this handle, one can obtain pointers to symbol data and access and modify their contents, and these changes are visible throughout the program. However, attempting this with function pointers does not work the same way.

For example, the following code:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

int my_var = 0;

int main() {
    void *handle = dlopen(NULL, 0);
    int *a = dlsym(handle, "my_var");
    *a = 5;
    printf("%d", my_var);
    return 0;
}

will print 5 instead of 0. However, when attempting something similar with function pointers:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*ftype)(void);

void fun1() {
    printf("a");
}

void fun2() {
    printf("b");
}

int main() {
    void *handle = dlopen(NULL, 0);
    ftype f1ptr;
    *(void **)(&f1ptr) = dlsym(handle, "fun1");
    f1ptr = fun2;
    fun1();
    return 0;
}

will print a instead of b. Is there any way to perform an operation similar to the first code segment with function pointers? Essentially, I would like fun1 to now point to the code that fun2 points to. Changing f1ptr's type to "ftype *" and then performing "*f1ptr = fun2" causes a bus fault.

Maybe call the function via the pointer, f1ptr()? (You are calling fun1 directly.) You can't change the executable code once it's been loaded (because virus writers would love it if you could).
– robthebloke, Commented Jan 4, 2023 at 6:11
I see why you cannot change the executable code, but I would really just like the symbol "fun1" to refer to the same symbol table contents as the symbol "fun2".
– jungon, Commented Jan 4, 2023 at 6:20
The executable code is in the same block of memory as the symbol table. It is read-only. Exported variables are handled differently. Typically behind the scenes, most OSes will allocate some memory, and in the case above, splat the whole lot with a memset(0). The symbol pointer you request for that var will point into that allocation. That is why you can modify the variable. In theory you could mmap the base address with PROT_WRITE, and manually step to the function table. That might work on a microcontroller, but will probably throw SIGSEGV on macOS if you attempt to write there.
– robthebloke, Commented Jan 4, 2023 at 6:49
That makes sense. Is there any way to dereference f1ptr and set that value to fun2?
– jungon, Commented Jan 4, 2023 at 6:53
Your examples don't do the same thing. In your first program you change the value of my_var, not the address of it. Since the code of fun1 is read-only, you cannot change it. So, what do you really want to achieve? Please note that the call fun1() does not use the entry in the symbol table to find the entry point.
– the busybee, Commented Jan 4, 2023 at 12:43

Eric Postpischil · Accepted Answer · 2023-01-04 12:44:29Z

"However, attempting this with function pointers does not work the same way."

An identifier names an object or a function. The identifier for a function is not a function pointer, and your example does not show the code working differently for objects or functions.

In int *a = dlsym(handle, "my_var");, your code obtains a pointer to my_var. Then it uses *a = 5; to change the value of my_var.

In *(void **)(&f1ptr) = dlsym(handle, "fun1");, your code obtains a pointer to fun1 (although it mishandles it, discussed below). So f1ptr is a pointer to fun1. It is a pointer to the function, not a pointer to a function pointer. If functions were somehow modifiable, then *f1ptr = …; would modify the function.

However, you do not use *f1ptr = …;. You use f1ptr = fun2;. This simply changes f1ptr. It does not change what f1ptr points to. It does not change fun1.

fun1 is an identifier for the function. It is not a pointer. So there is no pointer to change. Nothing can be done in the program to make fun1 be a different function or point to a different function.

Regarding why *(void **)(&f1ptr) = dlsym(handle, "fun1"); is wrong, this says to take the address of f1ptr, convert the address to a void **, and use that memory location as if it were a void *. Then that memory location is assigned the value of the void * returned by dlsym. In other words, you are accessing the pointer-to-function object f1ptr using the type pointer to void. That violates C 2018 6.5 7, which says that an object shall be accessed only with its defined type or certain other types, none of which allows accessing a pointer to a function as a void *.

The C standard allows a pointer to a function and a pointer to void to have different representations in memory and even different sizes, in which case this assignment would fail “mechanically”; the bytes written into f1ptr would not be suitable for use as a pointer-to-function. But even in C implementations where all pointers have the same representation, this assignment can fall afoul of optimization by the compiler. It should never be used.

The proper way to assign the result of dlsym to f1ptr is to convert the result to the necessary type:

f1ptr = (ftype) dlsym(handle, "fun1");

(This still will not let you change fun1; it just shows you how to use dlsym correctly.)
