ELF Shared Object in x86-64 Assembly language

Question

I'm trying to create a shared library (*.so) in assembly, and I'm not sure I'm doing it correctly.

My code is:

        .section .data
        .globl var1
    var1:
        .quad     0x012345

        .section .text
        .globl func1
    func1:
        xor %rax, %rax
        # mov var1, %rcx       # this line is commented out
        ret

To compile it I run:

    gcc ker.s -g -fPIC -m64 -o ker.o
    gcc ker.o -shared -fPIC -m64 -o libker.so

I can access variable var1 and call func1 with dlopen() and dlsym() from a program in C.
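
A caller along those lines might look like the following sketch. The helper name `call_func1` and its return convention are made up for illustration; only `dlopen()`, `dlsym()`, `dlerror()` and `dlclose()` are real API. It assumes that func1 takes no arguments and returns a 64-bit integer in RAX, and that var1 is a 64-bit integer.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads the library at `path`, calls func1 and reads var1.
 * Returns 0 on success, -1 if the library or either symbol cannot be found. */
int call_func1(const char *path, long *func1_result, long *var1_value)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* POSIX allows converting the void * from dlsym to a function pointer. */
    long (*func1)(void) = (long (*)(void))dlsym(handle, "func1");
    long *var1 = (long *)dlsym(handle, "var1");
    if (!func1 || !var1) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    *func1_result = func1();   /* call the assembly function */
    *var1_value = *var1;       /* read the exported variable */

    dlclose(handle);
    return 0;
}
```

It would be called as `call_func1("./libker.so", &ret, &val)`. On older glibc versions the program has to be linked with `-ldl`.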

The problem is with variable var1. When I try to access it from func1, i.e. uncomment that line, the compiler generates an error:

    /usr/bin/ld: ker.o: relocation R_X86_64_32S against `var1' can not be used when making a shared object; recompile with -fPIC
    ker.o: could not read symbols: Bad value
    collect2: ld returned 1 exit status

I don't understand. I've already compiled with -fPIC, so what's wrong?

Add a global function that returns the address of var1 instead.
– Hans Passant, Commented Feb 18, 2012 at 13:23
@HansPassant: Actually there is already a pointer variable which holds the address of var1; it's called var1@GOTPCREL and it's reachable with RIP-relative position-independent code.
– Gunther Piez, Commented Feb 18, 2012 at 14:59
Related: "Call a function in another object file without using PLT within a shared library?" for calling functions in other object files (but not in other shared libraries). You also want .hidden for those.
– Peter Cordes, Commented Sep 24, 2023 at 20:09

Peter Cordes · Accepted Answer · 2019-01-24 17:37:34Z

16

I've already compiled with -fPIC, so what's wrong?

That part of the error message is for people who are linking compiler-generated code.

You're writing asm by hand, so as datenwolf correctly wrote, when writing a shared library in assembly you have to make sure yourself that the code is position-independent.

This means the file must not contain any 32-bit absolute addresses (because relocation to an arbitrary 64-bit base is impossible). 64-bit absolute relocations are supported, but normally you should only use them for jump tables.

mov var1, %rcx uses a 32-bit absolute addressing mode. You should normally never do this, even in position-dependent x86-64 code. The normal use-cases for 32-bit absolute addresses are putting an address into a 64-bit register with mov $var1, %edi (which zero-extends into RDI) and indexing static arrays: mov arr(,%rdx,4), %edx.

mov var1(%rip), %rcx uses a RIP-relative 32-bit offset. It's the efficient way to address static data, and compilers always use this even without -fPIE or -fPIC for static/global variables.

You have basically two possibilities:

Normal library-private static data, like C compilers will make for __attribute__((visibility("hidden"))) long var1;, same as for -fno-PIC.

        .data
        .globl var1       # linkable from other .o files in the same shared object / library
        .hidden var1      # not visible for *dynamic* linking outside the library
    var1:
        .quad     0x012345

        .text
        .globl func1
    func1:
        xor  %eax, %eax             # return 0
        mov  var1(%rip), %rcx
        ret

Full symbol-interposition-aware code, like compilers generate for -fPIC.

You have to use the Global Offset Table. This is how a compiler does it if you tell it to produce code for a shared library. Note that this comes with a performance hit because of the additional indirection.

See "Sorry state of dynamic libraries on Linux" for more about symbol interposition and the overheads it imposes on code-gen for shared libraries if you're not careful about restricting symbol visibility to allow inlining.

var1@GOTPCREL is the address of a pointer to your var1. The pointer itself is reachable with RIP-relative addressing, while its content (the address of var1) is filled in by the dynamic linker when the library is loaded. This supports the case where the program using your library defines var1, so that var1 in your library resolves to that memory location instead of the one in the .data or .bss (or .text) of your .so.
```
    .section .data
    .globl var1
    # without .hidden
var1:
    .quad     0x012345

    .section .text
    .globl func1
func1:
    xor %eax, %eax
    mov var1@GOTPCREL(%rip), %rcx
    mov (%rcx), %rcx
    ret
```
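
As a rough mental model only, the two-instruction sequence above can be pictured in C like this. The names `got_entry_for_var1`, `read_var1_via_got` and `interpose_var1` are hypothetical stand-ins: a real GOT entry is created by the linker and filled in by the dynamic linker, not declared in source.

```c
/* The exported variable, as in the .data section of the library. */
long var1 = 0x012345;

/* Hypothetical stand-in for the GOT slot that var1@GOTPCREL(%rip) refers to.
 * In a real shared object the dynamic linker writes this slot at load time. */
static long *got_entry_for_var1 = &var1;

long read_var1_via_got(void)
{
    long *addr = got_entry_for_var1;  /* like: mov var1@GOTPCREL(%rip), %rcx */
    return *addr;                     /* like: mov (%rcx), %rcx */
}

/* Models symbol interposition: a definition found earlier in the search
 * order (for example in the main program) replaces the library's own. */
void interpose_var1(long *other_definition)
{
    got_entry_for_var1 = other_definition;
}
```

After `interpose_var1(&some_other_long)`, `read_var1_via_got()` returns the other variable's value without any change to the function's code, which is what the extra indirection buys.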

See some additional information at http://www.bottomupcs.com/global_offset_tables.html

An example on the Godbolt compiler explorer of -fPIC vs. -fPIE shows the difference that symbol-interposition makes for getting the address of non-hidden global variables:

movl $x, %eax: 5 bytes, -fno-pie
leaq x(%rip), %rax: 7 bytes, -fPIE, and hidden globals or static with -fPIC
movq y@GOTPCREL(%rip), %rax: 7 bytes and a load instead of just ALU, -fPIC with non-hidden globals.

Actually loading always uses x(%rip), except for non-hidden / non-static vars with -fPIC where it has to get the runtime address from the GOT first, because it's not a link-time constant offset relative to the code.
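
A small C file along these lines could be used to reproduce the comparison. The function names are made up for illustration, and the instructions in the comments are what GCC typically emits for x86-64 Linux with -O2 -fPIC; exact output depends on compiler version and flags.

```c
/* Hidden: cannot be interposed, so its offset from the code is a
 * link-time constant even in a shared library. */
__attribute__((visibility("hidden"))) long x = 1;

/* Default visibility: with -fPIC another module may define y first. */
long y = 2;

long *addr_x(void) { return &x; }  /* typically: leaq x(%rip), %rax */
long *addr_y(void) { return &y; }  /* typically: movq y@GOTPCREL(%rip), %rax */

long load_x(void) { return x; }    /* typically: movq x(%rip), %rax */
long load_y(void) { return y; }    /* typically: GOT load, then movq (%rax), %rax */
```

Compiling the same file with -fPIE or -fno-pie instead of -fPIC should make the y functions look like the x ones, since an executable's own definitions cannot be interposed.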

Related: 32-bit absolute addresses no longer allowed in x86-64 Linux? (PIE executables).

A previous version of this answer stated that the DATA and BSS segments could move relative to TEXT when loading a dynamic library. This is incorrect; only the library base address is relocatable. RIP-relative access to other segments within the same library is guaranteed to be ok, and compilers emit code that does this. The ELF headers specify how the segments (which contain the sections) need to be loaded/mapped into memory.

edited Jan 24, 2019 at 17:37

Peter Cordes

answered Feb 18, 2012 at 14:18

Gunther Piez

7 Comments

zorgit Over a year ago

Well, the first gives the same error, except that the relocation type is R_X86_64_PC32 instead of R_X86_64_32S. The second gives Error: junk 'CREL' after expression and Error: non-pc-relative relocation for pc-relative field. Removing CREL removes the first error but not the second.

Gunther Piez Over a year ago

Strange. I am able to assemble both examples I gave, using binutils-2.22. Do you have an old version of binutils maybe? The names of special symbols may have changed.

zorgit Over a year ago

I have Debian Squeeze with binutils 2.20.1-16 and gcc version 4.4.5 (Debian 4.4.5-8).

Peter Cordes Over a year ago

Sorry for the intrusive edit, but I thought that was more useful to future readers than posting this as my own answer; it would take a long time for down/upvotes to bring the right answer to the top. I'm 100% certain that code can access .data / .bss variables without going through the GOT, because gcc does that for -fPIC for static int foo;. Thus we can be sure that the dynamic linker won't move segments relative to each other.

Gunther Piez Over a year ago

@PeterCordes No Problem, you vastly improved my answer

datenwolf · Accepted Answer · 2012-02-18 12:43:54Z

4

I don't understand. I've already compiled with -fPIC, so what's wrong?

-fPIC is a flag concerning the creation of machine code from non-machine code, i.e. which operations to use in the compilation stage. Assembly is not compiled, though! Each assembly mnemonic maps directly to a machine instruction; your code is not compiled, it's just transcribed into a slightly different format.

Since you're writing it in assembly, your assembly code must be position-independent to be linkable into a shared library. -fPIC has no effect in your case, because it only affects code generation.

answered Feb 18, 2012 at 12:43

datenwolf

1 Comment

zorgit Over a year ago

Yes, this is because of my bad English and the differences between it and my native language. In the latter, only one word is commonly used for 'to make a program', and I mechanically used it here...

zorgit · Accepted Answer · 2012-02-19 07:26:58Z

-1

OK, I think I found something...

The first solution from drhirsch gives almost the same error, but the relocation type is changed. And the type always ends with 32. Why is that? Why does a 64-bit program use a 32-bit relocation?

I found this from googling: http://www.technovelty.org/code/c/relocation-truncated.html

It says:

For code optimisation purposes, the default immediate size to the mov instructions is a 32-bit value

So that's the case. I have a 64-bit program but the relocation is 32-bit, and all I need is to force it to be 64-bit with the movabs instruction.

This code assembles and works (access to var1 from the internal function func1 and from an external C program via dlsym()):

        .section .data
        .globl var1
    var1:
        .quad     0x012345

        .section .text
        .globl func1
    func1:
        movabs var1, %rax       # if one operand is a symbol, the other must be %rax
        inc %rax
        movabs %rax, var1
        ret

But I'm in doubt about the Global Offset Table. Must I use it, or is this "direct" access absolutely correct?

answered Feb 19, 2012 at 7:26

zorgit

3 Comments

Gunther Piez Over a year ago

movabs will work, but this is the so-called large code model. It is limited because you can use only rax and no other addressing modes.

Ivan Black Over a year ago

@hirschhornsalz, movabs works fine for %rsi and %rdx (pastebin)

Peter Cordes Over a year ago

@IvanBlack: you're using movabs $imm64, %r64, which is available for any register. This code is using the load-into-al/ax/eax/rax from a 64-bit absolute address (moffs) form (felixcloutier.com/x86/mov), which only has one opcode, for RAX, not one for each possible register. (movabs $imm64, %r64 has 8 opcodes; it is the REX.W form of the 5-byte mov $imm32, %r32 with no ModRM byte.)
