Segmentation fault when calling x86 Assembly function from C program

Question

I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):

#include <stdio.h>
#include <stdlib.h>

int addition(int a, int b);

int main(void) {
    int sum = addition(3, 4);
    printf("%d", sum);
    return EXIT_SUCCESS;
}

Below is the code of the Assembly function (addition.s). My idea is to code the stack frame prologue and epilogue from scratch, and I have added comments to explain the logic of my code:

.text

# Here, we define a function addition
.global addition
addition:
    # Prologue:
    # Push the current EBP (base pointer) to the stack, so that we
    # can reset the EBP to its original state after the function's
    # execution
    push %ebp
    # Move the EBP (base pointer) to the current position of the ESP
    # register
    movl %esp, %ebp

    # Read in the parameters of the addition function
    # addition(a, b)
    #
    # Since we are pushing to the stack, we need to obtain the parameters
    # in reverse order:
    # EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
    #
    # Utilize advanced indexing in order to obtain the parameters, and
    # store them in the CPU's registers
    movzbl 8(%ebp), %ebx
    movzbl 12(%ebp), %ecx

    # Clear the EAX register to store the sum
    xorl %eax, %eax
    # Add the values into the section of memory storing the return value
    addl %ebx, %eax
    addl %ecx, %eax

I am getting a segmentation fault, which seems strange considering that I think I am allocating memory in accordance with the x86 calling conventions (e.g. allocating the correct memory sections to the function's parameters). Furthermore, if any of you have a solution, I would greatly appreciate some advice on how to debug an Assembly function called from C. I have been using the GDB debugger, but it simply points to the line of the C program where the segmentation fault happens instead of the line in the Assembly program.

Stepping through assembly code: stackoverflow.com/questions/2420813/…
– Jabberwocky, Commented Nov 13, 2020 at 10:03
movl %ebp, %esp: in AT&T syntax this moves the value in ebp to the register esp. You want the reverse.
– ecm, Commented Nov 13, 2020 at 11:48
Just to check - you're sure you're correctly compiling and assembling the whole program as 32-bit code? On a 64-bit system that normally means using -m32 when compiling and linking.
– Nate Eldredge, Commented Nov 14, 2020 at 0:21
@NateEldredge Yes, I am compiling the program as 32-bit code.
– Adam Lee, Commented Nov 14, 2020 at 1:25

Nate Eldredge · Accepted Answer · 2020-11-14 01:01:35Z

2

1. Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.
2. You clobber (i.e. overwrite) the %ebx register, which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.
3. movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int, so they are already 32 bits, and plain movl is correct. As it stands, your function would give incorrect results for any arguments which are negative or larger than 255.
4. You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
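To make the movzbl point concrete, here is a minimal C sketch of what the original loads effectively compute. The function names addition_low_bytes and addition_full are made up for this illustration and are not part of the question's code. It relies on x86 being little-endian, so the byte that movzbl reads at the argument's address is the low byte of the int.

```c
#include <stdint.h>

/* Hypothetical model of the movzbl-based version: only the low byte of
   each 32-bit argument is read, then zero-extended to 32 bits. */
int addition_low_bytes(int a, int b) {
    uint32_t low_a = (uint8_t)a;   /* like movzbl 8(%ebp), %ebx  */
    uint32_t low_b = (uint8_t)b;   /* like movzbl 12(%ebp), %ecx */
    return (int)(low_a + low_b);
}

/* What plain movl followed by addl computes: the full 32-bit sum. */
int addition_full(int a, int b) {
    return a + b;
}
```

With these definitions, addition_low_bytes(3, 4) still yields 7, which is why the test values in the question would hide the bug, while addition_low_bytes(300, 4) yields 48 and addition_low_bytes(-1, 0) yields 255.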

I would write:

.text

# Here, we define a function addition
.global addition
addition:
    # Prologue:
    push %ebp
    movl %esp, %ebp

    # load first argument
    movl 8(%ebp), %eax 
    # add second argument
    addl 12(%ebp), %eax

    # epilogue
    movl %ebp, %esp  # redundant since we haven't touched esp, but will be needed in more complex functions 
    pop %ebp
    ret

In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to

.text
.global addition
addition:
    movl 4(%esp), %eax
    addl 8(%esp), %eax
    ret
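The two versions above read the arguments at different offsets (8 and 12 from %ebp, versus 4 and 8 from %esp). The following is a rough C model of why, assuming the usual 32-bit cdecl layout in which the caller pushes b, then a, and the call instruction pushes the return address. The Stack type, the helper names, and the placeholder addresses are all invented for this sketch; it simulates the stack in an array rather than touching the real one.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 8 };          /* one slot = one 4-byte stack word */

typedef struct {
    uint32_t mem[STACK_WORDS];     /* higher index = higher address */
    int esp;                       /* index of the word on top of the stack */
} Stack;

static void push(Stack *s, uint32_t value) {
    assert(s->esp > 0);            /* guard against overflowing the model */
    s->mem[--s->esp] = value;      /* the stack grows toward lower addresses */
}

/* Read the word at byte_offset(base), e.g. 8(%ebp). */
static uint32_t load(const Stack *s, int base, int byte_offset) {
    return s->mem[base + byte_offset / 4];
}

/* State of the stack on entry to addition(a, b). */
static Stack call_addition(uint32_t a, uint32_t b) {
    Stack s = { {0}, STACK_WORDS };
    push(&s, b);                   /* arguments are pushed right to left */
    push(&s, a);
    push(&s, 0x08048456u);         /* made-up return address pushed by call */
    return s;
}

/* Frameless version: 0(%esp) is the return address. */
uint32_t addition_frameless(uint32_t a, uint32_t b) {
    Stack s = call_addition(a, b);
    return load(&s, s.esp, 4) + load(&s, s.esp, 8);
}

/* Framed version: push %ebp moves everything one word further away. */
uint32_t addition_framed(uint32_t a, uint32_t b) {
    Stack s = call_addition(a, b);
    push(&s, 0xBFFFF000u);         /* push %ebp (made-up caller value) */
    int ebp = s.esp;               /* movl %esp, %ebp */
    return load(&s, ebp, 8) + load(&s, ebp, 12);
}
```

In this model 0(%ebp) holds the saved %ebp and 4(%ebp) holds the return address, so the first argument sits at 8(%ebp). Both addition_frameless(3, 4) and addition_framed(3, 4) return 7.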

edited Nov 14, 2020 at 1:01

answered Nov 14, 2020 at 0:38

Nate Eldredge


7 Comments

Adam Lee Over a year ago

Thank you so much for this amazing answer that was so detailed. I was wondering, do you have any tips on debugging C programs that are meshed with Assembly code (in the GDB debugger I can only see the seg fault line in the C program instead of the Assembly program)?

Nate Eldredge Over a year ago

The answers in the link that Jabberwocky posted cover most of what there is. Useful commands: display/i $eip, info registers, si, ni, break, disassemble, print $eax, x/8xw $ebp

Adam Lee Over a year ago

I just had one clarifying question about the second point you made: when you say the %ebx register is supposed to be callee-saved, what does that precisely mean? Furthermore, is the %ebx register the only register in x86 that is callee-saved?

Nate Eldredge Over a year ago

It means that your function (the one that is called, i.e. the "callee") needs to ensure that its value upon return is the same as upon entry, so that whatever value your caller may have been keeping there is still there, as if it had never changed. See stackoverflow.com/questions/9268586/…. The caller/callee saved registers are part of the calling conventions; if your reference didn't explain this, you could see wiki.osdev.org/Calling_Conventions. The registers %ebx, %esi, %edi, %ebp are the callee-saved registers on 32-bit x86.

Adam Lee Over a year ago

Gotcha. Last question: Is the EAX register always supposed to store the return value of a function in Assembly? I thought EBP + 4 stores the return value?


Devolus · Answer · 2020-11-13 10:33:14Z

0

You are corrupting the stack here:

movb %al, 4(%ebp)

To return the value, simply put it in eax. Also, why do you need to clear eax? That is inefficient, as you can load the first value directly into eax and then add to it.

Also, EBX must be saved if you intend to use it, but you don't really need it anyway.
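As a rough illustration of why that store corrupts the stack: with a standard frame, 4(%ebp) holds the return address, and because x86 is little-endian, a one-byte store there replaces its lowest byte. The sketch below models only that effect in C; the function name and the example address are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of movb %al, 4(%ebp): the low byte of the saved
   return address is overwritten with the low byte of the sum. */
uint32_t return_address_after_movb(uint32_t return_address, uint8_t al) {
    return (return_address & 0xFFFFFF00u) | al;
}
```

For a made-up return address of 0x08048456 and a sum of 7, this gives 0x08048407, so ret would jump to an address the caller never intended.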

edited Nov 13, 2020 at 10:33

answered Nov 13, 2020 at 10:26

Devolus


1 Comment

Adam Lee Over a year ago

I made that revision in my code, but am still unfortunately getting a segmentation fault.
