0

I am having a bit of trouble writing a recursive function in assembly. I would like to model the C function here:

int power(int base, int exponent)
{
    if (exponent==0)
        return 1;
    else {
        /* return base * power(base, --exponent); -- in normal form, writing in long-form below*/
        exponent--;
        int tmp1 = power(base, exponent);
        int tmp2 = base * tmp1;
        return tmp2;
    }
}

And what I have so far in assembly is as follows:

power:
    // int power(int base=rdi, int exponent=rsi)
    push %rbp
    mov %rsp, %rbp

    // if (exponent==0) return 1
    cmp $0, %rsi
    jnz exponentiate

 return_one:
    mov $1, %eax
    jmp cleanup

 exponentiate:
    dec %rsi
    push %rax
    call power
    pop %rbx
    imul %rbx, %rax

 cleanup:
    mov %rbp, %rsp
    pop %rbp
    ret

This produces the result zero. I think my error is in the exponentiate function, but am having a hard time debugging it. I know I can 'find the answer' using Compiler Explorer, but I'm wondering if someone could help point out the mistake I have in my code.


I've been able to simplify this to the following where I don't even have to set up a stack frame for the recursive case:

 exponentiate:
    dec %rsi
    call power
    imul %rdi, %rax

Why is this allowed? I thought for all recursive functions you need to set up its own stack frame, but why isn't it necessary in the above case?

9
  • 2
    The first time you do push %rax, what value do you expect %rax to hold? Commented Mar 25, 2021 at 3:56
  • @JosephSible-ReinstateMonica I think 1 if that's the return from the return_one ? Commented Mar 25, 2021 at 3:57
  • What order do you think your code runs in? Commented Mar 25, 2021 at 3:58
  • 1
    Of course it is, you ignored everything I said about using -O1, which makes nice clean asm in this case that only does one push, no other stack space. godbolt.org/z/EKbcTf15a. Turns out you don't need any noinline tricks at -O1, that alone works to make recursive asm without converting it to iterative. How to remove "noise" from GCC/clang assembly output? Commented Mar 25, 2021 at 4:30
  • 1
    re: stack alignment or not: yes GCC sometimes reserves 16 bytes more than it needs for alignment. But that's not really important, the fact that it reserves some and uses it is all that actually matters. When you write your own function, you can do your own layout of your stack frame and choices of what to have in registers when. Use -fverbose-asm to have it comment its asm with var names. Commented Mar 25, 2021 at 4:34

2 Answers 2

4

Here is the cleanest conversion I could come up with for keeping the recursive case:

// int power(int base=rdi, int exponent=rsi)
power:
    cmp $0, %esi                  # if (exponent==0)
    jnz exponentiate                # // 
    mov $1, %eax                    # tmp = 1
    ret                             # return tmp
  exponentiate:                  # else
    dec %esi                       # exponent--
    call power                     # int tmp = power(base, exponent)
    imul %edi, %eax                # tmp = tmp * base
    ret                            # return tmp

It ends up looking relatively close to the C code you have.

Sign up to request clarification or add additional context in comments.

7 Comments

Ha, very close to mine. You save one instruction on the common path by keeping mov $1, %eax conditional, but I save one byte by only having ret in one place :-)
@NateEldredge lol the only difference between yours and mine is mine took five hours to come up with ;)
Oh, I see you had already mentioned the version without saving %edi in your question... I didn't mean to plagiarize it, I actually just didn't read that part of your question carefully. But I've edited now to point out that it was really your idea first.
@NateEldredge oh no worries, actually I had one version and then I used your suggestions to go from rxx to exx on things like cmp $0 %esi. Thanks again for your help!
@samuelbrody1249: What are good examples that actually motivate the study of recursion? (and other answers on that question) have some good examples where it wouldn't be easier just to write a loop. In this question, you're already half-way to making a loop by taking advantage of power not modifying EDI. It's arguably still pure recursion, just with a calling convention where RDI is call-preserved, and obviously it would be easier to write iteratively.
|
3

You want to multiply the return value of the recursive call by the value of base. That's in %edi. For some reason, in your original version, you push %rax even though it only contains garbage on (the first) entry to the function. (The mov $1, %eax is followed by a jump to cleanup so it does not run before exponentiate.) And you pop into %rbx, which is a call-preserved register; if you want to modify it you must save its value and restore it later.

The more natural thing to do would be to simply push and pop %rdi around the call, since it's what contains the value you'll need afterwards. However, as you eventually noticed, you don't actually have any need to modify base, so you might as well just keep it intact, and think of %rdi as being "read only" throughout your function. Then you don't need to use the stack at all, except implicitly for the return address. (Here you can beat the compiler which I guess has to assume that %rdi is clobbered by any call, including a call to the very function it's compiling, even though it knows its own register usage...)

Thus you can indeed write the exponentiate section as:

exponentiate:
    dec %esi
    call power
    imul %edi, %eax

(You're right to use 32-bit registers because you have everything declared as int in your C example. If you want it to work on int64_t instead then feel free to change back to 64-bit registers.)

Other notes:

  • The stack frame setup is never really needed, especially not when you don't need the stack for anything at all. In particular there is no such rule that it's required for recursive functions (if you think you'll see it would not do anything). You can skip it. The worst that will happen is that a debugger will have trouble giving you tracebacks.

  • If using 32-bit values, you want to compare %esi against zero, not %rsi; it also saves a byte of REX prefix. And you can save another byte by writing test %esi, %esi in place of cmp $0, %esi since all you need to look at is the zero flag.

  • You can simplify the jumps a little bit by doing

power:
    mov $1, %eax
    test %esi, %esi
    jz cleanup
    // former label exponentiate was here, no longer needed
    dec %esi
    call power
    imul %edi, %eax
cleanup:
    ret

If you run the exponentiate section it will just overwrite the 1 that was in eax, so no harm done. Running a very cheap mov instruction unconditionally is probably better than taking an extra jump.

2 Comments

since all you need to look at is the zero flag. - test reg,reg sets all flags identically to cmp $0, reg, except for possibly AF, the half-carry flag which you can only read via pushf. Test whether a register is zero with CMP reg,0 vs OR reg,reg?. So it doesn't matter what condition you want to check, you can always use test to compare a register against zero.
Taking advantage of the fact that power leaves RDI unmodified is half way into turning the recursion into iteration. In some ways that's good because recursion is 100% pointless for this function, making everything more complicated for zero benefit. (Unlike a tree traversal, or Ackermann, where it's actually natural and easier than the alternative, which would involve a manual stack data structure - What are good examples that actually motivate the study of recursion?)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.