Recursion without recursive call?

Question

Found this on /prog/. I actually GDB'd it, and yes, it was truly a recursion. But how did it happen?

// This works on 32-bit x86 Linux with gcc as long as you don't enable optimization.

#include <stdio.h>
#include <stdlib.h>

static void factorial(int in, int *out)
{
  *(&in-1)-=5-5*(1/in);
  *out*=in--;
}

int main(int argc, char **argv) 
{
  int result=1;
  int number=0;

  if (argc!=2) 
    exit(1);

  number=atoi(argv[1]);
  if (number<1)
    exit(2);

  factorial(number, &result);
  printf("%d! = %d\n", number, result);
  return 0;
}


$ ./factorial 3
3! = 6

$ ./factorial 5
5! = 120

Martin B · Accepted Answer · 2011-06-16 10:52:14Z

22

Sweet. ;)

This is extremely non-portable code that works only on x86. What it's doing is changing the return address on the stack so that if in>1, the function returns not to the instruction following the call instruction, but to the call instruction itself. A call instruction on x86 is five bytes (one opcode plus the 4-byte address of the call destination), so five needs to be subtracted from the return address.

This

*(&in-1)-=5-5*(1/in);

is just an obfuscated way of saying

if(in>1)
    *(&in-1)-=5;

And &in-1 is where the return address lives on the stack.

answered Jun 16, 2011 at 10:52

Martin B

24.3k7 gold badges55 silver badges73 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

Bakudan Over a year ago

@Martin B why it is only available on x86?

Martin B Over a year ago

@Bakudan: You could probably do similar things on other architectures, but this particular "party trick" is specific to a) the x86 calling convention (i.e. where does the return address live) and b) the fact that an x86 call instruction is 5 bytes.

Abhimanyu Srivastava Over a year ago

@Martin,...ca u please give me an explaination.i m not able to get the clear scenario...

R.. GitHub STOP HELPING ICE Over a year ago

This code does not even work reliably on x86. A good optimizing compiler will choose its own calling convention for static functions which are never called through function pointers.

R.. GitHub STOP HELPING ICE Over a year ago

Also in-- should be optimized just to in because the value of in is never used after that.

|

Roddy · Accepted Answer · 2011-06-16 10:50:08Z

8

It's corrupting return addresses on the stack to cause the recursion to happen.

*(&in-1)-=5-5*(1/in);

&in-1 is probably the pushed return address. the rest is just unpleasant magic.

answered Jun 16, 2011 at 10:50

Roddy

68.5k46 gold badges173 silver badges282 bronze badges

Comments

aalku · Accepted Answer · 2011-06-17 08:37:55Z

0

As said before *(&in-1) is the return address~~, 5 (I guess) is the size of the function (in words?) and it substract both to get the address where the function begins.~~

~~EDIT: Since I think the change to the return address is permanent I don't understand why 5 is subtracted while in > 1;~~

EDIT2: See the comments

The assembler return instruction only jumps to the address that is on the top of the stack.

Due to the calling convention the parameters are pushed to the stack and removed from it by the caller. Since there is only one real call there is no recursion, the parameters and the return address (pointer) are shared among the iterations.

It is more like a loop than a recursive function.

edited Jun 17, 2011 at 8:37

answered Jun 16, 2011 at 13:31

aalku

2,8783 gold badges24 silver badges44 bronze badges

3 Comments

Martin B Over a year ago

True, this is more like a loop than a real recursion, because no nested stack frames ever get built.

Martin B Over a year ago

To answer your question about whether the change to the return address is permament... no, it's not, since the return address gets popped back off the stack when factorial returns. The next call instruction from main() then pushes a new, pristine return address onto the stack... so the return address needs to be modified again. All of this goes on until in==1, at which point we actually want to return from factorial() the normal way -- so we don't modify the return address.

aalku Over a year ago

So the call is executed again. I did't think that. That explains why 5, it just changes the return point one instruction back. Thank you.

Collectives™ on Stack Overflow

Recursion without recursive call?

3 Answers 3

8 Comments

Comments

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

8 Comments

Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related