Create an arg array for execve on the stack

Question

I want to write an assembly program that uses execve (syscall 59, i.e. 0x3B, on x86-64 Linux; 0x3C is exit) to run the program /bin/ls with the switches -al.

The man page (man 2 execve) states that the call requires three values:

int execve(const char *filename, char *const argv[], char *const envp[]);

I don't quite understand how to build the three arguments. As far as I know, the first argument goes into RDI, the second into RSI, and the third into RDX. I believe that to set up the first one, it suffices doing

    push 0x736c2f2f         ;sl//
    push 0x6e69622f         ;nib/
    mov rdi, rsp

For the third one, the thing is quite easy:

    xor r11, r11
    mov rdx, r11

My problem is that I don't know how to build the second argument, which should be an array containing ['/bin//ls', '-aal']

I need to write it for x86-64, so please no int 0x80 suggestions.

Do you know C? You need to put your strings into memory somewhere, put their addresses into the array, and finally pass the address of the array to the syscall.
– Jester, Commented Jan 20, 2020 at 14:25
@Jester, although it's somewhat unconventional, it's neither uncommon nor wrong to store strings on the stack.
– h0r53, Commented Jan 20, 2020 at 14:30
I did not say anything about the stack. Stack is memory and will work fine. Anyway, another issue is that push imm32 still pushes 8 bytes, so the string won't be correct; it will be /bin<0><0><0><0>//ls
– Jester, Commented Jan 20, 2020 at 14:32
@h0r53 There is no push instruction with an 8-byte immediate. You have to move into a register and then push.
– fuz, Commented Jan 20, 2020 at 14:44
Wait, so you don't need your code to be a contiguous block that doesn't contain any 0 bytes (i.e. shellcode)? Then it's trivial and you should just put the strings in memory with their terminating 0 bytes, and use RIP-relative LEA to get pointers to them. Does it even have to be position-independent? If not, the arrays of pointers can be static as well, instead of writing instructions to get addresses and store them to the stack; i.e. you can basically just use compiler output. Why would you waste your time pushing strings if you aren't aiming for code injection (shellcode)?
– Peter Cordes, Commented Jan 21, 2020 at 21:42

the Tin Man · Accepted Answer · 2020-01-21 00:39:18Z

5

You can put the argv array onto the stack and load the address of it into rsi. The first member of argv is a pointer to the program name, so we can use the same address that we load into rdi.

xor edx, edx        ; Load NULL to be used both as the third
                    ; parameter to execve as well as
                    ; to push 0 onto the stack later.
push "-aal"         ; Put second argument string onto the stack.
mov rax, rsp        ; Load the address of the second argument.
mov rcx, "/bin//ls" ; Load the file name string
push rdx            ; and place a null character
push rcx            ; and the string onto the stack.
mov rdi, rsp        ; Load the address of "/bin//ls". This is
                    ; used as both the first member of argv
                    ; and as the first parameter to execve.

; Now create argv.
push rdx            ; argv must be terminated by a NULL pointer.
push rax            ; Second arg is a pointer to "-aal".
push rdi            ; First arg is a pointer to "/bin//ls"
mov rsi, rsp        ; Load the address of argv into the second
                    ; parameter to execve.

This also corrects a couple of other problems with the code in the question. It uses an 8-byte push for the file name, since x86-64 doesn't support 4-byte push, and it makes sure that the file name has a null terminator.

This code does use a 64-bit push with a 4-byte immediate to push "-aal" since the string fits in 4 bytes. This also makes it null terminated without needing a null byte in the code.
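The stack layout those pushes produce can be checked with a small C model. This is only a sketch, not part of the answer: `model_stack`, `model_push`, `pack`, `build_argv`, and `arg_at` are hypothetical names, and it assumes 64-bit pointers. It replays the same push order on an array of 8-byte slots and returns what would end up in rsi.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the machine stack: 8-byte slots, growing downward. */
typedef struct {
    uint64_t slot[16];
    int top;                          /* index of the most recently pushed slot */
} model_stack;

/* Models `push r64` and returns the new "rsp". */
uint64_t *model_push(model_stack *s, uint64_t value) {
    s->slot[--s->top] = value;
    return &s->slot[s->top];
}

/* Packs at most 8 characters into a qword in memory order; unused bytes stay 0. */
uint64_t pack(const char *text) {
    uint64_t v = 0;
    memcpy(&v, text, strlen(text));
    return v;
}

/* Replays the pushes from the answer; returns "rsi" and stores "rdi" in *rdi_out. */
uint64_t *build_argv(model_stack *s, char **rdi_out) {
    uint64_t rdx = 0;                                        /* xor edx, edx */
    s->top = 16;
    uint64_t *rax = model_push(s, pack("-aal"));             /* push "-aal" / mov rax, rsp */
    model_push(s, rdx);                                      /* push rdx: string terminator */
    uint64_t *rdi = model_push(s, pack("/bin//ls"));         /* push rcx / mov rdi, rsp */
    model_push(s, rdx);                                      /* argv[2] = NULL */
    model_push(s, (uint64_t)(uintptr_t)rax);                 /* argv[1] */
    uint64_t *rsi = model_push(s, (uint64_t)(uintptr_t)rdi); /* argv[0] / mov rsi, rsp */
    *rdi_out = (char *)rdi;
    return rsi;
}

/* Reads argv[i] out of the modeled array. */
char *arg_at(const uint64_t *rsi, int i) {
    return (char *)(uintptr_t)rsi[i];
}
```

In this model `arg_at(rsi, 0)` is the same address as rdi, which mirrors the answer's point that the file name pointer doubles as the first member of argv.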

I used strings with doubled characters as they are in the question to avoid null bytes in the code, but my preference would be this:

mov ecx, "X-al"     ; Load second argument string,
shr ecx, 8          ; shift out the dummy character,
push rcx            ; and write the string to the stack.
mov rax, rsp        ; Load the address of the second argument.
mov rcx, "X/bin/ls" ; Load file name string,
shr rcx, 8          ; shift out the dummy character,
push rcx            ; and write the string onto the stack.

Note that the file name string gets a null terminator via the shift, avoiding the extra push. This pattern works with strings where a doubled character wouldn't work, and it can be used with shorter strings, too.
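To see why the shift supplies the terminator, here is a rough C model of the two `mov` / `shr` pairs. It is an illustration only; `shifted_dword` and `shifted_qword` are hypothetical helpers that assemble the characters in little-endian order, as x86-64 does, and then shift right by 8 bits.

```c
#include <stdint.h>

/* Models `mov ecx, "X-al"` followed by `shr ecx, 8`. */
uint32_t shifted_dword(const char *four) {
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)       /* first character lands in the low byte */
        v |= (uint32_t)(unsigned char)four[i] << (8 * i);
    return v >> 8;                    /* drops the dummy, zero-fills the top byte */
}

/* Models `mov rcx, "X/bin/ls"` followed by `shr rcx, 8`. */
uint64_t shifted_qword(const char *eight) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)(unsigned char)eight[i] << (8 * i);
    return v >> 8;
}
```

The results, 0x006c612d and 0x00736c2f6e69622f, are the bytes "-al" and "/bin/ls" followed by a zero byte once stored to memory, while the immediates in the instruction stream contain no zero bytes.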

edited Jan 21, 2020 at 0:39

the Tin Man


answered Jan 20, 2020 at 19:44

prl



7 Comments

miken32 Over a year ago

Please provide some context or commentary to this code to make it more useful.

the Tin Man Over a year ago

See "Explaining entirely code-based answers". While this might be technically correct it doesn't explain why it solves the problem or should be the selected answer. We should educate in addition to help solve the problem.

Peter Cordes Over a year ago

Yup, that's much better. I'd have used push "-aal" to make that instruction self-documenting, because this looks like NASM syntax.

the Tin Man Over a year ago

Yes, it's better, though don't add "edited" or "updated" or "addendum". The goal of SO is to create something like an online reference book, not a message board or forum. Readability is much more important than tagging changes as we can see what's changed if we need to. "Should “Edit:” in edits be discouraged?"

prl Over a year ago

@theTinMan, I put "addendum" because it was additional information not directly related to answering the question, not because it was an edit. Do you think that should not be indicated in some way?


Peter Cordes · Accepted Answer · 2023-05-26 09:48:21Z

You can write push '/bin' in NASM to get the bytes into memory in that order. (Padded with 4 bytes of zeros, for a total width of qword; dword pushes are impossible in 64-bit mode.) No need to mess around with manually encoding ASCII characters; unlike some assemblers NASM doesn't suck at multi-character literals and can make your life easier.

You could use mov dword [rsp+4], '//ls' to store the high half. (Or make it a qword store to write another 4 bytes of zeroes past that, with a mov r/m64, sign_extended_imm32.) Or just zero-terminate it with an earlier push before doing mov rsi, '/bin//ls' / push rsi if you want to store exactly 8 bytes.

Or mov eax, '//ls' ; shr eax, 8 to get EAX="/ls\0" in a register ready to store to make an 8-byte 0-terminated string.

Or use the same trick of shifting out a byte after mov r64, imm64 (like in @prl's answer) instead of separate push / mov. Or NOT your literal data so you do mov rax, imm64 / not rax / push rax, producing zeros in your register without zeros in the machine code. For example:

 mov  rsi, ~`/bin/ls\0`   ; mov rsi, 0xff8c93d091969dd0
 not  rsi
 push rsi                 ; RSP points to  "/bin/ls", 0

If you want to leave the trailing byte implicit, instead of an explicit \0, you can write mov rsi, ~'/bin/ls' which assembles to the same mov rsi, 0xff8c93d091969dd0. Backticks in NASM syntax process C-style escape sequences, unlike single or double quotes. I'd recommend using \0 to remind yourself why you're going to the trouble of using this NOT, and the ~ bitwise-negation assemble-time operator. (In NASM, multi-character literals work as integer constants.)
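The constant can be sanity-checked with a few lines of C. This is a sketch under the assumption of little-endian byte order; `has_zero_byte` and `not_and_store` are hypothetical helper names, not anything NASM provides.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if any of the 8 bytes of v is 0x00, else 0. */
int has_zero_byte(uint64_t v) {
    for (int i = 0; i < 8; i++)
        if (((v >> (8 * i)) & 0xFF) == 0)
            return 1;
    return 0;
}

/* Models `mov rsi, imm64` / `not rsi` / `push rsi`:
   writes the 8 bytes the push would store, lowest address first. */
void not_and_store(uint64_t imm, unsigned char out[8]) {
    uint64_t v = ~imm;
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}
```

Feeding 0xff8c93d091969dd0 to `not_and_store` yields the bytes of "/bin/ls" plus a terminating zero, and `has_zero_byte` confirms that the immediate itself is free of zero bytes.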

Quoting the question: "I believe that to set up the first one, it suffices doing"
  push 0x736c2f2f         ;sl//
  push 0x6e69622f         ;nib/
  mov rdi, rsp

No, push 0x736c2f2f is an 8-byte push, of that value sign-extended to 64-bit. So you've pushed '/bin\0\0\0\0//ls\0\0\0\0'.

Probably you copied that from 32-bit code where push 0x736c2f2f is a 4-byte push, but 64-bit code is different.
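The difference is easy to reproduce in a C model of the stack. The sketch below is illustrative only; `sim_stack`, `sim_push_imm32`, and `replay_question_pushes` are hypothetical names. It stores each immediate the way a 64-bit `push imm32` does: sign-extended to 8 bytes, little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-level model of a downward-growing stack. */
typedef struct {
    unsigned char mem[32];
    size_t sp;                        /* offset of the current top of stack */
} sim_stack;

/* Models `push imm32` in 64-bit mode: sign-extend, then store 8 bytes. */
void sim_push_imm32(sim_stack *s, int32_t imm) {
    uint64_t v = (uint64_t)(int64_t)imm;
    s->sp -= 8;
    for (int i = 0; i < 8; i++)       /* little-endian store */
        s->mem[s->sp + i] = (unsigned char)(v >> (8 * i));
}

/* Replays the question's two pushes, copies the top 16 bytes to out,
   and returns the length of the C string that rdi would point to. */
size_t replay_question_pushes(unsigned char out[16]) {
    sim_stack s;
    memset(s.mem, 0xCC, sizeof s.mem);   /* filler for untouched bytes */
    s.sp = sizeof s.mem;
    sim_push_imm32(&s, 0x736c2f2f);      /* "//ls" */
    sim_push_imm32(&s, 0x6e69622f);      /* "/bin" */
    memcpy(out, s.mem + s.sp, 16);
    return strlen((const char *)(s.mem + s.sp));
}
```

The returned length is 4, so in this model execve would see only "/bin" as the file name, with "//ls" stranded 8 bytes further up.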

x86-64 can't encode a 4-byte push, only 2 or 8 byte operand-size. The standard technique is to push 8 bytes at a time:

  mov   rdi, '/bin//ls'     ; 10-byte mov r64, imm64
  push  rdi
  mov   rdi, rsp

If you have an odd number of 4-byte chunks, the first one can be push imm32, then use 8-byte pairs. If it's not a multiple of 4, and you can't pad with redundant characters like /, mov dword [mem], imm32 that partially overlaps might help, or put a value in a register and shift to introduce a zero byte.

See

fpmurphy · Accepted Answer · 2020-01-21 04:58:51Z

-1

Load the following C example (modify if needed) into the Godbolt compiler explorer and you can see how various compilers typically generate assembly for a call to execve on the AMD64 (or other) architecture.

#include <stdio.h>
#include <unistd.h>

int 
main(void) {
   char* argv[] = { "/bin/ls", "-al", NULL };
   // char* argv[] = { "-al", NULL };
   // char* argv[] = { "/bin/lsxxx", "-al", NULL };
   // char* argv[] = { "", "-al", NULL };
   char* envp[] = { "PATH=/bin", NULL };

   if (execve("/bin/ls", argv, envp) == -1) {
      perror("Could not execve");
      return 1;
   }  
}

edited Jan 21, 2020 at 4:58

answered Jan 20, 2020 at 16:36

fpmurphy


5 Comments

Peter Cordes Over a year ago

That's basically useless for shellcode; gcc will just store pointers to string literals in the .rodata section. Without -fPIE you'll get mov [mem64], imm32 with an absolute address: useless for shellcode. With -fPIE you'll get a RIP-relative LEA into a register which only works in shellcode if the string is before the code (so the rel32 doesn't contain any 00 bytes).

Peter Cordes Over a year ago

Actually, with multiple strings this just doesn't work at all. You need each of them 0-terminated, so you can't just take the compiler's rodata and jump over it, because it has to contain zeros.

fpmurphy Over a year ago

@prl. Try compiling and running all 4 versions of argv[] in my example. On Linux, argv[0] can actually be anything, including an empty string. Not saying that this 'feature' is good or bad, but it is so.

fpmurphy Over a year ago

@PeterCordes. Of course it is useless for shellcode! However, where did the OP say they were dealing with shellcode?

Peter Cordes Over a year ago

It's tagged with shellcode, otherwise sure, maybe you could guess that the attempt at pushing parts of strings was just something they happened to try. Regardless, removed my downvote now that the answer is at least useful for non-shellcode usage. Of course, it will compile to a library function call, not an inline syscall, but fortunately for x86-64 Sys V the system calling convention matches the user-space function-calling convention except for RCX so you can replace call with syscall directly here.
