
Linux kernel source code reading: AArch64's exception handling mechanism in detail (kernel version 6.11)

2024-10-10 08:03:25

Anyone who has worked with the Arm64 architecture will have met Exception Levels (ELs). They are a core part of its exception handling mechanism and let the system run code at different privilege levels. ARM64 defines four exception levels, each with different privileges, functions, and access rights. Each level is described below:

The four exception levels and switching between them

1. EL0 (user mode)

  • Privilege level: lowest.
  • Functionality: ordinary user applications run here. EL0 has no direct access to hardware resources; every such access has to go through a system call into EL1.
  • Access restrictions: no direct access to key hardware such as the system control registers or the memory management unit.

2. EL1 (kernel mode)

  • Privilege level: higher than EL0.
  • Functionality: runs the operating system kernel (this is where Linux runs) and its device drivers. EL1 manages hardware resources and provides services to applications.
  • Access rights: can access most system resources, including the interrupt controller and timers. EL1 handles system calls and exceptions coming from EL0.

3. EL2 (hypervisor mode)

  • Privilege level: higher than EL1.
  • Functionality: mainly used for virtualization, where multiple virtual machines share the hardware. EL2 runs the hypervisor (virtual machine monitor), which manages the virtual machines and allocates resources to them.
  • Access rights: in addition to EL1 resources, it can control and manage the virtualized environment seen by EL0 and EL1.

4. EL3 (secure monitor mode)

  • Privilege level: highest.
  • Functionality: handles security-related tasks, such as managing the Trusted Execution Environment (TEE). EL3 is primarily used for exceptions related to system security.
  • Access rights: full access to all resources, including those of every other exception level.

Exception level switching

When an exception occurs, the processor takes it either at the current exception level or at a higher one, depending on the current level and the exception type; taking an exception never lowers the exception level. The switch usually involves:

  • Saving context: the hardware saves the state of the interrupted code, namely the return address (PC) into ELR_ELx and PSTATE into SPSR_ELx of the target level.
  • Entering the target exception level: execution continues at the handler entry that matches the type of exception.
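
As a rough illustration of the two steps above, the sketch below models in Python what the hardware does when an exception is taken to EL1 and what the later exception return undoes. The `Cpu` class and both function names are hypothetical and exist only for this illustration; the mode-field encodings (bits [3:2] give the exception level, bit 0 selects the stack pointer) are the architectural ones. It is heavily simplified: the real PSTATE has many more fields, and the exact return address depends on the exception type.

```python
from dataclasses import dataclass

# PSTATE.M[3:0] encodings for AArch64 state: bits [3:2] = EL, bit 0 = SP select.
MODE_EL0T = 0b0000  # EL0, using SP_EL0
MODE_EL1T = 0b0100  # EL1, using SP_EL0 ("thread")
MODE_EL1H = 0b0101  # EL1, using SP_EL1 ("handler")


@dataclass
class Cpu:
    """Hypothetical, heavily simplified CPU state (illustration only)."""
    pc: int
    mode: int          # current PSTATE.M[3:0]
    elr_el1: int = 0   # exception link register: where to return to
    spsr_el1: int = 0  # saved program status register


def current_el(mode: int) -> int:
    """Extract the exception level from a mode field."""
    return (mode >> 2) & 0b11


def take_exception_to_el1(cpu: Cpu, vector_address: int) -> None:
    """Model the hardware side of exception entry into EL1."""
    cpu.elr_el1 = cpu.pc     # save context: return address (simplified)
    cpu.spsr_el1 = cpu.mode  # save context: interrupted PSTATE (mode bits only)
    cpu.mode = MODE_EL1H     # enter EL1 on its own stack pointer, SP_EL1
    cpu.pc = vector_address  # continue at the selected vector entry


def exception_return(cpu: Cpu) -> None:
    """Model ERET: restore PC and PSTATE from ELR_EL1 and SPSR_EL1."""
    cpu.pc = cpu.elr_el1
    cpu.mode = cpu.spsr_el1
```

Everything else a handler needs (the general-purpose registers, for instance) is not saved by the hardware; software has to save it, which is what the kernel entry code examined later takes care of.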

Two kinds of exceptions

Exceptions fall into two broad categories, synchronous and asynchronous, depending on how they relate to the instruction stream being executed.

1. Synchronous exceptions

  • System call: when a user-mode (EL0) application needs an operating system service, it executes a system call instruction (svc), which raises an exception that is handled in kernel mode (EL1).
  • Page fault: raised when a program accesses an unmapped or otherwise invalid memory address. The Memory Management Unit (MMU) detects the fault, and the kernel's handler resolves it.
  • Illegal instruction: if the CPU tries to execute an invalid or undefined instruction, this exception is raised and control goes to the exception handler.
  • Floating-point exception: errors in floating-point operations, such as division by zero or overflow, can raise this exception when trapping of floating-point exceptions is supported and enabled.
  • Breakpoint exception: raised when a breakpoint instruction used for debugging is executed; control goes to the debug handler.
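
All of the causes above arrive through the same "sync" vector, so the handler needs a way to tell them apart. On AArch64 that information is the exception class (EC) field, bits [31:26] of the syndrome register ESR_ELx, which this article does not otherwise cover. The sketch below decodes that field for the cases in the list; the table is deliberately incomplete, and the function names are made up for the illustration.

```python
ESR_ELX_EC_SHIFT = 26   # EC occupies bits [31:26] of ESR_ELx
ESR_ELX_EC_MASK = 0x3F

# A small, incomplete subset of exception classes matching the list above.
EXCEPTION_CLASSES = {
    0x00: "unknown reason (for example an undefined instruction)",
    0x15: "SVC from AArch64 (system call)",
    0x20: "instruction abort from a lower exception level",
    0x21: "instruction abort from the current exception level",
    0x24: "data abort from a lower exception level",
    0x25: "data abort from the current exception level",
    0x2C: "trapped floating-point exception from AArch64",
    0x3C: "BRK instruction from AArch64 (software breakpoint)",
}


def exception_class(esr: int) -> int:
    """Extract the EC field from a raw ESR_ELx value."""
    return (esr >> ESR_ELX_EC_SHIFT) & ESR_ELX_EC_MASK


def describe(esr: int) -> str:
    """Map a raw ESR_ELx value to a human-readable cause."""
    return EXCEPTION_CLASSES.get(exception_class(esr), "not listed in this sketch")
```

For example, a syndrome value of 0x56000000 has EC 0x15, so `describe` reports a system call; page faults show up as instruction or data aborts.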

2. Asynchronous exceptions

  • External interrupt: generated by a hardware device (keyboard, network adapter, and so on) to notify the CPU of an event that needs servicing. It can interrupt the current execution flow at any point.
  • Timer interrupt: generated by the system timer to drive task scheduling and timekeeping. Timer interrupts let the system reschedule tasks periodically.
  • Power management interrupt: associated with power state changes, such as the signals generated when entering or leaving sleep.

How exceptions are handled (hard going!)

I spent some time reading the source and came to the following conclusions.

First, look under the path below and locate this code segment:

arch/arm64/kernel/

SYM_CODE_START(vectors)
	kernel_ventry	1, t, 64, sync		// Synchronous EL1t
	kernel_ventry	1, t, 64, irq		// IRQ EL1t
	kernel_ventry	1, t, 64, fiq		// FIQ EL1t
	kernel_ventry	1, t, 64, error		// Error EL1t

	kernel_ventry	1, h, 64, sync		// Synchronous EL1h
	kernel_ventry	1, h, 64, irq		// IRQ EL1h
	kernel_ventry	1, h, 64, fiq		// FIQ EL1h
	kernel_ventry	1, h, 64, error		// Error EL1h

	kernel_ventry	0, t, 64, sync		// Synchronous 64-bit EL0
	kernel_ventry	0, t, 64, irq		// IRQ 64-bit EL0
	kernel_ventry	0, t, 64, fiq		// FIQ 64-bit EL0
	kernel_ventry	0, t, 64, error		// Error 64-bit EL0

	kernel_ventry	0, t, 32, sync		// Synchronous 32-bit EL0
	kernel_ventry	0, t, 32, irq		// IRQ 32-bit EL0
	kernel_ventry	0, t, 32, fiq		// FIQ 32-bit EL0
	kernel_ventry	0, t, 32, error		// Error 32-bit EL0
SYM_CODE_END(vectors)

These are the entry points for the different exception vectors in the ARM64 architecture. Each kernel_ventry line describes how one exception type is entered for a given exception level and mode. They break down as follows:

  1. Exception level
    • EL1t (EL1, thread): the exception was taken from EL1 while it was using the SP_EL0 stack pointer.
    • EL1h (EL1, handler): the exception was taken from EL1 while it was using its own SP_EL1 stack pointer. The h does not stand for hypervisor; the Linux kernel normally runs in this mode.
    • EL0: the exception was taken from user mode, split into 64-bit and 32-bit cases.
  2. Type of exception
    • sync: synchronous exceptions, such as system calls or page faults.
    • irq: interrupt request (IRQ), an interrupt generated by an external device.
    • fiq: fast interrupt request (FIQ), a second interrupt line intended for higher-priority interrupts.
    • error: SError (system error), an asynchronous error report such as an asynchronous external abort. Undefined instructions are not reported here; they arrive as sync.
  3. Explanation of the macro arguments
    • kernel_ventry: an assembler macro that emits one entry of the vector table.
    • The first parameter (1 or 0) is the exception level the exception came from (1 for EL1, 0 for EL0).
    • The second parameter (t or h) is the stack pointer selection of the interrupted code (t for thread, SP_EL0; h for handler, SP_ELx).
    • The third parameter (64 or 32) is the register width of the interrupted code (AArch64 or AArch32).
    • The fourth parameter is the exception type (sync, irq, fiq, or error).

In other words, this part of the assembly file lays out the table of handlers for each exception type at each Exception Level.
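
Because the sixteen entries are emitted in a fixed order and each one occupies a 128-byte slot, the position of every entry follows from simple arithmetic. The sketch below computes those offsets relative to the start of the table; the names `GROUPS`, `KINDS`, `vector_offset`, and `vector_table` are invented for the illustration, and the group labels simply mirror the arguments passed to kernel_ventry.

```python
ENTRY_SIZE = 0x80  # each vector entry occupies a 128-byte slot

# Same order as the four blocks of kernel_ventry lines in the table.
GROUPS = ["el1t_64", "el1h_64", "el0t_64", "el0t_32"]
# Same order as the four lines inside each block.
KINDS = ["sync", "irq", "fiq", "error"]


def vector_offset(group: str, kind: str) -> int:
    """Byte offset of one entry from the start of the vector table."""
    index = GROUPS.index(group) * len(KINDS) + KINDS.index(kind)
    return index * ENTRY_SIZE


def vector_table() -> dict:
    """All sixteen entries, keyed by name, with their offsets."""
    return {f"{g}_{k}": vector_offset(g, k) for g in GROUPS for k in KINDS}


if __name__ == "__main__":
    for name, offset in vector_table().items():
        print(f"{offset:#05x}  {name}")
```

The hardware performs the same selection on its own: it adds the offset that matches the exception's origin and type to the table's base address and starts executing there, which is why the order of the lines in the table matters.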

That's great, but when you go looking, as most blogs do, for where the exception vector's handler actually lives, you may be surprised to find that you can't locate it. Where is it? Here it is!

The handlers are generated by a macro. The point, of course, is to simplify defining exception handlers in the ARM64 kernel: parameterizing the exception level, handler type, and register size makes it easy to stamp out many handlers, and this structured approach keeps the code neat and maintainable. (But it's a pain for people like me who like to read RECENT source code.)

	.macro entry_handler el:req, ht:req, regsize:req, label:req
SYM_CODE_START_LOCAL(el\el\ht\()_\regsize\()_\label)
	kernel_entry \el, \regsize
	mov x0, sp
	bl el\el\ht\()_\regsize\()_\label\()_handler // Call the matching exception handler
	.if \el == 0
	b ret_to_user
	.else
	b ret_to_kernel
	.endif
SYM_CODE_END(el\el\ht\()_\regsize\()_\label)
	.endm

/*
 * Early exception handlers
 */
	entry_handler 1, t, 64, sync
	entry_handler 1, t, 64, irq
	entry_handler 1, t, 64, fiq
	entry_handler 1, t, 64, error

	entry_handler 1, h, 64, sync
	entry_handler 1, h, 64, irq
	entry_handler 1, h, 64, fiq
	entry_handler 1, h, 64, error

	entry_handler 0, t, 64, sync
	entry_handler 0, t, 64, irq
	entry_handler 0, t, 64, fiq
	entry_handler 0, t, 64, error

	entry_handler 0, t, 32, sync
	entry_handler 0, t, 32, irq
	entry_handler 0, t, 32, fiq
	entry_handler 0, t, 32, error

No rush; let's go through it a piece at a time:

Macro definition structure

SYM_CODE_START_LOCAL(el\el\ht\()_\regsize\()_\label)
  • SYM_CODE_START_LOCAL: starts the definition of a local (file-scope) code symbol, here the entry point of one exception handler stub.
  • el: the exception level the exception came from (0 for EL0, 1 for EL1).
  • ht: the handler type, t or h, i.e. which stack pointer the interrupted code was using; it does not mean synchronous versus asynchronous.
  • regsize: the register width (32-bit or 64-bit).
  • label: the exception type (sync, irq, fiq, error), which forms the last part of the symbol name.

Handler entry logic

kernel_entry \el, \regsize
  • kernel_entry: another macro, which sets up the kernel entry environment according to el and regsize. It saves the interrupted context's registers into the frame reserved on the stack (a struct pt_regs), so that the handler has a full register snapshot to work with.
mov x0, sp
  • Copies the stack pointer (SP) into register x0. SP now points at the saved register frame, and x0 carries the first argument of a call, so the handler receives a pointer to that frame.
bl el\el\ht\()_\regsize\()_\label\()_handler
  • The bl instruction calls the actual exception handler function, whose name is built from the arguments passed to the macro plus the _handler suffix.

Return logic

.if \el == 0
    b ret_to_user
.else
    b ret_to_kernel
.endif
  • This part selects the return path according to the exception level the exception came from. Because .if is evaluated at assembly time, each generated stub contains only one of the two branches.
    • If el is 0, the exception came from user mode, so the stub branches to ret_to_user, the label that handles returning to user space.
    • If el is not 0, the exception came from kernel mode, so the stub branches to ret_to_kernel, which returns to the interrupted kernel code.

Macro termination

SYM_CODE_END(el\el\ht\()_\regsize\()_\label)
.endm
  • SYM_CODE_END: ends the symbol definition opened by SYM_CODE_START_LOCAL; .endm then marks the end of the macro.
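
To make the name pasting concrete, here is a small Python sketch that mimics what the assembler does with the macro arguments: it builds the stub's symbol name, the name of the function the stub calls, and the return label the stub branches to. The Python function names are invented for the illustration; the generated strings follow the pattern in the macro body.

```python
def entry_symbol(el: int, ht: str, regsize: int, label: str) -> str:
    """Name of the stub: el\\el\\ht\\()_\\regsize\\()_\\label after substitution."""
    return f"el{el}{ht}_{regsize}_{label}"


def handler_symbol(el: int, ht: str, regsize: int, label: str) -> str:
    """Name of the function the stub calls with bl."""
    return entry_symbol(el, ht, regsize, label) + "_handler"


def return_label(el: int) -> str:
    """Return path chosen by the .if \\el == 0 test."""
    return "ret_to_user" if el == 0 else "ret_to_kernel"


# The same sixteen combinations as the entry_handler invocations.
COMBINATIONS = [
    (el, ht, regsize, label)
    for (el, ht, regsize) in [(1, "t", 64), (1, "h", 64), (0, "t", 64), (0, "t", 32)]
    for label in ["sync", "irq", "fiq", "error"]
]

if __name__ == "__main__":
    for combo in COMBINATIONS:
        print(entry_symbol(*combo), "->", handler_symbol(*combo), "->", return_label(combo[0]))
```

So `entry_handler 0, t, 64, sync` produces a stub named el0t_64_sync that calls el0t_64_sync_handler and then branches to ret_to_user. Searching the source tree for the pasted-together name is the practical way to find where a given handler is defined.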

Very good! We get the idea: the stub saves the context, calls the real handler, and then picks the return path (kernel or user) according to where the exception came from. Finally, we need to unravel the mystery of kernel_ventry, the macro used in the vector table.

Go back toward the top of the same file:

	.macro kernel_ventry, el:req, ht:req, regsize:req, label:req
	.align 7
.Lventry_start\@:
	.if \el == 0
	/*
	 * This must be the first instruction of the EL0 vector entries. It is
	 * skipped by the trampoline vectors, to trigger the cleanup.
	 */
	b .Lskip_tramp_vectors_cleanup\@
	.if \regsize == 64
	mrs x30, tpidrro_el0
	msr tpidrro_el0, xzr
	.else
	mov x30, xzr
	.endif
.Lskip_tramp_vectors_cleanup\@:
	.endif

	sub sp, sp, #PT_REGS_SIZE
#ifdef CONFIG_VMAP_STACK
	/*
	 * Test whether the SP has overflowed, without corrupting a GPR.
	 * Task and IRQ stacks are aligned so that SP & (1 << THREAD_SHIFT)
	 * should always be zero.
	 */
	add sp, sp, x0 // sp' = sp + x0
	sub x0, sp, x0 // x0' = sp' - x0 = (sp + x0) - x0 = sp
	tbnz x0, #THREAD_SHIFT, 0f
	sub x0, sp, x0 // x0'' = sp' - x0' = (sp + x0) - sp = x0
	sub sp, sp, x0 // sp'' = sp' - x0 = (sp + x0) - x0 = sp
	b el\el\ht\()_\regsize\()_\label

0:
	/*
	 * Either we've just detected an overflow, or we've taken an exception
	 * while on the overflow stack. Either way, we won't return to
	 * userspace, and can clobber EL0 registers to free up GPRs.
	 */

	/* Stash the original SP (minus PT_REGS_SIZE) in tpidr_el0. */
	msr tpidr_el0, x0

	/* Recover the original x0 value and stash it in tpidrro_el0 */
	sub x0, sp, x0
	msr tpidrro_el0, x0

	/* Switch to the overflow stack */
	adr_this_cpu sp, overflow_stack + OVERFLOW_STACK_SIZE, x0

	/*
	 * Check whether we were already on the overflow stack. This may happen
	 * after panic() re-enables interrupts.
	 */
	mrs x0, tpidr_el0 // sp of interrupted context
	sub x0, sp, x0 // delta with top of overflow stack
	tst x0, #~(OVERFLOW_STACK_SIZE - 1) // within range?
	b.ne __bad_stack // no? -> bad stack pointer

	/* We were already on the overflow stack. Restore sp/x0 and carry on. */
	sub sp, sp, x0
	mrs x0, tpidrro_el0
#endif
	b el\el\ht\()_\regsize\()_\label // Same as before: finally, branch to the actual handler stub
.org .Lventry_start\@ + 128 // Did we overflow the ventry slot?
	.endm

Let's go through it slowly:

.macro kernel_ventry, el:req, ht:req, regsize:req, label:req
    .align 7
.Lventry_start\@:
    .if \el == 0
        b .Lskip_tramp_vectors_cleanup\@
        .if \regsize == 64
            mrs x30, tpidrro_el0
            msr tpidrro_el0, xzr
        .else
            mov x30, xzr
        .endif
.Lskip_tramp_vectors_cleanup\@:
    .endif

    sub sp, sp, #PT_REGS_SIZE
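  • The kernel_ventry macro defines the kernel's entry logic for each combination of exception level and register size.
  • Instruction description
    • .align 7 aligns each entry to 2^7 = 128 bytes, the size of one vector slot.
    • b .Lskip_tramp_vectors_cleanup\@ skips the cleanup code when the vector is entered directly. As the comment in the source says, the trampoline vectors jump past this branch so that the cleanup does run.
    • The cleanup depends on the register size: for 64-bit it restores x30 from tpidrro_el0 and clears that register; for 32-bit it simply zeroes x30.
    • sub sp, sp, #PT_REGS_SIZE reserves room on the stack for the saved register frame.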

Stack overflow check

#ifdef CONFIG_VMAP_STACK
    add sp, sp, x0
    sub x0, sp, x0
    tbnz x0, #THREAD_SHIFT, 0f
    sub x0, sp, x0
    sub sp, sp, x0
    b el\el\ht\()_\regsize\()_\label

0:
    msr tpidr_el0, x0
    sub x0, sp, x0
    msr tpidrro_el0, x0
    adr_this_cpu sp, overflow_stack + OVERFLOW_STACK_SIZE, x0
    mrs x0, tpidr_el0
    sub x0, sp, x0
    tst x0, #~(OVERFLOW_STACK_SIZE - 1)
    b.ne __bad_stack
    sub sp, sp, x0
    mrs x0, tpidrro_el0
#endif

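This code detects a kernel stack overflow and handles it.

  • It tests sp without corrupting any general-purpose register: x0 is temporarily folded into sp, bit THREAD_SHIFT of the original sp is tested, and then both registers are restored. Stacks are aligned so that this bit is always zero for a valid sp.
  • If the bit is set, an overflow has been detected: the original sp and x0 are stashed in tpidr_el0 and tpidrro_el0, and the code switches to the per-CPU overflow stack. If the exception was not taken while already on the overflow stack, it branches to __bad_stack.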
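
The add/sub sequence is easier to follow when replayed step by step. The Python sketch below imitates the five instructions of the fast path with 64-bit wraparound arithmetic and shows that both sp and x0 come back unchanged when no overflow is found. The function name is invented, and THREAD_SHIFT = 14 (16 KiB stacks) is an assumption made for the example; the real value depends on the kernel configuration.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide, arithmetic wraps around
THREAD_SHIFT = 14       # assumption for this sketch: 16 KiB kernel stacks


def overflow_check(sp: int, x0: int):
    """Replay the register-only check; returns (overflowed, sp, x0).

    On entry, sp is the stack pointer after PT_REGS_SIZE has already been
    subtracted, and x0 holds whatever the interrupted code left in it.
    """
    sp = (sp + x0) & MASK64       # add sp, sp, x0  -> sp' = sp + x0
    x0 = (sp - x0) & MASK64       # sub x0, sp, x0  -> x0' = original sp
    if x0 & (1 << THREAD_SHIFT):  # tbnz x0, #THREAD_SHIFT, 0f
        return True, sp, x0       # slow path: x0 holds the original sp
    x0 = (sp - x0) & MASK64       # sub x0, sp, x0  -> x0'' = original x0
    sp = (sp - x0) & MASK64       # sub sp, sp, x0  -> sp'' = original sp
    return False, sp, x0
```

With a stack whose base is aligned to twice its size, any sp inside the stack has bit THREAD_SHIFT clear, while an sp that has run just below the base has it set, so a single bit test is enough. On the slow path x0 still holds the original sp, which is exactly what the code at label 0 stashes in tpidr_el0.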

Entry jump

b el\el\ht\()_\regsize\()_\label
.org .Lventry_start\@ + 128  // Did we overflow the ventry slot?
.endm
  • Functionality: finally branches to the matching entry stub generated by entry_handler, which in turn calls the real handler. The .org directive makes the assembler fail if the code of one entry has grown past its 128-byte slot.

To summarize:

The modern kernel defines its exception vector table with assembly macros, which makes it extremely flexible: the entries are laid out in order, and a handler template generates the matching stub for each one. On entering a vector, the code first checks the stack (for an overflow or some other stack error), then branches to the stub, which saves the context and calls the actual handler function (that one lives elsewhere in the source and unfortunately isn't covered here 😦). Finally comes the cleanup and exit: the stub returns through the user or kernel path, depending on whether the exception came from user or kernel state, and the saved context is restored.

That's it!