
Design and Implementation of an Operating System - Chapter 23 Fast System Calls

2024-09-01 10:31:11

23.1 What is a Fast System Call

System calls are the means by which the operating system provides services to privilege level 3 tasks. In 32-bit operating systems, we implemented system calls through interrupts. Because system calls are used very frequently and interrupts were not designed specifically for them, 64-bit CPUs provide a dedicated mechanism: fast system calls.

A fast system call is initiated by the dedicated syscall instruction and returns through the dedicated sysret instruction. syscall always transfers from privilege level 3 to privilege level 0, and sysret always returns from privilege level 0 to privilege level 3. Fast system calls pass arguments in registers throughout, and the cs:rip of the system call function is fixed in advance, so neither syscall nor sysret takes any operands.

In short, every part of the fast system call mechanism is fixed in advance, which is what makes it efficient.

23.2 Installing Fast System Calls

Before fast system calls can be used, the components they depend on must be installed. This involves four MSRs.

23.2.1 IA32_EFER

The fast system call feature is turned off in its initial state. Its switch is bit 0 of IA32_EFER. This MSR, which we have already seen, is numbered 0xc0000080.

23.2.2 IA32_STAR

The lower 32 bits of this MSR are reserved; bits 32 to 47 set the privilege level 0 segment selector used by syscall; bits 48 to 63 set the privilege level 3 segment selector used by sysret.

Note that the text above deliberately says only "segment selector" without naming a specific segment, because these selector fields are defined in a rather odd way:

  • For bits 32 to 47, the value itself is used as the privilege level 0 code segment selector; the value plus 8 is used as the privilege level 0 data segment selector.
  • For bits 48 to 63, the value itself is used as the privilege level 3 compatibility mode code segment selector; the value plus 8 is used as the privilege level 3 data segment selector; and the value plus 16 is used as the privilege level 3 IA-32e mode code segment selector.

Which code segment does sysret actually choose, then? This question is discussed later in the chapter.

A segment selector is obtained by shifting the descriptor index left by 3 bits, so adding 8 to a selector selects the next descriptor in the GDT. In other words, bits 32 to 47 name the first of two consecutive segment descriptors, and bits 48 to 63 name the first of three consecutive segment descriptors. However, since our operating system never uses a compatibility mode code segment, that descriptor is not defined in our GDT.

The number of this MSR is 0xc0000081.
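
To make the selector arithmetic concrete, here is a minimal sketch in C that packs the two base selectors into an IA32_STAR value and derives the selectors from it. The helper names (gdt_selector, make_star and the star_* accessors) are hypothetical and are not taken from this chapter's code. The model is also simplified: on sysret, real hardware additionally forces the requested privilege level bits of the selectors to 3.

```c
#include <assert.h>
#include <stdint.h>

/* A selector is the GDT index shifted left by 3 (TI = 0, RPL = 0). */
static inline uint16_t gdt_selector(unsigned index)
{
    return (uint16_t)(index << 3);
}

/* Pack IA32_STAR: low 32 bits stay 0, bits 32..47 hold the syscall
 * base selector, bits 48..63 hold the sysret base selector. */
static inline uint64_t make_star(uint16_t syscall_base, uint16_t sysret_base)
{
    /* Both bases are plain GDT selectors, so they are multiples of 8. */
    assert((syscall_base & 7) == 0 && (sysret_base & 7) == 0);
    return ((uint64_t)syscall_base << 32) | ((uint64_t)sysret_base << 48);
}

/* Selectors used by syscall: base and base + 8. */
static inline uint16_t star_syscall_cs(uint64_t star)
{
    return (uint16_t)(star >> 32);
}

static inline uint16_t star_syscall_ss(uint64_t star)
{
    return (uint16_t)(star_syscall_cs(star) + 8);
}

/* Selectors used by sysret: base, base + 8 and base + 16. */
static inline uint16_t star_sysret_cs_compat(uint64_t star)
{
    return (uint16_t)(star >> 48);
}

static inline uint16_t star_sysret_ss(uint64_t star)
{
    return (uint16_t)(star_sysret_cs_compat(star) + 8);
}

static inline uint16_t star_sysret_cs_64(uint64_t star)
{
    return (uint16_t)(star_sysret_cs_compat(star) + 16);
}
```

With the GDT layout used later in this chapter (descriptors #3 and #4 for privilege level 0, #5 and #6 for privilege level 3), make_star(gdt_selector(3), gdt_selector(4)) gives 0x0020001800000000, and the sysret selectors come out as 0x28 for data and 0x30 for 64-bit code.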

23.2.3 IA32_LSTAR

This MSR holds the address of the system call function. Its number is 0xc0000082.

23.2.4 IA32_FMASK

This MSR holds the RFLAGS mask. Specifically, when syscall executes, rflags is changed as follows: rflags &= ~IA32_FMASK. In our operating system, this MSR is used to clear the IF bit, so the mask is 0x200; interrupts are therefore disabled on entry to the system call function.

The number of this MSR is 0xc0000084.
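
The four MSRs of this section can be collected into a few constants. The following C fragment is only a sketch: the macro and function names are assumptions made for illustration (this chapter's actual installation code is written in assembly), and it deliberately stops short of executing wrmsr, which is only possible at privilege level 0.

```c
#include <stdint.h>

/* MSR numbers as given in this section. */
#define IA32_EFER  0xc0000080u
#define IA32_STAR  0xc0000081u
#define IA32_LSTAR 0xc0000082u
#define IA32_FMASK 0xc0000084u

#define EFER_SCE   (1ull << 0) /* bit 0 of IA32_EFER: fast system call enable */
#define RFLAGS_IF  0x200ull    /* interrupt flag, the bit this OS masks */

/* Return the IA32_EFER value with fast system calls switched on. */
static inline uint64_t efer_enable_syscall(uint64_t efer)
{
    return efer | EFER_SCE;
}

/* wrmsr takes the MSR number in ecx and the value in edx:eax,
 * so a 64-bit value has to be split into two 32-bit halves. */
static inline void msr_split(uint64_t value, uint32_t *eax, uint32_t *edx)
{
    *eax = (uint32_t)value;
    *edx = (uint32_t)(value >> 32);
}
```

For IA32_FMASK the split is trivial (eax = 0x200, edx = 0), but for IA32_LSTAR the upper half carries the high bits of the kernel address.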

23.3 syscall Implementation Details

When syscall executes, the CPU performs the following operations:

  • rcx = rip
  • r11 = rflags
  • cs = IA32_STAR[32:47]
  • rip = IA32_LSTAR
  • rflags &= ~IA32_FMASK

In other words, rcx and r11 are used by syscall itself, so they cannot be used to pass parameters. In addition, syscall does nothing at all to rsp, which is an important issue that we will discuss below.

23.4 sysret Implementation Details

When sysret executes, the CPU performs the following operations:
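
The register effects listed above can be written as a small software model. This is a sketch for reasoning only: the struct and function names are hypothetical, and it covers just the five steps listed in this section, not everything the CPU does.

```c
#include <stdint.h>

/* The part of the CPU state that syscall reads or writes, plus rsp
 * to show that it is left alone. */
struct syscall_cpu {
    uint64_t rip, rflags, rcx, r11, rsp;
    uint16_t cs;
};

/* The three MSRs that syscall consults. */
struct syscall_msrs {
    uint64_t star, lstar, fmask;
};

static inline void model_syscall(struct syscall_cpu *cpu,
                                 const struct syscall_msrs *msr)
{
    cpu->rcx = cpu->rip;                   /* return address for sysret */
    cpu->r11 = cpu->rflags;                /* flags to restore on return */
    cpu->cs = (uint16_t)(msr->star >> 32); /* IA32_STAR[32:47] */
    cpu->rip = msr->lstar;                 /* system call function */
    cpu->rflags &= ~msr->fmask;            /* with 0x200 this clears IF */
    /* rsp is intentionally not touched: syscall never switches stacks. */
}
```

Running the model with rflags = 0x202 and a mask of 0x200 leaves rflags = 0x2 while r11 still holds 0x202, which is why r11 must survive until sysret.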

  • rip = rcx
  • rflags = r11
  • If sysret has no 64-bit prefix: cs = IA32_STAR[48:63]; otherwise: cs = IA32_STAR[48:63] + 16

That is to say:

  1. The operating system must preserve rcx and r11
  2. sysret requires a 64-bit prefix

Point 1 is discussed below; point 2 is achieved in nasm by writing o64 sysret.
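
The return path can be modeled the same way. As before, this is a simplified sketch with hypothetical names: it follows only the three steps listed in this section, takes the presence of the 64-bit prefix as a boolean parameter, and ignores the extra adjustments that real hardware applies to the selector and to rip when the prefix is absent.

```c
#include <stdbool.h>
#include <stdint.h>

/* The part of the CPU state that sysret reads or writes. */
struct sysret_cpu {
    uint64_t rip, rflags, rcx, r11;
    uint16_t cs;
};

static inline void model_sysret(struct sysret_cpu *cpu, uint64_t star,
                                bool has_o64_prefix)
{
    uint16_t base = (uint16_t)(star >> 48); /* IA32_STAR[48:63] */

    cpu->rip = cpu->rcx;    /* resume after the original syscall */
    cpu->rflags = cpu->r11; /* restore the caller's flags */
    /* Without the prefix the compatibility mode segment is chosen;
     * with it, the 64-bit code segment two descriptors further on. */
    cpu->cs = (uint16_t)(has_o64_prefix ? base + 16 : base);
}
```

If the handler has overwritten rcx or r11 before this point, the task resumes at the wrong address or with the wrong flags, which is the reason for point 1.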

23.5 Implementation of System Calls

See the code for this chapter, 23/

Line 3 declares the syscallInit function, which is implemented in assembly language.

Next, look at the code for this chapter, 23/

Lines 15 to 18 set bit 0 of IA32_EFER to 1, turning on the fast system call feature.

Lines 20 to 23 set IA32_STAR. In the GDT, descriptor #3 is the privilege level 0 code segment and descriptor #4 is the privilege level 0 data segment; these two descriptors correspond to bits 32 to 47 of IA32_STAR, so the selector of descriptor #3 goes there. Descriptor #5 is the privilege level 3 data segment and descriptor #6 is the privilege level 3 code segment, and there is no compatibility mode code segment. The selector of descriptor #4 therefore has to be installed into bits 48 to 63 of IA32_STAR, so that the +8 and +16 offsets land on descriptors #5 and #6.

Lines 25 to 29 install the address of the system call function, syscallHandle, into IA32_LSTAR.

Lines 31 to 34 install the mask 0x200 into IA32_FMASK.

At this point, fast system calls are ready to use.

The syscallHandle function is the system call function. In the 32-bit operating system, system calls were implemented with interrupts, and when an interrupt occurs the CPU automatically switches to the privilege level 0 stack. That stack is provided by the operating system, which guarantees that it is safe. What does "safe stack" mean, and what goes wrong if the stack is not switched? Consider the following example:

void test()
{
    char s[] = "666";
    __asm__ __volatile__("syscall");
}

Compiled to assembly language, this code might become:

test:
	mov dword [rsp - 4], '666'
	syscall
	ret

Notice that this function does not subtract 4 from rsp, and does not need to. But handing such an rsp to the system call function would be wrong, because the system call function cannot know how the caller is using the memory around the stack pointer: here, the first push would overwrite s. This is the problem with an unsafe stack, and it is why a switch to a safe stack is necessary at system call time.

However, syscall does not switch stacks automatically, so we have to do it manually. The privilege level 0 stack pointer is stored in the TSS, and the address of the TSS is 0xffff800000092000. But to use this address, the 64-bit immediate must first be loaded into a register. Which register? Whatever the ABI says, no choice seems ideal. This is where the IA32_GS_BASE we set up earlier comes in handy: through gs, the TSS can be accessed directly. Moreover, our operating system's TSS is extended to 128 bytes, and the small region of memory after the first 104 bytes can be used to back up the current rsp before it is changed. With that, the stack switch problem is solved.
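
The layout this relies on can be written down as a C struct. This is a sketch under assumptions: the first 104 bytes follow the standard 64-bit TSS format, while the field names, the name saved_user_rsp for the slot at offset 104, and the use of the remaining 16 bytes as spare space are illustrative guesses rather than something this chapter's code defines.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit TSS extended from 104 to 128 bytes. */
struct tss64_ext {
    uint32_t reserved0;
    uint64_t rsp0;           /* privilege level 0 stack pointer */
    uint64_t rsp1;
    uint64_t rsp2;
    uint64_t reserved1;
    uint64_t ist[7];         /* interrupt stack table entries 1..7 */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iopb_offset;
    uint64_t saved_user_rsp; /* assumed name: backup slot at TSS + 104 */
    uint8_t  spare[16];      /* assumed: rest of the extension */
} __attribute__((packed));

_Static_assert(offsetof(struct tss64_ext, rsp0) == 4, "rsp0 offset");
_Static_assert(offsetof(struct tss64_ext, saved_user_rsp) == 104, "slot offset");
_Static_assert(sizeof(struct tss64_ext) == 128, "extended TSS size");

/* Model of the stack switch: back up the user rsp in the TSS,
 * then return the privilege level 0 stack pointer to switch to. */
static inline uint64_t enter_kernel_stack(struct tss64_ext *tss,
                                          uint64_t user_rsp)
{
    tss->saved_user_rsp = user_rsp;
    return tss->rsp0;
}
```

Because gs already points at the TSS, both steps need only gs-relative memory operands in assembly, so no general-purpose register has to be sacrificed to hold the TSS address.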

Line 44 backs up rsp to [TSS + 104].

Line 45 switches to the privilege level 0 stack.

Lines 47 to 48 preserve rcx and r11. The stack is now safe to use.

Lines 50 to 51 call the function specified by rax.

Lines 53 to 54 restore rcx and r11.

Line 56 restores the privilege level 3 stack.

Line 58 returns from the fast system call.

Lines 60 to 63 define the system call table. System call #1 is reserved for use in subsequent chapters.

Next, look at the code for this chapter, 23/

The _start function is the real entry point of a privilege level 3 task; its purpose is to make the task exit automatically when it finishes.
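
The table lookup driven by rax can be illustrated in C. This is a hypothetical sketch rather than a translation of the assembly: the handler functions, their two-argument signature, and the bounds and NULL checks are all assumptions added for illustration. Only the shape of the table, with entry #1 left empty, follows the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed handler signature; the real argument registers are not
 * specified here. */
typedef int64_t (*syscall_fn)(uint64_t arg0, uint64_t arg1);

/* Placeholder handlers standing in for system calls 0 and 2. */
static int64_t sys_demo_zero(uint64_t arg0, uint64_t arg1)
{
    (void)arg1;
    return (int64_t)arg0;
}

static int64_t sys_demo_two(uint64_t arg0, uint64_t arg1)
{
    return (int64_t)(arg0 + arg1);
}

/* System call table: entry #1 is reserved, so it stays NULL. */
static const syscall_fn syscall_table[] = {
    sys_demo_zero,
    NULL,
    sys_demo_two,
};

/* Look up the handler selected by rax and call it. Out-of-range
 * or reserved numbers return -1 instead of jumping through garbage. */
static inline int64_t dispatch_syscall(uint64_t rax, uint64_t arg0,
                                       uint64_t arg1)
{
    size_t count = sizeof syscall_table / sizeof syscall_table[0];

    if (rax >= count || syscall_table[rax] == NULL)
        return -1;
    return syscall_table[rax](arg0, arg1);
}
```

A check of this kind matters because rax comes straight from a privilege level 3 task and cannot be trusted.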

23.6 Compilation and Testing

The Makefile for this chapter, 23/Makefile, adds the corresponding compilation and linking commands.

This chapter's code, 23/ and 23/, tests system calls 0 and 2.