
Go runtime scheduler in a nutshell (VIII): the sysmon thread and preempting long-running goroutines

2024-09-19 11:31:01

0. Preface

In Go runtime scheduler in a nutshell (VII): case study, we presented a case study of preemption. That article analyzed how preemption behaves but did not go down to the source-code level. In this article, we continue with the source code to see how the Go runtime scheduler implements the preemption logic.

1. sysmon thread

Recall that Go runtime scheduler in a nutshell (IV): running the main goroutine briefly mentioned the sysmon thread. sysmon is a monitor thread that runs on the system stack; it watches the state of goroutines and acts on what it finds. It is also responsible for preemption, which is the focus of this article.


sysmon is created in the runtime's main function (under src/runtime/):

// The main goroutine.
func main() {
	...
	if GOARCH != "wasm" { // no threads on wasm yet, so no sysmon
		systemstack(func() {
			newm(sysmon, nil, -1)
		})
	}
	...
}

sysmon does not need to be bound to a P; it runs as a monitor thread on the system stack. Stepping into sysmon:

func sysmon() {
	...
	idle := 0 // how many cycles in succession we had not wokeup somebody
	delay := uint32(0)

	for {
		if idle == 0 { // start with 20us sleep...
			delay = 20
		} else if idle > 50 { // start doubling the sleep after 1ms...
			delay *= 2
		}
		if delay > 10*1000 { // up to 10ms
			delay = 10 * 1000
		}
		usleep(delay) // sleep for delay microseconds

		now := nanotime()
		...
		// retake P's blocked in syscalls
		// and preempt long running G's
		if retake(now) != 0 {
			idle = 0
		} else {
			idle++
		}
		...
	}
}
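
The back-off at the top of the loop can be looked at in isolation. The following is a minimal sketch, not runtime code: nextDelay is a hypothetical helper that copies the three branches from the listing so the resulting sleep schedule can be printed.

```go
package main

import "fmt"

// nextDelay is a hypothetical helper that mirrors the back-off at the top
// of the sysmon loop: given the number of consecutive idle rounds and the
// previous delay, it returns the next sleep time in microseconds.
func nextDelay(idle int, delay uint32) uint32 {
	if idle == 0 {
		// The previous round found work: go back to a 20us sleep.
		delay = 20
	} else if idle > 50 {
		// More than 50 idle rounds in a row: double the sleep each round.
		delay *= 2
	}
	if delay > 10*1000 {
		// Never sleep longer than 10ms.
		delay = 10 * 1000
	}
	return delay
}

func main() {
	delay := uint32(0)
	for idle := 0; idle <= 62; idle++ {
		delay = nextDelay(idle, delay)
		if idle == 0 || idle >= 50 {
			fmt.Printf("idle=%d delay=%dus\n", idle, delay)
		}
	}
}
```

As long as retake keeps finding work, idle is reset to 0 and sysmon polls every 20us; only a long quiet period stretches the interval toward the 10ms cap.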

Most of sysmon is unrelated to preemption and is omitted here; the relevant part is the retake function. Stepping into retake:

func retake(now int64) uint32 {
	n := 0
	lock(&allpLock)

	// Visit every P.
	for i := 0; i < len(allp); i++ {
		pp := allp[i]
		if pp == nil {
			// This can happen if procresize has grown
			// allp but not yet created new Ps.
			continue
		}

		pd := &pp.sysmontick // sysmon's record of the scheduling and system-call ticks of the monitored P, with their timestamps
		s := pp.status
		sysretake := false
		if s == _Prunning || s == _Psyscall { // only handle a P that is _Prunning or _Psyscall
			// Preempt G if it's running for too long.
			t := int64(pp.schedtick) // P's schedtick counts how many times P has scheduled a goroutine
			if int64(pd.schedtick) != t {
				// sysmon's record is stale: store the new tick and the time it was observed.
				pd.schedtick = uint32(t)
				pd.schedwhen = now
			} else if pd.schedwhen+forcePreemptNS <= now { // forcePreemptNS is 10ms: the goroutine on P has run for at least 10ms
				preemptone(pp) // preempt the goroutine running on P
				// In case of syscall, preemptone() doesn't
				// work, because there is no M wired to P.
				sysretake = true // mark P for retaking
			}
		}
		...
	}
	unlock(&allpLock)
	return uint32(n)
}

The point here is that if the goroutine on a P has been running for too long, retake calls preemptone(pp) to preempt that P, i.e., to preempt the long-running goroutine.
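
The tick comparison in retake can be sketched on its own. This is an illustrative model under stated assumptions, not runtime code: proc, observation and shouldPreempt are hypothetical names standing in for P, sysmontick and the branch inside retake, and timestamps are passed in by hand instead of coming from nanotime.

```go
package main

import "fmt"

// forcePreemptNS is the 10ms time slice described above, in nanoseconds.
const forcePreemptNS = 10 * 1000 * 1000

// proc is a hypothetical stand-in for a P. schedtick grows every time
// the scheduler starts running a goroutine on it.
type proc struct {
	schedtick uint32
}

// observation is a hypothetical stand-in for sysmon's per-P record.
type observation struct {
	schedtick uint32 // schedtick seen on the previous visit
	schedwhen int64  // time at which that schedtick was first seen
}

// shouldPreempt reports whether pp has stayed on the same schedtick for
// at least forcePreemptNS. When the tick has changed, it only refreshes
// the record and restarts the clock.
func shouldPreempt(pp *proc, pd *observation, now int64) bool {
	if pd.schedtick != pp.schedtick {
		pd.schedtick = pp.schedtick
		pd.schedwhen = now
		return false
	}
	return pd.schedwhen+forcePreemptNS <= now
}

func main() {
	const ms = int64(1000 * 1000)
	pp := &proc{schedtick: 1}
	pd := &observation{}

	fmt.Println(shouldPreempt(pp, pd, 0*ms))  // first visit: record the tick
	fmt.Println(shouldPreempt(pp, pd, 5*ms))  // same tick, 5ms elapsed
	fmt.Println(shouldPreempt(pp, pd, 10*ms)) // same tick, 10ms elapsed
	pp.schedtick++                            // the scheduler switched goroutines
	fmt.Println(shouldPreempt(pp, pd, 20*ms)) // new tick: the clock restarts
}
```

sysmon never times an individual goroutine directly; it only notices that a P's schedtick has not moved between two visits that are at least 10ms apart.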

1.1 Preempting goroutines that take too long to run

Stepping into preemptone:

func preemptone(pp *p) bool {
	mp := pp.m.ptr() // the thread (M) bound to P
	if mp == nil || mp == getg().m {
		return false
	}
	gp := mp.curg // the goroutine running on that thread, i.e. the long-running goroutine
	if gp == nil || gp == mp.g0 {
		return false
	}

	gp.preempt = true // set the preemption flag

	// Every call in a goroutine checks for stack overflow by
	// comparing the current stack pointer to gp->stackguard0.
	// Setting gp->stackguard0 to StackPreempt folds
	// preemption into the normal stack overflow check.
	gp.stackguard0 = stackPreempt // as the comment above explains: set the goroutine's stackguard0 to stackPreempt, a value larger than any stack address

	// Request an async preemption of this P.
	if preemptMSupported && debug.asyncpreemptoff == 0 { // asynchronous preemption, ignored here
		pp.preempt = true
		preemptM(mp)
	}

	return true
}

As can be seen, the main thing preemptone does is update the goroutine's gp.stackguard0. Why update this field?

Mainly because the check at the start of every function reads this value, so the next time the goroutine calls a function, the runtime can determine whether the goroutine should be preempted.

Consider the following example program:

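package main

import "time"

func gpm() {
	print("hello runtime")
}

func main() {
	go gpm()
	time.Sleep(1 * time.Second) // give the goroutine time to run; the unit is assumed, it is missing in the source
	print("hello main")
}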

Set a breakpoint on the goroutine's function with dlv and continue to the breakpoint:

(dlv) b 
Breakpoint 1 set at 0x46232a for () ./:5
(dlv) c
> () ./:5 (hits goroutine(5):1 total:1) (PC: 0x46232a)
     1: package main
     2:
     3: import "time"
     4:
=>   5: func gpm() {
     6:         print("hello runtime")
     7: }
     8:
     9: func main() {
    10:         go gpm()
(dlv) disass
TEXT (SB) /root/go/src/foundation/gpm/
        :5       0x462320        493b6610        cmp rsp, qword ptr [r14+0x10]
        :5       0x462324        762a            jbe 0x462350
        :5       0x462326        55              push rbp
        :5       0x462327        4889e5          mov rbp, rsp
=>      :5       0x46232a*       4883ec10        sub rsp, 0x10
        :6       0x46232e        e82d28fdff      call $
        ...
        :5       0x462350        e8abb1ffff      call $runtime.morestack_noctxt
        :5       0x462355        ebc9            jmp $

At the start of the function, the first instruction executed is cmp rsp, qword ptr [r14+0x10], which compares the current stack pointer rsp with [r14+0x10]. r14 holds the current g, so [r14+0x10] is the goroutine's stackguard0. Because the stack grows downward, rsp greater than g.stackguard0 means there is enough stack space. If rsp is less than or equal to g.stackguard0, there is not enough stack space, and jbe 0x462350 jumps to call $runtime.morestack_noctxt to grow the stack.

If the goroutine is to be preempted, sysmon sets g.stackguard0 to a very large value (stackPreempt). The next time a function in that goroutine executes cmp rsp, qword ptr [r14+0x10], the stack pointer rsp is necessarily less than g.stackguard0, so the jump is taken and call $runtime.morestack_noctxt runs, exactly as if the stack needed to grow.

Stepping into runtime.morestack_noctxt:

// morestack but not preserving ctxt.
TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0
	MOVL $0, DX
	JMP runtime·morestack(SB)

TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0
	...
	// Much is omitted here; only the part relevant to preemption is shown.
	CALL runtime·newstack(SB)
	...

Stepping into newstack:

func newstack() {
	thisg := getg()
	...
	gp := thisg.m.curg
	...
	stackguard0 := atomic.Loaduintptr(&gp.stackguard0)
	preempt := stackguard0 == stackPreempt // if gp.stackguard0 == stackPreempt, set preempt to true
	if preempt {
		if !canPreemptM(thisg.m) { // check whether the M can be preempted right now
			// Let the goroutine keep running for now.
			// gp->preempt is set, so it will be preempted next time.
			gp.stackguard0 = gp.stack.lo + stackGuard // cannot preempt: restore gp.stackguard0 to its normal value
			gogo(&gp.sched) // never return; gogo resumes the goroutine
		}
	}
	...
	if preempt { // reaching here means the goroutine can be preempted; check the flag again
		if gp == thisg.m.g0 {
			throw("runtime: preempt g0")
		}
		if thisg.m.p == 0 && thisg.m.locks == 0 {
			throw("runtime: g is running but p is not")
		}

		...

		if gp.preemptStop { // preemptStop preemption is related to GC and is not discussed here
			preemptPark(gp) // never returns
		}

		// Act like goroutine called runtime.Gosched.
		gopreempt_m(gp) // never return; the focus here is gopreempt_m
	}
	...
}

newstack carries out the preemption logic. As the comments show, after several checks it calls gopreempt_m to preempt the long-running goroutine:

func gopreempt_m(gp *g) {
	goschedImpl(gp)
}

func goschedImpl(gp *g) {
	status := readgstatus(gp) // get the status of the goroutine
	if status&^_Gscan != _Grunning {
		dumpgstatus(gp)
		throw("bad g status")
	}
	casgstatus(gp, _Grunning, _Grunnable) // the goroutine is still running at this point; change its status to _Grunnable
	dropg() // unbind the thread from the goroutine
	lock(&sched.lock)
	globrunqput(gp) // put the goroutine on the global run queue; it has run long enough, so it does not go back on P's local queue
	unlock(&sched.lock)

	schedule() // the thread re-enters the scheduling loop and runs the next _Grunnable goroutine
}
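
The sequence in goschedImpl can be imitated with a toy scheduler. This is a simplified sketch, not runtime code: task and sched are hypothetical types, there is a single worker with only a global queue, and schedule returns here whereas the runtime's schedule never does.

```go
package main

import "fmt"

type status int

const (
	runnable status = iota
	running
)

// task is a hypothetical stand-in for a goroutine.
type task struct {
	name   string
	status status
}

// sched is a hypothetical scheduler with one worker and a global queue.
type sched struct {
	globalq []*task
	current *task // task bound to the worker, nil if none
}

// preempt follows the same steps as goschedImpl.
func (s *sched) preempt() {
	gp := s.current
	if gp == nil || gp.status != running {
		panic("bad task status")
	}
	// Like casgstatus(gp, _Grunning, _Grunnable).
	gp.status = runnable
	// Like dropg(): unbind the task from the worker.
	s.current = nil
	// Like globrunqput(gp): the task goes to the tail of the global queue.
	s.globalq = append(s.globalq, gp)
	// Like schedule(): pick something else to run.
	s.schedule()
}

// schedule runs the task at the head of the global queue, if any.
func (s *sched) schedule() {
	if len(s.globalq) == 0 {
		return
	}
	next := s.globalq[0]
	s.globalq = s.globalq[1:]
	next.status = running
	s.current = next
}

func main() {
	a := &task{name: "a", status: running}
	b := &task{name: "b", status: runnable}
	s := &sched{globalq: []*task{b}, current: a}

	s.preempt()
	fmt.Println("running:", s.current.name, "queued:", s.globalq[0].name)
}
```

Because the preempted task is appended to the tail of the queue, every task that was already waiting gets a turn before it runs again.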

At this point, we know how a goroutine that runs too long is preempted.

To recap the execution flow:

  1. The sysmon monitor thread finds a goroutine that has been running too long and sets the goroutine's stackguard0 to stackPreempt, a value larger than any stack address.
  2. When the goroutine makes a function call, the function prologue checks the goroutine's stack space by comparing the stack pointer rsp with g.stackguard0.
  3. Because stackguard0 has been updated, the check fails and the thread enters the stack growth logic (morestack and newstack), which performs the preemption scheduling according to the preempt flag.

2. Summary

This article introduced the sysmon thread and, through it, how the runtime preempts a goroutine that has been running for too long. The next article continues with sysmon and covers preempting goroutines whose system calls take too long.