Starting from Hello World

2023-09-02 10:19:25

Work has kept me from posting for a long time. I remember drafting this blog barely a month after switching from Java to Golang, and it has quietly sat in my drafts for three years now, orz. Back then I had written plenty of CRUD code but knew nothing about Golang's internals, which always felt a little shaky. So I put together this article on the tip of the iceberg that is the golang runtime. Enough chit-chat: the article is fairly long, so feel free to bookmark it and come back later. Fair warning, it is dry reading throughout.

The runtime is one of the core components of the Golang language. It is responsible for managing and scheduling goroutines, garbage collection, memory allocation, locking, and other low-level functionality.

  1. Goroutines

A goroutine is Golang's unit of concurrent execution. It is far lighter than an OS thread and is cheap to create and manage. The goroutine implementation is one of the core parts of the runtime, and it is built on an M:N threading model: many goroutines are multiplexed onto a small number of OS threads, which makes better use of system resources.
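
To get a feel for how cheap goroutines are, here is a minimal sketch (standard library only) that spawns ten thousand goroutines; the runtime multiplexes them onto a handful of OS threads instead of creating ten thousand threads:

package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(id int) { // each of these is a goroutine, not an OS thread
			defer wg.Done()
			_ = id * id
		}(i)
	}
	// The scheduler multiplexes all of them onto at most GOMAXPROCS running threads.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))
	wg.Wait()
}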

In the runtime package (runtime2.go), the g struct describes a goroutine's state and bookkeeping. A simplified excerpt:

type g struct {
    ...
    m            *m     // the M (OS thread) currently executing this goroutine
    atomicstatus uint32 // the goroutine's status
    ...
}

A goroutine's status can be one of the following (among others); a way to observe goroutine states from user code is sketched right after the list:

  • _Gidle: the goroutine has just been allocated and has not been initialized yet.
  • _Grunnable: the goroutine is on a run queue, ready to run, but not yet scheduled.
  • _Grunning: the goroutine is currently executing.
  • _Gsyscall: the goroutine is executing a system call.
  • _Gwaiting: the goroutine is blocked (on a channel, a lock, the netpoller, ...) and is not on a run queue.
  • _Gdead: the goroutine has exited and is currently unused; it may be kept on a free list and recycled.
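
These are internal states, but their user-visible counterparts are easy to observe. A minimal sketch, assuming nothing beyond the standard runtime.Stack API, which dumps every goroutine together with a wait reason (running, sleep, select, and so on) that maps onto the states above:

package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	go func() { select {} }()             // blocks forever: reported as waiting
	go func() { time.Sleep(time.Hour) }() // reported as sleeping

	time.Sleep(100 * time.Millisecond) // give both goroutines time to start

	buf := make([]byte, 1<<16)
	n := runtime.Stack(buf, true) // true: dump every goroutine, not just the current one
	fmt.Printf("%s\n", buf[:n])
}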

Goroutine scheduling is done by the scheduler, another core part of the runtime: it assigns runnable goroutines to available threads. The scheduler lives in runtime/proc.go, and its most important function is schedule(), which picks a goroutine and hands it to the current thread to execute.

  2. Garbage collection

Garbage collection is another important part of the Golang runtime. In Golang, GC runs automatically inside the runtime; it reclaims memory that is no longer in use and prevents memory leaks.

Golang's collector is a mark-and-sweep collector (since Go 1.5 a concurrent, tri-color mark-and-sweep). The algorithm has two phases: mark and sweep. In the mark phase the collector traverses the object graph from the roots and marks every object that is still reachable; in the sweep phase it frees the memory of every object that was not marked.
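
You can watch the collector do its work from user code. A small sketch using only runtime.GC and runtime.ReadMemStats (running the program with GODEBUG=gctrace=1 additionally prints one line per GC cycle):

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats

	// Produce some garbage that only the collector can reclaim.
	for i := 0; i < 1000; i++ {
		_ = make([]byte, 1<<20)
	}

	runtime.ReadMemStats(&m)
	fmt.Printf("before GC: HeapAlloc=%d KB, NumGC=%d\n", m.HeapAlloc/1024, m.NumGC)

	runtime.GC() // force a full mark-and-sweep cycle

	runtime.ReadMemStats(&m)
	fmt.Printf("after  GC: HeapAlloc=%d KB, NumGC=%d\n", m.HeapAlloc/1024, m.NumGC)
}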

The collector's implementation lives in runtime/mgc.go: gcStart kicks off a GC cycle, the mark workers traverse and mark the objects that are still in use, and the sweeper then returns the memory of unmarked objects to the heap.

  3. Memory allocation

Memory allocation is another important part of the runtime. In Golang, allocation is handled automatically by the runtime, which manages every allocation and release the program makes.

In runtime/malloc.go, the mallocgc function performs allocation: it carves a block out of the heap and returns a pointer to it. Allocated memory is then managed by the garbage collector, and once it is no longer referenced it is released back to the heap.
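
Whether a value is heap-allocated at all (and therefore goes through mallocgc) is decided by escape analysis at compile time. An illustrative sketch (the point type and newPoint helper are just made-up examples): building it with go build -gcflags "-m" prints the escape decisions, and the returned pointer below forces the value onto the heap:

package main

type point struct{ x, y int }

// newPoint returns a pointer to a local value; escape analysis decides the
// value must outlive the function, so the compiler allocates it on the heap
// via the runtime allocator (runtime.newobject, which calls mallocgc).
func newPoint(x, y int) *point {
	p := point{x, y}
	return &p // reported as "moved to heap: p" in the -gcflags "-m" output
}

func main() {
	_ = newPoint(1, 2)
}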

  4. Locks

Locks are another core part of the runtime. In Golang, locks protect shared resources and prevent the race conditions that occur when multiple goroutines access the same resource at once.

The runtime's internal mutex struct (defined in runtime/runtime2.go) describes the lock's state:

type mutex struct {
    // Futex-based implementations treat this as a uint32 key;
    // semaphore-based implementations treat it as an M* waitm.
    key uintptr
}

On futex-based platforms the lock word takes one of the following values:

  • mutex_unlocked: the lock is free.
  • mutex_locked: the lock is held.
  • mutex_sleeping: the lock is held and at least one thread is sleeping on the futex waiting for it.

In runtime/lock_futex.go, the lock function acquires the mutex and unlock releases it; on Linux they use the kernel's futex mechanism to block and wake the waiting threads.
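
That runtime mutex is internal to the runtime itself. At the application level the analogous tool is sync.Mutex, whose contended slow path also ends up parking goroutines on runtime primitives. A minimal sketch of using it to protect a shared counter:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var (
		mu      sync.Mutex
		counter int
		wg      sync.WaitGroup
	)

	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // on contention the goroutine is eventually parked by the runtime
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println("counter =", counter) // always 100 thanks to the mutex
}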

In short, the runtime is one of the core components of the Golang language, responsible for goroutine management and scheduling, garbage collection, memory allocation, locking, and other low-level functionality. Understanding it gives us a much better grasp of Golang's concurrency model and helps us write concurrent programs more effectively.

Today we will start from a simple hello world program, trace how a golang process starts up and schedules work, and glimpse the tip of the golang runtime iceberg.

Anyone who writes code for a living knows that every program has a main function as its entry point, and golang (go 1.13.5 here) is no exception. However, the user-defined main.main is not the real entry point; before it runs there is a stretch of Plan 9 assembly bootstrap code. Next we will use gdb to locate the real entry point step by step and gradually work our way into the golang runtime.

I will skip the gdb installation. Let's start by writing a golang version of hello world:

package main

func main() {
	println("hello world")
}

Next we compile the source with go build -gcflags "-N -l" -o hello hello.go (the -gcflags "-N -l" flags disable compiler optimizations and function inlining, so breakpoints and single-stepping map accurately onto source lines and small functions and local variables are not optimized away), then run gdb hello:

root@4ff18d748169:/home/workspace# gdb hello
(gdb) info files
Symbols from "/home/workspace/hello".
Local exec file:
	`/home/workspace/hello', file type elf64-x86-64.
	Entry point: 0x44d730
	0x0000000000401000 - 0x0000000000452601 is .text
	0x0000000000453000 - 0x000000000048379f is .rodata
	0x0000000000483960 - 0x00000000004840dc is .typelink
	0x00000000004840e0 - 0x00000000004840e8 is .itablink
	0x00000000004840e8 - 0x00000000004840e8 is .gosymtab
	0x0000000000484100 - 0x00000000004c17f3 is .gopclntab
	0x00000000004c2000 - 0x00000000004c2020 is .go.buildinfo
	0x00000000004c2020 - 0x00000000004c2c08 is .noptrdata
	0x00000000004c2c20 - 0x00000000004c4ab0 is .data
	0x00000000004c4ac0 - 0x00000000004dff30 is .bss
	0x00000000004dff40 - 0x00000000004e2668 is .noptrbss
	0x0000000000400f9c - 0x0000000000401000 is .note.go.buildid
(gdb) b *0x44d730
Note: breakpoint 1 also set at pc 0x44d730.
Breakpoint 2 at 0x44d730: file /home/app/go/src/runtime/rt0_linux_amd64.s, line 8.

As you can see, finding the real entry point of a go program is easy. Next we will step through it and see how a go process initializes itself.

Initialization

The file name in the breakpoint leads us to the corresponding assembly source:

#include "textflag.h"

TEXT _rt0_amd64_linux(SB),NOSPLIT,$-8
	JMP	_rt0_amd64(SB)

TEXT _rt0_amd64_linux_lib(SB),NOSPLIT,$0
	JMP	_rt0_amd64_lib(SB)

It simply performs an unconditional jump to _rt0_amd64(SB). Set another breakpoint in gdb to find where that jump lands:

(gdb) b _rt0_amd64
Breakpoint 8 at 0x449d60: file /home/app/go/src/runtime/asm_amd64.s, line 15.

And the corresponding source:

// _rt0_amd64 is common startup code for most amd64 systems when using
// internal linking. This is the entry point for the program from the
// kernel for an ordinary -buildmode=exe program. The stack holds the
// number of arguments and the C-style argv.
TEXT _rt0_amd64(SB),NOSPLIT,$-8
	MOVQ	0(SP), DI	// argc
	LEAQ	8(SP), SI	// argv
	JMP	runtime·rt0_go(SB)

It then jumps unconditionally again, this time to runtime·rt0_go(SB). Same trick:

(gdb) b runtime.rt0_go
Breakpoint 3 at 0x449d70: file /home/app/go/src/runtime/asm_amd64.s, line 89.


TEXT runtime·rt0_go(SB),NOSPLIT,$0
... ...
// When the process starts there is always one thread (the main thread):
// its current stack and resources are recorded in g0,
// and the thread itself is recorded in m0.
// set the per-goroutine and per-mach "registers"
	get_tls(BX)
	LEAQ	runtime·g0(SB), CX
	MOVQ	CX, g(BX)
	LEAQ	runtime·m0(SB), AX
	// bind m0 and g0 to each other
	// save m->g0 = g0
	MOVQ	CX, m_g0(AX)
	// save m0 to g0->m
	MOVQ	AX, g_m(CX)

	CLD				// convention is D is always left cleared
	CALL	runtime·check(SB)

	MOVL	16(SP), AX		// copy argc
	MOVL	AX, 0(SP)
	MOVQ	24(SP), AX		// copy argv
	MOVQ	AX, 8(SP)
	// process command-line args
	CALL	runtime·args(SB)
	// OS init (os_linux.go): essentially just determines the number of CPUs
	CALL	runtime·osinit(SB)
	// scheduler init (proc.go)
	CALL	runtime·schedinit(SB)

	// create a new goroutine whose task is to start the program
	// create a new goroutine to start program
	MOVQ	$runtime·mainPC(SB), AX		// entry
	PUSHQ	AX
	PUSHQ	$0			// arg size
	CALL	runtime·newproc(SB)
	POPQ	AX
	POPQ	AX
	// start this thread and kick off the scheduler
	// start this M
	CALL	runtime·mstart(SB)

	CALL	runtime·abort(SB)	// mstart should never return
	RET

	// Prevent dead-code elimination of debugCallV1, which is
	// intended to be called by debuggers.
	MOVQ	$runtime·debugCallV1(SB), AX
	RET

DATA	runtime·mainPC+0(SB)/8,$runtime·main(SB)
GLOBL	runtime·mainPC(SB),RODATA,$8

The initialization performed in asm_amd64.s is actually quite involved; here we only look at a few steps we care about:

  • Command-line argument handling
  • OS initialization
  • Scheduler initialization

Command-line argument handling

(gdb) b runtime.args
Breakpoint 4 at 0x432b60: file /home/app/go/src/runtime/runtime1.go, line 60.

func args(c int32, v **byte) {
	argc = c
	argv = v
	sysargs(c, v)
}

The args function simply records the command-line arguments (argc/argv) for later use.
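
The values recorded here eventually surface in user code as os.Args (goargs(), called later in schedinit, copies argv into the slice that backs it). A minimal sketch:

package main

import (
	"fmt"
	"os"
)

func main() {
	// os.Args is backed by the argv that runtime.args captured at startup.
	fmt.Println("argc =", len(os.Args))
	for i, a := range os.Args {
		fmt.Printf("argv[%d] = %s\n", i, a)
	}
}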

OS initialization

runtime.osinit essentially does one thing: determine the number of CPU cores (plus the huge page size).

(gdb) b runtime.osinit
Breakpoint 5 at 0x423030: file /home/app/go/src/runtime/os_linux.go, line 289.

func osinit() {
	ncpu = getproccount()
	physHugePageSize = getHugePageSize()
}
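
The ncpu value determined here is what runtime.NumCPU() later reports, and it is also the default P count. A tiny sketch to check both on your machine:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Println("NumCPU     =", runtime.NumCPU())      // the ncpu value determined by osinit
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0)) // number of P's, defaults to NumCPU
}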

Scheduler initialization

The comment on schedinit() already sketches the bootstrap sequence for us, and most of the runtime-environment setup we care about is invoked from here.

(gdb) b runtime.schedinit
Breakpoint 6 at 0x427690: file /home/app/go/src/runtime/proc.go, line 529.
// The bootstrap sequence is:
//
// call osinit
// call schedinit
// make & queue new G
// call runtime·mstart
//
// The new G calls runtime·main.
func schedinit() {
 // raceinit must be the first call to race detector.
 // In particular, it must be done before mallocinit below calls racemapshadow.
 _g_ := getg()
 if raceenabled {
  _g_.racectx, raceprocctx0 = raceinit()
 }
 // limit on the maximum number of OS threads (M's)
 sched.maxmcount = 10000

 tracebackinit()
 moduledataverify()
 // stack-related initialization
 stackinit()
 // memory allocator initialization
 mallocinit()
 // common initialization of the current M
 mcommoninit(_g_.m)
 cpuinit()       // must run before alginit
 alginit()       // maps must not be used before this call
 modulesinit()   // provides activeModules
 typelinksinit() // uses maps, activeModules
 itabsinit()     // uses activeModules

 msigsave(_g_.m)
 initSigmask = _g_.m.sigmask
 // process command-line arguments and environment variables
 goargs()
 goenvs()
 // parse debug-related environment variables such as GODEBUG and GOTRACEBACK
 parsedebugvars()
 // garbage collector initialization
 gcinit()

 sched.lastpoll = uint64(nanotime())
 // determine the number of P's from the CPU count and the GOMAXPROCS environment variable
 procs := ncpu // defaults to the number of CPUs
 if n, ok := atoi32(gogetenv("GOMAXPROCS")); ok && n > 0 {
  procs = n
 }
 // adjust the number of P's
 if procresize(procs) != nil {
  throw("unknown runnable goroutine during bootstrap")
 }

 // For cgocheck > 1, we turn on the write barrier at all times
 // and check all pointer writes. We can't do this until after
 // procresize because the write barrier needs a P.
 if debug.cgocheck > 1 {
  writeBarrier.cgo = true
  writeBarrier.enabled = true
  for _, p := range allp {
   p.wbBuf.reset()
  }
 }

 if buildVersion == "" {
  // Condition should never trigger. This code just serves
  // to ensure runtime·buildVersion is kept in the resulting binary.
  buildVersion = "unknown"
 }
 if len(modinfo) == 1 {
  // Condition should never trigger. This code just serves
  // to ensure runtime·modinfo is kept in the resulting binary.
  modinfo = ""
 }
}

Based on the comments, a go process's startup roughly breaks down into the following steps (a small GOMAXPROCS example follows the list):

  1. call osinit: runtime.osinit() determines the number of CPUs.
  2. call schedinit: runtime.schedinit() initializes the scheduler, creates and initializes the P's, and binds m0 to one of them.
  3. make & queue new G: runtime.newproc creates a new goroutine (the main goroutine) whose task function is runtime.main and places it on the local run queue of the P bound to m0.
  4. call runtime·mstart: runtime.mstart starts the M; once running, it can pull the runtime.main task from its P's local queue and schedule it.
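
The P count chosen in schedinit is not fixed forever: calling runtime.GOMAXPROCS(n) from user code stops the world and resizes the set of P's through the same procresize path. A minimal sketch:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	old := runtime.GOMAXPROCS(2) // stop the world and resize to 2 P's
	fmt.Println("previous P count:", old)
	fmt.Println("current  P count:", runtime.GOMAXPROCS(0)) // passing 0 only queries
}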

Starting the scheduler

So how does CALL runtime·mstart(SB) start the M and begin scheduling? There are no secrets in front of the source code:

(gdb) b runtime.mstart
Breakpoint 9 at 0x429150: file /home/app/go/src/runtime/proc.go, line 1146.

A glance at the source:

// mstart is the entry-point for new Ms.
//
// This must not split the stack because we may not even have stack
// bounds set up yet.
//
// May run during STW (because it doesn't have a P yet), so write
// barriers are not allowed.
//
//go:nosplit
//go:nowritebarrierrec
func mstart() {
 // get g0
 _g_ := getg()
 ... ...
 // the real work happens in mstart1()
 mstart1()
  ... ...
}

func mstart1() {
 // get g0
 _g_ := getg()
 // make sure this g is the system-stack g0; the scheduler may only run on g0
 if _g_ != _g_.m.g0 {
  throw("bad runtime·mstart")
 }

 // Record the caller for use as the top of stack in mcall and
 // for terminating the thread.
 // We're never coming back to mstart1 after we call schedule,
 // so other calls can reuse the current frame.
 save(getcallerpc(), getcallersp())
 asminit()
 // initialize the M: mainly sets up the thread's alternate signal stack and signal mask (not explored in depth here)
 minit()

 // Install signal handlers; after minit so that minit can
 // prepare the thread to be able to handle the signals.
 // if the M bound to _g_ is m0, run mstartm0()
 if _g_.m == &m0 {
  // the initial M needs some special handling, mainly installing the signal handlers
  mstartm0()
 }

 // if the M has a start task function (e.g. sysmon), run it;
 // m0 has no mstartfn
 if fn := _g_.m.mstartfn; fn != nil {
  fn()
 }

 if _g_.m != &m0 { // any M other than m0 needs to acquire a P here
  // bind the P
  acquirep(_g_.m.nextp.ptr())
  _g_.m.nextp = 0
 }
 // enter the scheduler; this never returns
 schedule()
}

mstart() itself does little more than call mstart1(); it is mstart1() that performs the setup work before scheduling begins:

  • Call getg() to get the current g; if it is not g0, throw immediately, because the scheduler only runs on g0.
  • Initialize the M, mainly setting up the thread's alternate signal stack and signal mask (not explored in depth here).
  • If the M bound to the g is m0, do some extra handling, mainly installing the signal handlers.
  • If the M has a start task function, run it.

m0 is the first thread started with the process. It is no different from any other m, except that m0 is set up by the assembly bootstrap code while ordinary m's are created by the runtime itself; a golang process has exactly one m0. As for g0: every m has a g0, because every m has a system stack. g0 has the same structure as an ordinary g, but its stack is allocated by the system; for m0 on Linux that is the default thread stack of 8 MB, which can neither grow nor shrink. An ordinary g, by contrast, starts with a stack of only 2 KB that grows on demand. g0 carries no task function and no meaningful state, and it cannot be preempted by the scheduler, because the scheduler itself runs on g0.
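
The "small stack that grows" claim is easy to demonstrate: the recursion below needs far more than 2 KB of stack, and the runtime grows the goroutine's stack transparently (via morestack/newstack), which the fixed g0 system stack could not do. A minimal sketch (the deep helper is just an example):

package main

import "fmt"

// deep recurses far beyond what a 2 KB stack could hold; the runtime grows
// the goroutine's stack transparently as needed (morestack/newstack).
func deep(n int) int {
	if n == 0 {
		return 0
	}
	var pad [128]byte // burn some stack space in every frame
	_ = pad
	return 1 + deep(n-1)
}

func main() {
	done := make(chan int)
	go func() { done <- deep(100000) }()
	fmt.Println("recursion depth reached:", <-done)
}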

The actual scheduling logic lives in schedule():

// One round of scheduler: find a runnable goroutine and execute it.
// Never returns.
func schedule() {
 _g_ := getg()

 if _g_.m.locks != 0 {
  throw("schedule: holding locks")
 }

 if _g_.m.lockedg != 0 {
  stoplockedm()
  execute(_g_.m.lockedg.ptr(), false) // Never returns.
 }

 // We should not schedule away from a g that is executing a cgo call,
 // since the cgo call is using the m's g0 stack.
 if _g_.m.incgo {
  throw("schedule: in cgo")
 }

top:
 // if GC needs a stop-the-world, gcstopm() puts the current M to sleep
 if sched.gcwaiting != 0 {
  gcstopm()
  // once the STW is over, go back to top
  goto top
 }
 if _g_.m.p.ptr().runSafePointFn != 0 {
  runSafePointFn()
 }

 var gp *g
 var inheritTime bool

 // Normal goroutines will check for need to wakeP in ready,
 // but GCworkers and tracereaders will not, so the check must
 // be done here instead.
 tryWakeP := false
 if trace.enabled || trace.shutdown {
  gp = traceReader()
  if gp != nil {
   casgstatus(gp, _Gwaiting, _Grunnable)
   traceGoUnpark(gp, 0)
   tryWakeP = true
  }
 }
 if gp == nil && gcBlackenEnabled != 0 {
  gp = gcController.findRunnableGCWorker(_g_.m.p.ptr())
  tryWakeP = tryWakeP || gp != nil
 }
 if gp == nil {
  // Check the global runnable queue once in a while to ensure fairness.
  // Otherwise two goroutines can completely occupy the local runqueue
  // by constantly respawning each other.
   // every 61st scheduling tick, pull a goroutine from the global queue to keep it from starving
   if _g_.m.p.ptr().schedtick%61 == 0 && sched.runqsize > 0 {
   lock(&sched.lock)
   gp = globrunqget(_g_.m.p.ptr(), 1)
   unlock(&sched.lock)
  }
 }
 if gp == nil {
   // get a goroutine from the P's local run queue
  gp, inheritTime = runqget(_g_.m.p.ptr())
  if gp != nil && _g_.m.spinning {
   throw("schedule: spinning with local work")
  }
 }
 if gp == nil {
   // findrunnable() tries everything to find a goroutine and does not return until it has one
  gp, inheritTime = findrunnable() // blocks until work is available
 }

 // This thread is going to run a goroutine and is not spinning anymore,
 // so if it was marked as spinning we need to reset it now and potentially
 // start a new spinning M.
 if _g_.m.spinning {
  resetspinning()
 }

 if sched.disable.user && !schedEnabled(gp) {
  // Scheduling of this goroutine is disabled. Put it on
  // the list of pending runnable goroutines for when we
  // re-enable user scheduling and look again.
  lock(&sched.lock)
  if schedEnabled(gp) {
   // Something re-enabled scheduling while we
   // were acquiring the lock.
   unlock(&sched.lock)
  } else {
   sched.disable.runnable.pushBack(gp)
    sched.disable.n++
   unlock(&sched.lock)
   goto top
  }
 }

 // If about to schedule a not-normal goroutine (a GCworker or tracereader),
 // wake a P if there is one.
 if tryWakeP {
  if atomic.Load(&sched.npidle) != 0 && atomic.Load(&sched.nmspinning) == 0 {
   wakep()
  }
 }
 if gp.lockedm != 0 {
  // Hands off own p to the locked m,
  // then blocks waiting for a new p.
  startlockedm(gp)
  goto top
 }
 // we have a goroutine: run its task function
 execute(gp, inheritTime)
}

How the scheduler finds a goroutine

Once a thread is running it needs to find a runnable goroutine. The logic is roughly:

  1. Every 61st scheduling tick, fetch a goroutine from the global queue, so the global queue never starves.
  2. If the global queue yielded nothing, fetch from the local queue of the P bound to this M.
  3. If the local queue is also empty, call findrunnable(), which does not return until it has found work.

Fetching from the global queue

if _g_.m.p.ptr().schedtick%61 == 0 && sched.runqsize > 0: every 61st scheduling tick, globrunqget() takes goroutines from the global queue. The logic is straightforward:

// Try get a batch of G's from the global runnable queue.
// Sched must be locked.
func globrunqget(_p_ *p, max int32) *g {
 if sched.runqsize == 0 {
  return nil
 }

 n := sched.runqsize/gomaxprocs + 1
 if n > sched.runqsize {
  n = sched.runqsize
 }
 if max > 0 && n > max {
  n = max
 }
 if n > int32(len(_p_.runq))/2 {
  n = int32(len(_p_.runq)) / 2
 }

 sched.runqsize -= n

 gp := sched.runq.pop()
 n--
 for ; n > 0; n-- {
  gp1 := sched.runq.pop()
  runqput(_p_, gp1, false)
 }
 return gp
}
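
The batch-size arithmetic above is easy to work through by hand. A small sketch that mirrors it with made-up numbers (runqsize=100, gomaxprocs=4, a local queue capacity of 256): schedule() passes max=1, so the fairness check takes exactly one G, while findrunnable() passes max=0 and moves a whole share into the local queue:

package main

import "fmt"

// batchSize mirrors the arithmetic in globrunqget: take an equal share of the
// global queue, but never more than max (if set) and never more than half the
// local run queue's capacity.
func batchSize(runqsize, gomaxprocs, max, localCap int32) int32 {
	n := runqsize/gomaxprocs + 1
	if n > runqsize {
		n = runqsize
	}
	if max > 0 && n > max {
		n = max
	}
	if n > localCap/2 {
		n = localCap / 2
	}
	return n
}

func main() {
	fmt.Println(batchSize(100, 4, 1, 256)) // 1  (schedule()'s fairness check, max=1)
	fmt.Println(batchSize(100, 4, 0, 256)) // 26 (findrunnable(), max=0: a full share)
}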

Fetching from the local queue

If nothing was taken from the global queue, we try the local queue next.

// Get g from local runnable queue.
// If inheritTime is true, gp should inherit the remaining time in the
// current time slice. Otherwise, it should start a new time slice.
// Executed only by the owner P.
func runqget(_p_ *p) (gp *g, inheritTime bool) {
 // If there's a runnext, it's the next G to run.
 for {
  next := _p_.runnext
  if next == 0 {
   break
  }
  if _p_.runnext.cas(next, 0) {
   return next.ptr(), true
  }
 }

 for {
  h := atomic.LoadAcq(&_p_.runqhead) // load-acquire, synchronize with other consumers
  t := _p_.runqtail
  if t == h {
   return nil, false
  }
  gp := _p_.runq[h%uint32(len(_p_.runq))].ptr()
  if atomic.CasRel(&_p_.runqhead, h, h+1) { // cas-release, commits consume
   return gp, false
  }
 }
}

findrunnable

If the local queue comes up empty as well, findrunnable() is called to hunt for a g; if even that fails, the M is put to sleep and waits to be woken up.

// Finds a runnable goroutine to execute.
// Tries to steal from other P's, get g from global queue, poll network.
func findrunnable() (gp *g, inheritTime bool) {
 _g_ := getg()

 // The conditions here and in handoffp must agree: if
 // findrunnable would return a G to run, handoffp must start
 // an M.

top:
 _p_ := _g_.m.p.ptr()
 if sched.gcwaiting != 0 {
  gcstopm()
  goto top
 }
 if _p_.runSafePointFn != 0 {
  runSafePointFn()
 }
 if fingwait && fingwake {
  if gp := wakefing(); gp != nil {
   ready(gp, 0, true)
  }
 }
 if *cgo_yield != nil {
  asmcgocall(*cgo_yield, nil)
 }

 // local runq
 if gp, inheritTime := runqget(_p_); gp != nil {
  return gp, inheritTime
 }

 // global runq
 if sched.runqsize != 0 {
  lock(&sched.lock)
  gp := globrunqget(_p_, 0)
  unlock(&sched.lock)
  if gp != nil {
   return gp, false
  }
 }

 // Poll network.
 // This netpoll is only an optimization before we resort to stealing.
 // We can safely skip it if there are no waiters or a thread is blocked
 // in netpoll already. If there is any kind of logical race with that
 // blocked thread (e.g. it has already returned from netpoll, but does
 // not set lastpoll yet), this thread will do blocking netpoll below
 // anyway.
 if netpollinited() && atomic.Load(&netpollWaiters) > 0 && atomic.Load64(&sched.lastpoll) != 0 {
  if list := netpoll(false); !list.empty() { // non-blocking
   gp := list.pop()
   injectglist(&list)
   casgstatus(gp, _Gwaiting, _Grunnable)
   if trace.enabled {
    traceGoUnpark(gp, 0)
   }
   return gp, false
  }
 }

 // Steal work from other P's.
 procs := uint32(gomaxprocs)
 if atomic.Load(&sched.npidle) == procs-1 {
  // Either GOMAXPROCS=1 or everybody, except for us, is idle already.
  // New work can appear from returning syscall/cgocall, network or timers.
  // Neither of that submits to local run queues, so no point in stealing.
  goto stop
 }
 // If number of spinning M's >= number of busy P's, block.
 // This is necessary to prevent excessive CPU consumption
 // when GOMAXPROCS>>1 but the program parallelism is low.
 if !_g_.m.spinning && 2*atomic.Load(&sched.nmspinning) >= procs-atomic.Load(&sched.npidle) {
  goto stop
 }
 if !_g_.m.spinning {
  _g_.m.spinning = true
  atomic.Xadd(&sched.nmspinning, 1)
 }
 for i := 0; i < 4; i++ {
  for enum := stealOrder.start(fastrand()); !enum.done(); enum.next() {
   if sched.gcwaiting != 0 {
    goto top
   }
   stealRunNextG := i > 2 // first look for ready queues with more than 1 g
   if gp := runqsteal(_p_, allp[enum.position()], stealRunNextG); gp != nil {
    return gp, false
   }
  }
 }

stop:

 // We have nothing to do. If we're in the GC mark phase, can
 // safely scan and blacken objects, and have work to do, run
 // idle-time marking rather than give up the P.
 if gcBlackenEnabled != 0 && _p_.gcBgMarkWorker != 0 && gcMarkWorkAvailable(_p_) {
  _p_.gcMarkWorkerMode = gcMarkWorkerIdleMode
  gp := _p_.gcBgMarkWorker.ptr()
  casgstatus(gp, _Gwaiting, _Grunnable)
  if trace.enabled {
   traceGoUnpark(gp, 0)
  }
  return gp, false
 }

 // wasm only:
 // If a callback returned and no other goroutine is awake,
 // then pause execution until a callback was triggered.
 if beforeIdle() {
  // At least one goroutine got woken.
  goto top
 }

 // Before we drop our P, make a snapshot of the allp slice,
 // which can change underfoot once we no longer block
 // safe-points. We don't need to snapshot the contents because
 // everything up to cap(allp) is immutable.
 allpSnapshot := allp

 // return P and block
 lock(&sched.lock)
 if sched.gcwaiting != 0 || _p_.runSafePointFn != 0 {
  unlock(&sched.lock)
  goto top
 }
 if sched.runqsize != 0 {
  gp := globrunqget(_p_, 0)
  unlock(&sched.lock)
  return gp, false
 }
 if releasep() != _p_ {
  throw("findrunnable: wrong p")
 }
 pidleput(_p_)
 unlock(&sched.lock)

 // Delicate dance: thread transitions from spinning to non-spinning state,
 // potentially concurrently with submission of new goroutines. We must
 // drop nmspinning first and then check all per-P queues again (with
 // #StoreLoad memory barrier in between). If we do it the other way around,
 // another thread can submit a goroutine after we've checked all run queues
 // but before we drop nmspinning; as the result nobody will unpark a thread
 // to run the goroutine.
 // If we discover new work below, we need to restore m.spinning as a signal
 // for resetspinning to unpark a new worker thread (because there can be more
 // than one starving goroutine). However, if after discovering new work
 // we also observe no idle Ps, it is OK to just park the current thread:
 // the system is fully loaded so no spinning threads are required.
 // Also see "Worker thread parking/unparking" comment at the top of the file.
 wasSpinning := _g_.m.spinning
 if _g_.m.spinning {
  _g_.m.spinning = false
  if int32(atomic.Xadd(&sched.nmspinning, -1)) < 0 {
   throw("findrunnable: negative nmspinning")
  }
 }

 // check all runqueues once again
 for _, _p_ := range allpSnapshot {
  if !runqempty(_p_) {
   lock(&sched.lock)
   _p_ = pidleget()
   unlock(&sched.lock)
   if _p_ != nil {
    acquirep(_p_)
    if wasSpinning {
     _g_.m.spinning = true
     atomic.Xadd(&sched.nmspinning, 1)
    }
    goto top
   }
   break
  }
 }

 // Check for idle-priority GC work again.
 if gcBlackenEnabled != 0 && gcMarkWorkAvailable(nil) {
  lock(&sched.lock)
  _p_ = pidleget()
  if _p_ != nil && _p_.gcBgMarkWorker == 0 {
   pidleput(_p_)
   _p_ = nil
  }
  unlock(&sched.lock)
  if _p_ != nil {
   acquirep(_p_)
   if wasSpinning {
    _g_.m.spinning = true
    atomic.Xadd(&sched.nmspinning, 1)
   }
   // Go back to idle GC check.
   goto stop
  }
 }

 // poll network
 if netpollinited() && atomic.Load(&netpollWaiters) > 0 && atomic.Xchg64(&sched.lastpoll, 0) != 0 {
  if _g_.m.p != 0 {
   throw("findrunnable: netpoll with p")
  }
  if _g_.m.spinning {
   throw("findrunnable: netpoll with spinning")
  }
  list := netpoll(true) // block until new work is available
  atomic.Store64(&sched.lastpoll, uint64(nanotime()))
  if !list.empty() {
   lock(&sched.lock)
   _p_ = pidleget()
   unlock(&sched.lock)
   if _p_ != nil {
    acquirep(_p_)
    gp := list.pop()
    injectglist(&list)
    casgstatus(gp, _Gwaiting, _Grunnable)
    if trace.enabled {
     traceGoUnpark(gp, 0)
    }
    return gp, false
   }
   injectglist(&list)
  }
 }
 stopm()
 goto top
}

At this point, recall the logic of schedinit(): we have already queued runtime.main as the initial task on the local queue of the P bound to m0. So when runqget pulls a g from that local queue, it necessarily gets runtime.main. Let's keep digging.

The main goroutine's task

//asm_amd64.s
... ...
MOVQ	$runtime·mainPC(SB), AX		// entry
... ...

DATA	runtime·mainPC+0(SB)/8,$runtime·main(SB)
GLOBL	runtime·mainPC(SB),RODATA,$8

(gdb) b runtime.main
Breakpoint 7 at 0x426470: file /home/app/go/src/runtime/proc.go, line 113.

From the above we can see that the task function of the first goroutine created at startup is runtime.main:

// The main goroutine.
func main() {
 g := getg() // get the main goroutine

 ...

 if GOARCH != "wasm" { // no threads on wasm yet, so no sysmon
    // run sysmon on the system stack
  systemstack(func() {
       // allocate a new M to run sysmon, the background monitor that drives periodic GC and scheduling preemption
   newm(sysmon, nil)
  })
 }
 /* Lock the main goroutine onto the main OS thread during initialization. Most programs won't care,
    but some do need certain calls to be made from the main thread. They can arrange for main.main to
    run on the main thread by calling runtime.LockOSThread during initialization to retain the lock. */
 lockOSThread()
 // make sure we are on the main thread (m0)
 if g.m != &m0 {
  throw("runtime.main not on m0")
 }
 // run the runtime's internal init functions, generated by the compiler
 doInit(&runtime_inittask) // must be before defer
  ...

 // Defer unlock so that runtime.Goexit during init does the unlock too.
 needUnlock := true
 defer func() {
  if needUnlock {
   unlockOSThread()
  }
 }()

  ...
   // enable GC: starts a goroutine for background sweeping
 gcenable()

 ...
 // run the compiler-generated init task, which includes all user-defined init functions
 doInit(&main_inittask)

 close(main_init_done)

 needUnlock = false
 unlockOSThread()

 if isarchive || islibrary {
  // A program compiled with -buildmode=c-archive or c-shared
  // has a main, but it is not executed.
  return
 }
   // finally call the user-written main function in package main
 fn := main_main // make an indirect call, as the linker doesn't know the address of the main package when laying down the runtime
 fn()
 ...
   // exit the process
 exit(0)

 for {
  var x *int32
  *x = 0
 }
}

At this point we finally arrive at the place where the snippet we typed ourselves,

package main

func main() {
	println("hello world")
}

i.e. our main function, is actually called. Before that call, a few other things are done:

  • Create a new thread to run sysmon, which periodically drives garbage collection and scheduling preemption.
  • Check that we are indeed running on the main thread (m0).
  • Run the runtime's internal init functions.
  • Start a goroutine for GC sweeping (gcenable).
  • Run main_init, the compiler-generated init task that includes every user-defined init function (a small example of this ordering follows the list).
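
The ordering enforced by doInit is easy to see from user code: package-level variable initializers run first, then every init(), and only then main.main. A minimal sketch:

package main

import "fmt"

var x = compute() // package-level variable initializers run first

func compute() int {
	fmt.Println("1: package variable initialization")
	return 42
}

func init() {
	fmt.Println("2: init(), run by runtime.main via doInit(&main_inittask)")
}

func main() {
	fmt.Println("3: main.main, reached via the fn := main_main; fn() call shown above")
}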

There is a little easter egg left at the end of func main():

for {
	var x *int32
	*x = 0
}

Does anyone know what it is for? Tell me what you think in the comments.
