Building the binary executable
First, write a Go main function that simply prints hello world.
package main
import "fmt"
func main() {
fmt.Println("hello world")
}
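Run go build -gcflags "-N -l" -ldflags=-compressdwarf=false -o main main.go to build the executable. The -N and -l flags disable optimizations and inlining, and -compressdwarf=false leaves the DWARF debug information uncompressed so that gdb can read it.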
Starting a gdb session
Run gdb main to start debugging. Use i files to find the program's entry point address, then set a breakpoint at that address.
Loading Go Runtime support.
(gdb) i files
Symbols from "/Users/vector/go/src/alg/source/main/main".
Local exec file:
`/Users/vector/go/src/alg/source/main/main', file type mach-o-x86-64.
Entry point: 0x1063c80
0x0000000001001000 - 0x00000000010a6b73 is .text
0x00000000010a6b80 - 0x00000000010ee254 is __TEXT.__rodata
0x00000000010ee260 - 0x00000000010ee386 is __TEXT.__symbol_stub1
0x00000000010ee3a0 - 0x00000000010eeb40 is __TEXT.__typelink
0x00000000010eeb40 - 0x00000000010eebb0 is __TEXT.__itablink
0x00000000010eebb0 - 0x00000000010eebb0 is __TEXT.__gosymtab
0x00000000010eebc0 - 0x0000000001155c85 is __TEXT.__gopclntab
0x0000000001156000 - 0x0000000001156020 is __DATA.__go_buildinfo
0x0000000001156020 - 0x00000000011561a8 is __DATA.__nl_symbol_ptr
0x00000000011561c0 - 0x00000000011646c0 is __DATA.__noptrdata
0x00000000011646c0 - 0x000000000116b7f0 is .data
0x000000000116b800 - 0x000000000119b830 is .bss
0x000000000119b840 - 0x000000000119df08 is __DATA.__noptrbss
(gdb) b *0x1063c80
Breakpoint 1 at 0x1063c80: file /usr/local/go/src/runtime/rt0_darwin_amd64.s, line 8.
Execute run and the program stops at the breakpoint, which shows that the entry point is line 8 of /usr/local/go/src/runtime/rt0_darwin_amd64.s.
(gdb) run
Starting program: /Users/vector/go/src/alg/source/main/main
[New Thread 0xc03 of process 99850]
[New Thread 0x2903 of process 99850]
warning: unhandled dyld version (16)
Thread 2 hit Breakpoint 1, _rt0_amd64_darwin () at /usr/local/go/src/runtime/rt0_darwin_amd64.s:8
8 JMP _rt0_amd64(SB)
Opening the Go source in an editor shows that the entry routine simply jumps to _rt0_amd64(SB):
TEXT _rt0_amd64_darwin(SB),NOSPLIT,$-8
JMP _rt0_amd64(SB)
Enter s in gdb to step, which lands in _rt0_amd64():
(gdb) s
_rt0_amd64 () at /usr/local/go/src/runtime/asm_amd64.s:15
15 MOVQ 0(SP), DI // argc
(gdb)
Here is the source of _rt0_amd64. It loads the command-line arguments argc and argv into the DI and SI registers respectively, then jumps to runtime·rt0_go(SB).
TEXT _rt0_amd64(SB),NOSPLIT,$-8
MOVQ 0(SP), DI // argc
LEAQ 8(SP), SI // argv
JMP runtime·rt0_go(SB)
Keep stepping in gdb until you reach runtime.rt0_go:
_rt0_amd64 () at /usr/local/go/src/runtime/asm_amd64.s:15
15 MOVQ 0(SP), DI // argc
[...]
(gdb) s
runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:89
89 MOVQ DI, AX // argc
(gdb)
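runtime.rt0_go is fairly long, so we will go through it in chunks. The first chunk copies the command-line arguments onto the stack and aligns the stack pointer SP to 16 bytes.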
TEXT runtime·rt0_go(SB),NOSPLIT,$0
// copy arguments forward on an even stack
MOVQ DI, AX // argc: copy argc into AX
MOVQ SI, BX // argv: copy argv into BX
SUBQ $(4*8+7), SP // 2args 2auto
ANDQ $~15, SP // align SP to 16 bytes
MOVQ AX, 16(SP) // store argc at SP+16
MOVQ BX, 24(SP) // store argv at SP+24
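Let's watch how the SP address changes in gdb, starting with SUBQ $(4*8+7), SP. Before the instruction the address is 0x7ffeefbff330, and afterwards it is 0x7ffeefbff309 (both values are read from the "rip at" line of i frame). The difference is 39 in decimal, which is 4*8+7, so moving SP down reserves 39 bytes of stack. The space is presumably reserved to hold argc and argv. MOVQ BX, 24(SP) writes the 8 bytes of BX at SP+24, so at least 32 bytes are needed, which is why the reservation is 4*8+7, slightly more than 32 bytes.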
91 SUBQ $(4*8+7), SP // 2args 2auto
(gdb) i frame
Stack level 0, frame at 0x7ffeefbff338:
rip = 0x10607e6 in runtime.rt0_go (/usr/local/go/src/runtime/asm_amd64.s:91); saved rip = 0x1
called by frame at 0x7ffeefbff340
source language asm.
Arglist at 0x7ffeefbff328, args:
Locals at 0x7ffeefbff328, Previous frame's sp is 0x7ffeefbff338
Saved registers:
// address before the instruction: 0x7ffeefbff330
rip at 0x7ffeefbff330
(gdb) s
runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:92
92 ANDQ $~15, SP
(gdb) i frame
Stack level 0, frame at 0x7ffeefbff311:
rip = 0x10607ea in runtime.rt0_go (/usr/local/go/src/runtime/asm_amd64.s:92); saved rip = 0x11bf0
called by frame at 0x7ffeefbff319
source language asm.
Arglist at 0x7ffeefbff301, args:
Locals at 0x7ffeefbff301, Previous frame's sp is 0x7ffeefbff311
Saved registers:
// address after the instruction: 0x7ffeefbff309
rip at 0x7ffeefbff309
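The subtraction can be checked by hand. The following is a minimal sketch in Go, not runtime code: subSP is a hypothetical helper that models SUBQ on a plain integer, and the starting address is the one observed in the gdb session.

```go
package main

import "fmt"

// subSP models SUBQ $n, SP. The stack grows downward, so reserving
// n bytes means subtracting n from the stack pointer.
// (Hypothetical helper for illustration, not part of the runtime.)
func subSP(sp, n uint64) uint64 {
	return sp - n
}

func main() {
	const before = 0x7ffeefbff330 // address observed before SUBQ
	const reserve = 4*8 + 7       // immediate operand of SUBQ, i.e. 39

	after := subSP(before, reserve)
	fmt.Printf("reserved %d bytes\n", reserve)
	fmt.Printf("SP after SUBQ: %#x\n", after)
}
```

Running it prints 0x7ffeefbff309 as the new address, which matches the value gdb reports after the step.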
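Next comes the 16-byte alignment. ANDQ $~15, SP performs a bitwise AND that clears the low 4 bits of 0x7ffeefbff309, giving 0x7ffeefbff300, which is a multiple of 16. The main reason for doing this is that SSE instructions on the CPU generally require 16-byte alignment.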
runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:92
92 ANDQ $~15, SP
(gdb) i frame
Stack level 0, frame at 0x7ffeefbff311:
rip = 0x10607ea in runtime.rt0_go (/usr/local/go/src/runtime/asm_amd64.s:92); saved rip = 0x11bf0
called by frame at 0x7ffeefbff319
source language asm.
Arglist at 0x7ffeefbff301, args:
Locals at 0x7ffeefbff301, Previous frame's sp is 0x7ffeefbff311
Saved registers:
rip at 0x7ffeefbff309
(gdb) s
runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:93
93 MOVQ AX, 16(SP)
(gdb) i frame
Stack level 0, frame at 0x7ffeefbff308:
rip = 0x10607ee in runtime.rt0_go (/usr/local/go/src/runtime/asm_amd64.s:93); saved rip = 0x7ffeefbff328
called by frame at 0x7ffeefbff310
source language asm.
Arglist at 0x7ffeefbff2f8, args:
Locals at 0x7ffeefbff2f8, Previous frame's sp is 0x7ffeefbff308
Saved registers:
rip at 0x7ffeefbff300
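The effect of the mask can be reproduced with ordinary integer arithmetic. This is a small sketch, assuming the address from the gdb session; alignDown16 is a hypothetical helper name, and Go's &^ (AND NOT) operator plays the role of ANDQ with ~15.

```go
package main

import "fmt"

// alignDown16 models ANDQ $~15, SP: it clears the low four bits,
// rounding the address down to a multiple of 16.
// (Hypothetical helper for illustration, not part of the runtime.)
func alignDown16(sp uint64) uint64 {
	return sp &^ 15
}

func main() {
	sp := uint64(0x7ffeefbff309) // address observed after SUBQ
	aligned := alignDown16(sp)

	fmt.Printf("%#x -> %#x\n", sp, aligned)
	fmt.Printf("multiple of 16: %v\n", aligned%16 == 0)
	fmt.Printf("argc slot: %#x, argv slot: %#x\n", aligned+16, aligned+24)
}
```

Because the mask only ever rounds down, the aligned SP is never above the value left by SUBQ, so the slots at SP+16 and SP+24 stay inside the reserved area.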
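Next come some operations on g0, which is also the initial goroutine. The initial stack size of g0 is roughly 64 KB. g_stackguard0, which is set in the code below, is also used when CGO is enabled.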
MOVQ $runtime·g0(SB), DI // load the address of g0 into DI
LEAQ (-64*1024+104)(SP), BX // BX = SP - 64*1024 + 104
MOVQ BX, g_stackguard0(DI) // g0.stackguard0 = BX
MOVQ BX, g_stackguard1(DI) // g0.stackguard1 = BX
MOVQ BX, (g_stack+stack_lo)(DI) // g0.stack.lo = BX, the low end of the goroutine stack
MOVQ SP, (g_stack+stack_hi)(DI) // g0.stack.hi = SP, the high end of the goroutine stack
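To see what these stores add up to, here is a sketch that redoes them on simplified stand-in types. The g and stack structs below only model the four fields touched above, initG0 is a hypothetical helper, and the SP value is the aligned address from the gdb session, used purely as an example.

```go
package main

import "fmt"

// stack and g are simplified stand-ins for the runtime's types.
// Only the fields written by the assembly above are modeled.
type stack struct {
	lo, hi uint64
}

type g struct {
	stack       stack
	stackguard0 uint64
	stackguard1 uint64
}

// initG0 mirrors the four stores performed on g0 for a given SP.
// (Hypothetical helper for illustration, not part of the runtime.)
func initG0(sp uint64) g {
	lo := sp - 64*1024 + 104 // LEAQ (-64*1024+104)(SP), BX
	return g{
		stack:       stack{lo: lo, hi: sp},
		stackguard0: lo,
		stackguard1: lo,
	}
}

func main() {
	g0 := initG0(0x7ffeefbff300) // example SP taken from the session above
	fmt.Printf("stack.lo = %#x\n", g0.stack.lo)
	fmt.Printf("stack.hi = %#x\n", g0.stack.hi)
	fmt.Printf("size     = %d bytes\n", g0.stack.hi-g0.stack.lo)
}
```

The size comes out as 65432 bytes, that is 64 KB minus 104 bytes, which is where the "roughly 64 KB" figure comes from.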
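Further down is the CPU information handling and the CGO initialization, which we skip here. After that, the code decides by operating system whether to initialize TLS. On the operating systems listed below it jumps straight to the ok label. On any other operating system it initializes TLS and verifies that TLS works.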
#ifdef GOOS_plan9
// skip TLS setup on Plan 9
JMP ok
#endif
#ifdef GOOS_solaris
// skip TLS setup on Solaris
JMP ok
#endif
#ifdef GOOS_illumos
// skip TLS setup on illumos
JMP ok
#endif
#ifdef GOOS_darwin
// skip TLS setup on Darwin
JMP ok
#endif
// load the address of m0.m_tls into DI
LEAQ runtime·m0+m_tls(SB), DI
// set up TLS for m0
CALL runtime·settls(SB)
// store through it, to make sure it works
// load the TLS address into BX, i.e. the address of m0.m_tls[1]
get_tls(BX)
// store the constant 0x123 at the location BX refers to, i.e. m0.m_tls
MOVQ $0x123, g(BX)
// copy the value of m0.m_tls into AX
MOVQ runtime·m0+m_tls(SB), AX
// compare: the two values are equal when TLS works
CMPQ AX, $0x123
JEQ 2(PC)
// abort the program if TLS does not work
CALL runtime·abort(SB)
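The check follows a write-then-compare pattern: store a marker through the TLS path, then read the same slot directly from m0 and compare. The sketch below only models that pattern in plain Go and does not touch real thread-local storage. The m struct, the array length, and the tlsSlot and tlsWorks helpers are assumptions made for illustration.

```go
package main

import "fmt"

// m is a simplified stand-in for the runtime's m. Only a tls array is
// modeled, and its length here is an assumption.
type m struct {
	tls [6]uintptr
}

var m0 m

// tlsSlot is a hypothetical stand-in for get_tls(BX) plus g(BX): it returns
// the location that the thread-local access is expected to resolve to.
func tlsSlot() *uintptr {
	return &m0.tls[0]
}

// tlsWorks stores a marker through the "TLS" path and reads it back
// directly from m0.tls, the same write-then-compare check as the assembly.
func tlsWorks() bool {
	*tlsSlot() = 0x123
	return m0.tls[0] == 0x123
}

func main() {
	if !tlsWorks() {
		panic("TLS check failed") // plays the role of CALL runtime·abort(SB)
	}
	fmt.Println("TLS check passed")
}
```

In the real runtime the two paths are different addressing modes, one through the TLS register and one through the symbol runtime·m0, which is why agreement between them shows that settls took effect.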
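Now for the ok block. It binds g0 and m0 to each other, runs sanity checks on types, reads the command-line arguments, calls osinit and schedinit, and finally creates a new goroutine that runs runtime.main, which in turn calls the program's main function.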
ok:
// set the per-goroutine and per-mach "registers"
// bind g0 and m0 to each other
get_tls(BX)
LEAQ runtime·g0(SB), CX
MOVQ CX, g(BX)
LEAQ runtime·m0(SB), AX
// save m->g0 = g0
MOVQ CX, m_g0(AX)
// save m0 to g0->m
MOVQ AX, g_m(CX)
CLD // convention is D is always left cleared
// run sanity checks on types
CALL runtime·check(SB)
// pass argc and argv to runtime·args to process the command-line arguments
MOVL 16(SP), AX // copy argc
MOVL AX, 0(SP)
MOVQ 24(SP), AX // copy argv
MOVQ AX, 8(SP)
CALL runtime·args(SB)
// collect system information: number of CPU cores, memory page size
CALL runtime·osinit(SB)
// run the various initializations: memory allocation, GC, and so on
CALL runtime·schedinit(SB)
// entry of the new goroutine: runtime·mainPC refers to runtime.main
MOVQ $runtime·mainPC(SB), AX // entry
PUSHQ AX
PUSHQ $0 // arg size
// create a new goroutine and put it on a P
CALL runtime·newproc(SB)
POPQ AX
POPQ AX
// start m and enter the scheduling loop, which runs goroutines
CALL runtime·mstart(SB)
CALL runtime·abort(SB) // mstart should never return
RET
// Prevent dead-code elimination of debugCallV1, which is
// intended to be called by debuggers.
MOVQ $runtime·debugCallV1(SB), AX
RET
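The two-way binding and the call order of the ok block can be summarized in a few lines of Go. This is a sketch on simplified stand-in types rather than the runtime's real definitions. The m and g structs model one pointer field each, and bind and bootSteps are hypothetical helpers.

```go
package main

import "fmt"

// m and g are simplified stand-ins for the runtime's types.
// Only the two pointer fields written by the ok block are modeled.
type m struct{ g0 *g }
type g struct{ m *m }

var (
	g0 g
	m0 m
)

// bind mirrors MOVQ CX, m_g0(AX) and MOVQ AX, g_m(CX).
// (Hypothetical helper for illustration, not part of the runtime.)
func bind() {
	m0.g0 = &g0 // m0.g0 = g0
	g0.m = &m0  // g0.m = m0
}

// bootSteps lists the runtime functions called by the ok block, in order.
func bootSteps() []string {
	return []string{"check", "args", "osinit", "schedinit", "newproc", "mstart"}
}

func main() {
	bind()
	fmt.Println("m0.g0 == &g0:", m0.g0 == &g0)
	fmt.Println("g0.m == &m0:", g0.m == &m0)
	for i, s := range bootSteps() {
		fmt.Printf("%d. runtime.%s\n", i+1, s)
	}
}
```

The order matters: newproc only queues the goroutine for runtime.main, and nothing runs it until mstart enters the scheduling loop.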
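Summary
This gives a rough picture of the Go startup flow. runtime.schedinit is the largest part of the startup process, so the next step is to study it in detail.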