Go Escape Analysis

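Studying escape analysis in Golang

Go's memory management is relatively safe. Ordinary code cannot manipulate memory directly, and out-of-range array accesses are caught: constant indices are checked at compile time, and all other accesses are bounds-checked at run time, where a violation panics immediately.

This post records some of the principles behind escape analysis in Go.

Escape analysis as described on Wikipedia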

In compiler optimization, escape analysis is a method for determining the dynamic scope of pointers - where in the program a pointer can be accessed. It is related to pointer analysis and shape analysis.

When a variable (or an object) is allocated in a subroutine, a pointer to the variable can escape to other threads of execution, or to calling subroutines. If an implementation uses tail call optimization (usually required for functional languages), objects may also be seen as escaping to called subroutines. If a language supports first-class continuations (as do Scheme and Standard ML of New Jersey), portions of the call stack may also escape.

If a subroutine allocates an object and returns a pointer to it, the object can be accessed from undetermined places in the program — the pointer has “escaped”. Pointers can also escape if they are stored in global variables or other data structures that, in turn, escape the current procedure.

Escape analysis determines all the places where a pointer can be stored and whether the lifetime of the pointer can be proven to be restricted only to the current procedure and/or thread

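Advantages of escape analysis

1 The biggest benefit is probably reduced GC pressure: objects that do not escape are allocated on the stack, and their memory is reclaimed when the function returns, with no GC mark-and-sweep needed.

2 Once escape analysis has finished, the compiler knows which variables can live on the stack; stack allocation is faster than heap allocation, so performance is better.

3 Synchronization elimination: if an object's methods take a lock but only one thread ever accesses the object at run time, the machine code produced after escape analysis can drop the lock. This is typical of JIT-compiled runtimes such as the JVM; as far as I know, the Go compiler does not remove locks this way.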
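
The first two points can be observed from inside a program. The following is a minimal sketch, not a benchmark: the function names (sumOnStack, newOnHeap, stackAllocs, heapAllocs) and the sink variables are made up for illustration, and //go:noinline is used so that inlining does not change the result. It relies on testing.AllocsPerRun from the standard library to count heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks keep the compiler from discarding the results.
var (
	intSink int
	ptrSink *int
)

// sumOnStack keeps its array local, so nothing has to outlive the call.
//
//go:noinline
func sumOnStack() int {
	var buf [8]int
	for i := range buf {
		buf[i] = i
	}
	total := 0
	for _, v := range buf {
		total += v
	}
	return total
}

// newOnHeap returns the address of a local, so the local must outlive the call.
//
//go:noinline
func newOnHeap() *int {
	v := 42
	return &v
}

// stackAllocs reports the average number of heap allocations per call of sumOnStack.
func stackAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { intSink = sumOnStack() })
}

// heapAllocs reports the average number of heap allocations per call of newOnHeap.
func heapAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { ptrSink = newOnHeap() })
}

func main() {
	fmt.Println("allocations per call, no escape:", stackAllocs())
	fmt.Println("allocations per call, escaping: ", heapAllocs())
}
```

On a typical toolchain the first count is 0 and the second is 1: only the escaping variable creates work for the garbage collector.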

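How Go programs differ from compiled C/C++ programs

A program compiled from C/C++ gets a relatively small stack, so in large projects bulk data usually ends up on the heap. Go manages memory itself: goroutines do not run on the stack provided by the operating system but on stacks that the runtime allocates from its own, much larger memory region, and these stacks start small and grow on demand by being copied to a bigger block. As a result, a Go program can keep most of its objects on the stack. Let's trace a running Go program.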

package main

// hack builds a local slice and writes to it through a second slice header.
func hack() {
	fake_flag := []int64{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
	var p []int64
	p = fake_flag
	p[1] = 2
}

func main() {
	flag := []int64{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
	// Touch every element so the slice is actually used.
	for i, v := range flag {
		flag[i] = v + 1
	}
	hack()
}

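Compile the code above into an ELF binary and debug it with gdb.

The address of main.main is 0x45DA40.

When the program first starts, the stack in use is the one provided by the operating system: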

 RSP  0x7fffffffdf30 ◂— 0x1 
RIP 0x45b9e0 (_rt0_amd64_linux) ◂— jmp 0x458580

Set a breakpoint at main.main:

RBP 0xc00003e7d0 ◂— 0x0
RSP 0xc00003e780 —▸ 0x42fb29 (runtime.main+521) ◂— mov eax, dword ptr [rip + 0xcbbf5]
RIP 0x45da40 (main.main) ◂— mov rcx, qword ptr fs:[0xfffffffffffffff8]

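As shown above, rsp and rbp no longer hold addresses starting with 0x7f: execution has moved off the OS stack onto a stack set up by the Go runtime.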

pwndbg> vmmap 
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x400000 0x45e000 r-xp 5e000 0 /home/logan/share/bytectf/leak/leak
0x45e000 0x4c8000 r--p 6a000 5e000 /home/logan/share/bytectf/leak/leak
0x4c8000 0x4cc000 rw-p 4000 c8000 /home/logan/share/bytectf/leak/leak
0x4cc000 0x4fe000 rw-p 32000 0 [heap]
0xc000000000 0xc004000000 rw-p 4000000 0
0x7fffd1328000 0x7fffd3699000 rw-p 2371000 0
0x7fffd3699000 0x7fffe3819000 ---p 10180000 0
0x7fffe3819000 0x7fffe381a000 rw-p 1000 0
0x7fffe381a000 0x7ffff56c9000 ---p 11eaf000 0
0x7ffff56c9000 0x7ffff56ca000 rw-p 1000 0
0x7ffff56ca000 0x7ffff7a9f000 ---p 23d5000 0
0x7ffff7a9f000 0x7ffff7aa0000 rw-p 1000 0
0x7ffff7aa0000 0x7ffff7f19000 ---p 479000 0
0x7ffff7f19000 0x7ffff7f1a000 rw-p 1000 0
0x7ffff7f1a000 0x7ffff7f99000 ---p 7f000 0
0x7ffff7f99000 0x7ffff7ff9000 rw-p 60000 0
0x7ffff7ff9000 0x7ffff7ffd000 r--p 4000 0 [vvar]
0x7ffff7ffd000 0x7ffff7fff000 r-xp 2000 0 [vdso]
0x7ffffffde000 0x7ffffffff000 rw-p 21000 0 [stack]
0xffffffffff600000 0xffffffffff601000 --xp 1000 0 [vsyscall]

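The region at 0xc000000000 that holds the new stack is 0x4000000 bytes in size, while the [stack] mapping provided by the operating system is only 0x21000 bytes. That region is memory reserved by the Go runtime; goroutine stacks are allocated from it and can grow as needed. This is why Go can afford to keep most variables on the stack, where memory management is also simpler.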
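
The gdb session looks at the stack from the outside. A rough sketch of the same behavior from inside a program is given below. The function grow and its padding size are made up for illustration; the sketch assumes that converting &x to uintptr only records the address as a number (so x can stay on the stack) and that a deep recursion forces the runtime to move the goroutine stack to a larger block.

```go
package main

import (
	"fmt"
	"unsafe"
)

// grow recurses depth times. The padding array enlarges each frame so that
// the goroutine stack has to be enlarged along the way.
// It returns the number of frames that were entered, i.e. depth+1.
//
//go:noinline
func grow(depth int) int {
	var pad [256]byte
	pad[depth%len(pad)] = 1
	if depth == 0 {
		return int(pad[0])
	}
	return grow(depth-1) + int(pad[depth%len(pad)])
}

func main() {
	var x int

	// Record the address of a stack variable as a plain number.
	before := uintptr(unsafe.Pointer(&x))

	// Several megabytes of frames: far more than a fresh goroutine stack holds.
	frames := grow(1 << 15)

	// Take the address again after the stack has had to grow.
	after := uintptr(unsafe.Pointer(&x))

	fmt.Println("frames entered:", frames)
	fmt.Printf("&x before: %#x\n", before)
	fmt.Printf("&x after:  %#x\n", after)
	fmt.Println("stack moved:", before != after)
}
```

On a typical run the two addresses differ and both fall in the runtime's 0xc000000000 region rather than in the 0x7f... range of the OS stack, but the exact values depend on the Go version and platform.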

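Escape analysis in Go

Go performs escape analysis at compile time to decide whether an object lives on the stack or on the heap: objects that do not escape go on the stack, and objects that may escape go on the heap.

Enabling the escape analysis log

Pass -gcflags '-m' when compiling. To keep functions from being inlined, -l is usually added as well,

that is, -gcflags '-m -l'.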

Example 1

package main
import "fmt"

func main() {
	s := "Hello World"
	fmt.Println(s)
}

Escape analysis output

┌[logan☮arch]-(~/share/bytectf/leak)
└> go run -gcflags '-m -l' 1.go
# command-line-arguments
./1.go:6:13: ... argument does not escape
./1.go:6:13: s escapes to heap
Hello World

s is passed to fmt.Println as an interface value, and the compiler cannot prove that the value stays inside the call, so s is reported as escaping to the heap.

Example 2

package main

type S struct{}

func main() {
	var x S
	y := &x
	_ = *identity(y)
}

func identity(z *S) *S {
	return z
}

Output

┌[logan☮arch]-(~/share/bytectf/leak)
└> go run -gcflags '-m -l' 2.go
# command-line-arguments
./2.go:11:15: leaking param: z to result ~r1 level=0

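The pointer to x is passed into identity and handed straight back, so no reference to x outlives main, and x does not escape.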

Example 3

package main

type S struct{}

func main() {
	var x S
	_ = *ref(x)
}

func ref(z S) *S {
	return &z
}

Output

┌[logan☮arch]-(~/share/bytectf/leak)
└> go run -gcflags '-m -l' 3.go
# command-line-arguments
./3.go:10:10: moved to heap: z

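Go passes arguments by value, so z is a copy of x. ref returns the address of that copy, which must stay valid after ref returns, so z is placed on the heap to avoid a dangling pointer.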
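
The copy made by the call can be made visible with a small variation of Example 3. This is only a sketch: the field N is added here for illustration (the original S is empty) so that a write through the returned pointer can be observed.

```go
package main

import "fmt"

// S carries a field so that the copy made by the call is observable.
type S struct {
	N int
}

// ref receives z by value and returns the address of that copy.
// The copy has to outlive the call, which is why it is moved to the heap.
func ref(z S) *S {
	return &z
}

func main() {
	x := S{N: 1}
	p := ref(x)

	// The write goes to the heap copy, not to x.
	p.N = 2

	fmt.Println(x.N, p.N) // prints "1 2"
}
```

x keeps its original value because the pointer never referred to x in the first place, only to the parameter z.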

Example 4

package main

type S struct {
	M *int
}

func main() {
	var i int
	refStruct(i)
}

func refStruct(y int) (z S) {
	z.M = &y
	return z
}

Output

┌[logan☮arch]-(~/share/bytectf/leak)
└> go run -gcflags '-m -l' 4.go
# command-line-arguments
./4.go:12:16: moved to heap: y

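refStruct takes the address of its by-value parameter y and stores it in the returned struct, so y is moved to the heap.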

Example 5

package main

type S struct {
	M *int
}

func main() {
	var i int
	refStruct(&i)
}

func refStruct(y *int) (z S) {
	z.M = y
	return z
}

Output

┌[logan☮arch]-(~/share/bytectf/leak)
└> go run -gcflags '-m -l' 5.go
# command-line-arguments
./5.go:12:16: leaking param: y to result z level=0

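Here the pointer refers to i, which already lives in main's stack frame. It only flows back to the caller through the result, so nothing escapes.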
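
The difference between Example 4 and Example 5 also shows up in what the stored pointer refers to. The sketch below puts both variants side by side; the names refStructVal and refStructPtr are made up here so that the two versions can live in one file.

```go
package main

import "fmt"

type S struct {
	M *int
}

// refStructVal mirrors Example 4: y is a copy of the argument,
// and z.M points at that copy.
func refStructVal(y int) (z S) {
	z.M = &y
	return z
}

// refStructPtr mirrors Example 5: z.M points at the caller's variable.
func refStructPtr(y *int) (z S) {
	z.M = y
	return z
}

func main() {
	i := 1

	a := refStructVal(i)
	*a.M = 10      // writes to the copy of y, not to i
	fmt.Println(i) // prints 1

	b := refStructPtr(&i)
	*b.M = 20      // writes to i itself
	fmt.Println(i) // prints 20
}
```

In the by-value variant the pointer targets a fresh copy that has to survive the call, which matches the "moved to heap: y" message. In the pointer variant the target is the caller's own variable.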

Example 6

package main

type S struct {
	M *int
}

func main() {
	var x S
	var i int
	ref(&i, &x)
}

func ref(y *int, z *S) {
	z.M = y
}

Output

┌[logan☮arch]-(~/share/bytectf/leak)
└> go run -gcflags '-m -l' 6.go
# command-line-arguments
./6.go:13:10: leaking param: y
./6.go:13:18: z does not escape
./6.go:9:9: moved to heap: i

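z does not escape, but i does. Two pointers now refer to i. Because y is stored through the pointer z, Go's escape analysis cannot track the relationship between the two parameters: it does not know that y ends up as a field of the struct z points to. It therefore treats y as leaking, and i has to be allocated on the heap.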
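
The cost of this conservative decision can be measured. The sketch below compares the call from Example 6 with the same assignment written directly in the caller. The helper names (viaCall, direct, viaCallAllocs, directAllocs) and the sink variable are made up for illustration, //go:noinline stands in for the -l flag, and testing.AllocsPerRun counts heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

type S struct {
	M *int
}

// sink keeps the compiler from discarding the results.
var sink int

// ref stores y through the pointer z, as in Example 6.
//
//go:noinline
func ref(y *int, z *S) {
	z.M = y
}

// viaCall links i into x through ref, so i is treated as escaping.
//
//go:noinline
func viaCall() int {
	var x S
	var i int
	ref(&i, &x)
	return *x.M
}

// direct performs the same assignment in place, so the compiler can see
// that neither x nor i outlives the function.
//
//go:noinline
func direct() int {
	var x S
	var i int
	x.M = &i
	return *x.M
}

// viaCallAllocs reports the average heap allocations per call of viaCall.
func viaCallAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = viaCall() })
}

// directAllocs reports the average heap allocations per call of direct.
func directAllocs() float64 {
	return testing.AllocsPerRun(1000, func() { sink = direct() })
}

func main() {
	fmt.Println("allocations per call via ref:", viaCallAllocs())
	fmt.Println("allocations per call direct: ", directAllocs())
}
```

On a typical toolchain the version that goes through ref allocates once per call, while the direct assignment does not allocate at all.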

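Summary

As the examples show, the Go compiler analyzes the code and decides at compile time which variables should be allocated on the stack and which on the heap.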

ref: https://www.bookstack.cn/read/For-learning-Go-Tutorial/src-chapter13-01.0.md
