Go source code analysis: how uber-go/goleak detects goroutine leaks

2023-09-06 19:26:33

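https://github.com/uber-go/goleak is a tool for detecting goroutine leaks. We first look at how to use it, then walk through its source code to see how it works. Let's start with an example that leaks a goroutine.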

package leak

func leak() {
  ch := make(chan struct{})
  go func() {
    ch <- struct{}{}
  }()
}
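
The goroutine above leaks because the channel is unbuffered and nobody ever receives from it, so the send blocks forever. As a rough illustration that does not depend on goleak, the sketch below compares runtime.NumGoroutine before and after a call. The names noLeak and extraGoroutines are hypothetical helpers introduced only for this example, and the 50ms wait is an arbitrary grace period.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leak mirrors the example above: the sender blocks forever because
// nobody receives from the unbuffered channel.
func leak() {
	ch := make(chan struct{})
	go func() {
		ch <- struct{}{}
	}()
}

// noLeak is a hypothetical fix: a buffer of one lets the send complete,
// so the goroutine can exit even without a receiver.
func noLeak() {
	ch := make(chan struct{}, 1)
	go func() {
		ch <- struct{}{}
	}()
}

// extraGoroutines reports how many goroutines remain after calling f,
// relative to the count before the call. It waits briefly so that
// short-lived goroutines have a chance to finish.
func extraGoroutines(f func()) int {
	before := runtime.NumGoroutine()
	f()
	time.Sleep(50 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("noLeak:", extraGoroutines(noLeak))
	fmt.Println("leak:", extraGoroutines(leak))
}
```

A plain count like this cannot say which goroutine leaked or where it was created, which is the information goleak adds by parsing stack traces.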

To check this package in a unit test, a single statement is enough:

defer goleak.VerifyNone(t)

package leak

import (
  "testing"

  "go.uber.org/goleak"
)


func Test_leak(t *testing.T) {
  defer goleak.VerifyNone(t)
  tests := []struct {
    name string
  }{
    // TODO: Add test cases.
    {},
  }
  for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
      leak()
    })
  }
}

Running the test produces the following output:


--- FAIL: Test_leak (0.45s)
    /Users/xiazemin/groutine/leak/leak_test.go:25: found unexpected goroutines:
        [Goroutine 22 in state chan send, with groutine/leak.leak.func1 on top of the stack:
        goroutine 22 [chan send]:
        groutine/leak.leak.func1()
          /Users/xiazemin/groutine/leak/leak.go:6 +0x2c
        created by groutine/leak.leak
          /Users/xiazemin/groutine/leak/leak.go:5 +0x6e
        ]
FAIL

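The output includes the stack of the leaked goroutine. With a large number of tests, adding this statement to every test is tedious. Instead, call goleak.VerifyTestMain(m) once in TestMain. Here is a complete example: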

package leak

import (
  "testing"

  "go.uber.org/goleak"
)

func TestMain(m *testing.M) {
  goleak.VerifyTestMain(m)
}

func Test_leakM(t *testing.T) {
  tests := []struct {
    name string
  }{
    // TODO: Add test cases.
    {},
  }
  for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
      leak()
    })
  }
}

Running the tests gives:

goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 6 in state chan send, with groutine/leak.leak.func1 on top of the stack:
goroutine 6 [chan send]:
groutine/leak.leak.func1()
  /Users/xiazemin/groutine/leak/leak.go:6 +0x2c
created by groutine/leak.leak
  /Users/xiazemin/groutine/leak/leak.go:5 +0x6e
]

Now that we have seen how it is used, let's look at the source. The library exposes two entry points:

func VerifyTestMain(m TestingM, options ...Option) 
func VerifyNone(t TestingT, options ...Option)

Internally the two are nearly identical and work in three steps:

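// 1. Merge the default options with the ones supplied by the caller.
opts := buildOpts(options...)

// Take the cleanup callback out of opts; Find rejects options that carry one.
var cleanup func(int)
cleanup, opts.cleanup = opts.cleanup, nil

// 2. Look for unexpected goroutines and report them as a test error.
if err := Find(opts); err != nil {
  t.Error(err)
}

// 3. Run the cleanup callback.
cleanup(0)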

Let's look at buildOpts first.

func buildOpts(options ...Option) *opts {
  opts := &opts{
    maxRetries: _defaultRetries,
    maxSleep:   100 * time.Millisecond,
  }
  opts.filters = append(opts.filters,
    isTestStack,
    isSyscallStack,
    isStdLibStack,
    isTraceStack,
  )
  for _, option := range options {
    option.apply(opts)
  }
  return opts
}

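It sets the maximum number of retries, the maximum sleep between retries, and the default filters. Let's go through the filters one by one.

1. Filter out goroutines that belong to the testing package. If the function at the top of the stack is testing.RunTests, testing.(*T).Run, or testing.(*T).Parallel, and the goroutine is blocked on a channel receive, it is ignored.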

func isTestStack(s stack.Stack) bool {
  // Until go1.7, the main goroutine ran RunTests, which started
  // the test in a separate goroutine and waited for that test goroutine
  // to end by waiting on a channel.
  // Since go1.7, a separate goroutine is started to wait for signals.
  // T.Parallel is for parallel tests, which are blocked until all serial
  // tests have run with T.Parallel at the top of the stack.
  switch s.FirstFunction() {
  case "testing.RunTests", "testing.(*T).Run", "testing.(*T).Parallel":
    // In pre1.7 and post-1.7, background goroutines started by the testing
    // package are blocked waiting on a channel.
    return strings.HasPrefix(s.State(), "chan receive")
  }
  return false
}

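2. Filter out goroutines that are blocked in a system call with runtime.goexit at the top of the stack. These typically appear when CGo is in use.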

func isSyscallStack(s stack.Stack) bool {
  // Typically runs in the background when code uses CGo:
  // https://github.com/golang/go/issues/16714
  return s.FirstFunction() == "runtime.goexit" && strings.HasPrefix(s.State(), "syscall")
}

3. Filter out background goroutines started by the standard library (os/signal).

func isStdLibStack(s stack.Stack) bool {
  // Importing os/signal starts a background goroutine.
  // The name of the function at the top has changed between versions.
  if f := s.FirstFunction(); f == "os/signal.signal_recv" || f == "os/signal.loop" {
    return true
  }

  // Using signal.Notify will start a runtime goroutine.
  return strings.Contains(s.Full(), "runtime.ensureSigM")
}

4. Filter out the trace goroutine (any stack containing runtime.ReadTrace).


func isTraceStack(s stack.Stack) bool {
  return strings.Contains(s.Full(), "runtime.ReadTrace")
}

5. Beyond these defaults, callers can pass their own options.
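
The option.apply(opts) loop in buildOpts follows the functional-options pattern. The sketch below shows how such an option could append to a settings struct. Every name in it (settings, optionFunc, IgnoreTop, WithMaxRetries, build) is hypothetical and stands in for the library's private types, and the default of 20 retries is an arbitrary value chosen for the example.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for the library's private opts struct.
type settings struct {
	maxRetries int
	ignoredTop []string
}

// Option mutates settings, mirroring the option.apply(opts) call in buildOpts.
type Option interface {
	apply(*settings)
}

// optionFunc adapts a plain function to the Option interface.
type optionFunc func(*settings)

func (f optionFunc) apply(s *settings) { f(s) }

// IgnoreTop is a hypothetical option that records a top-of-stack
// function name whose goroutines should be ignored.
func IgnoreTop(name string) Option {
	return optionFunc(func(s *settings) {
		s.ignoredTop = append(s.ignoredTop, name)
	})
}

// WithMaxRetries is a hypothetical option that overrides the default retry count.
func WithMaxRetries(n int) Option {
	return optionFunc(func(s *settings) { s.maxRetries = n })
}

// build starts from defaults and lets each option adjust them in order.
func build(options ...Option) *settings {
	s := &settings{maxRetries: 20} // arbitrary default for this sketch
	for _, o := range options {
		o.apply(s)
	}
	return s
}

func main() {
	s := build(IgnoreTop("example.com/pkg.worker"), WithMaxRetries(3))
	fmt.Println(s.maxRetries, s.ignoredTop)
}
```

Because defaults are set before the options run, a caller only needs to name the settings it wants to change.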

Next, the Find function:

func Find(options ...Option) error {
  cur := stack.Current().ID()

  opts := buildOpts(options...)
  if opts.cleanup != nil {
    return errors.New("Cleanup can only be passed to VerifyNone or VerifyTestMain")
  }
  var stacks []stack.Stack
  retry := true
  for i := 0; retry; i++ {
    stacks = filterStacks(stack.All(), cur, opts)

    if len(stacks) == 0 {
      return nil
    }
    retry = opts.retry(i)
  }

  return fmt.Errorf("found unexpected goroutines:\n%s", stacks)
}

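Find first records the ID of the current goroutine, then captures the stacks of all goroutines and filters them: the current goroutine is skipped and the filters described above are applied. If no stacks remain, there is no leak and it returns nil. Otherwise it retries for as long as opts.retry allows, and if stacks still remain it returns an error listing them.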

The current goroutine is obtained like this:

func Current() Stack {
  return getStacks(false)[0]
}

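which calls getStacks (abridged excerpt):

func getStacks(all bool) []Stack {
  stackReader := bufio.NewReader(bytes.NewReader(getStackBuffer(all)))
  for {
    line, err := stackReader.ReadString('\n')
    // ... (EOF and error handling omitted in this excerpt)
    if strings.HasPrefix(line, "goroutine ") {
      // A "goroutine N [state]:" header line starts a new stack.
      id, goState := parseGoStackHeader(line)
      curStack = &Stack{
        id:        id,
        state:     goState,
        fullStack: &bytes.Buffer{},
      }
    }
    // ... (appending lines to curStack and collecting the stacks omitted)
  }
}

It reads the goroutine stack dump line by line, parses it, and stores the result in Stack structs for later use.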

The stack dump itself comes from the runtime function runtime.Stack. Passing false as the second argument captures only the current goroutine; passing true captures all goroutines.

func getStackBuffer(all bool) []byte {
  for i := _defaultBufferSize; ; i *= 2 {
    buf := make([]byte, i)
    if n := runtime.Stack(buf, all); n < i {
      return buf[:n]
    }
  }
}
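
To see what this buffer contains, the following sketch uses the same doubling strategy and then counts the "goroutine " header lines in the dump. The helper names stackDump and countHeaders and the 1024-byte starting size are choices made for this example, not taken from the library.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// stackDump doubles the buffer until runtime.Stack reports that the
// trace fit with room to spare, then returns the used portion.
func stackDump(all bool) []byte {
	for size := 1024; ; size *= 2 {
		buf := make([]byte, size)
		if n := runtime.Stack(buf, all); n < size {
			return buf[:n]
		}
	}
}

// countHeaders counts the lines that start a goroutine's trace,
// such as "goroutine 1 [running]:".
func countHeaders(dump []byte) int {
	count := 0
	for _, line := range strings.Split(string(dump), "\n") {
		if strings.HasPrefix(line, "goroutine ") {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println("current only:", countHeaders(stackDump(false)))
	fmt.Println("all goroutines:", countHeaders(stackDump(true)))
}
```

The n < size check is needed because runtime.Stack truncates silently: a return value equal to the buffer length means the trace may have been cut off.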

Then the goroutine ID is parsed from the header line (excerpt):

func parseGoStackHeader(line string) (goroutineID int, state string) {
  line = strings.TrimSuffix(line, ":\n")
  parts := strings.SplitN(line, " ", 3)
  if len(parts) != 3 {
    panic(fmt.Sprintf("unexpected stack header format: %q", line))
  }

  id, err := strconv.Atoi(parts[1])
  // ... (remainder omitted in this excerpt)
}
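
As a self-contained illustration of this parsing step, here is a sketch that splits a header such as "goroutine 22 [chan send]:" into its ID and state. It returns an error instead of panicking; the name parseHeader and that error-handling choice are assumptions made for the example, not the library's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseHeader splits a header such as "goroutine 22 [chan send]:\n"
// into the goroutine ID and its state.
func parseHeader(line string) (id int, state string, err error) {
	line = strings.TrimSuffix(strings.TrimSpace(line), ":")
	// Split into at most three parts so that a state containing
	// spaces stays in one piece.
	parts := strings.SplitN(line, " ", 3)
	if len(parts) != 3 || parts[0] != "goroutine" {
		return 0, "", fmt.Errorf("unexpected stack header: %q", line)
	}
	id, err = strconv.Atoi(parts[1])
	if err != nil {
		return 0, "", fmt.Errorf("bad goroutine ID %q: %w", parts[1], err)
	}
	// Strip the surrounding brackets from "[chan send]".
	state = strings.TrimSuffix(strings.TrimPrefix(parts[2], "["), "]")
	return id, state, nil
}

func main() {
	id, state, err := parseHeader("goroutine 22 [chan send]:\n")
	fmt.Println(id, state, err)
}
```

The state can carry extra detail after the basic state, for example "chan receive, 5 minutes", which is why the filters shown earlier compare it with strings.HasPrefix rather than equality.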

Fetching the stacks of all goroutines works the same way:

func All() []Stack {
  return getStacks(true)
}

Next comes the filtering step, which skips the current goroutine and applies the filters:

func filterStacks(stacks []stack.Stack, skipID int, opts *opts) []stack.Stack {
  filtered := stacks[:0]
  for _, stack := range stacks {
    // Always skip the running goroutine.
    if stack.ID() == skipID {
      continue
    }
    // Run any default or user-specified filters.
    if opts.filter(stack) {
      continue
    }
    filtered = append(filtered, stack)
  }
  return filtered
}
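
The article does not show opts.filter, so the sketch below assumes the simplest behavior consistent with the code above: a stack is dropped as soon as any filter matches it. The stackInfo type and the helpers ignoreTopFunc, ignoreStatePrefix, matchesAny, and keepUnexpected are hypothetical stand-ins that work on plain data instead of real stack traces.

```go
package main

import (
	"fmt"
	"strings"
)

// stackInfo is a hypothetical, simplified stand-in for a parsed stack.
type stackInfo struct {
	ID      int
	State   string
	TopFunc string
}

// filter reports whether a stack should be ignored.
type filter func(stackInfo) bool

func ignoreTopFunc(name string) filter {
	return func(s stackInfo) bool { return s.TopFunc == name }
}

func ignoreStatePrefix(prefix string) filter {
	return func(s stackInfo) bool { return strings.HasPrefix(s.State, prefix) }
}

// matchesAny reports whether at least one filter matches s.
func matchesAny(s stackInfo, filters []filter) bool {
	for _, f := range filters {
		if f(s) {
			return true
		}
	}
	return false
}

// keepUnexpected drops the goroutine with skipID and every stack matched
// by a filter. It reuses the input's backing array, like stacks[:0] above.
func keepUnexpected(stacks []stackInfo, skipID int, filters ...filter) []stackInfo {
	kept := stacks[:0]
	for _, s := range stacks {
		if s.ID == skipID || matchesAny(s, filters) {
			continue
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	stacks := []stackInfo{
		{ID: 1, State: "running", TopFunc: "main.main"},
		{ID: 2, State: "chan receive", TopFunc: "testing.(*T).Run"},
		{ID: 3, State: "syscall", TopFunc: "runtime.goexit"},
		{ID: 22, State: "chan send", TopFunc: "groutine/leak.leak.func1"},
	}
	left := keepUnexpected(stacks, 1,
		ignoreTopFunc("testing.(*T).Run"),
		ignoreStatePrefix("syscall"),
	)
	fmt.Println(len(left), left[0].ID)
}
```

Filtering in place avoids a second allocation, at the cost of overwriting the input slice, which is acceptable here because the unfiltered list is not used again.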

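To summarize: after the tests finish, goleak captures the stacks of all goroutines, filters out the current goroutine along with goroutines belonging to the testing package, system calls, the standard library, and tracing, and checks whether any stacks remain. If some do, it reports a goroutine leak and prints the state and stack of each leaked goroutine. That is the basic principle behind the whole library.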