runtime._ExternalCode Cpu使用率过高,高达80%

I wrote a tcp handler in golang, about 300 connections per second. There was no problem with the program just released to production. But after running for about 10 days, I see that the cpu usage is up to 100%. I used the golang tool "go tool pprof" to get the information of cpu usage :

File: gateway-w
Type: cpu
Time: Nov 7, 2018 at 5:38pm (CST)
Duration: 30.14s, Total samples = 30.13s (  100%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 27.42s, 91.01% of 30.13s total
Dropped 95 nodes (cum <= 0.15s)
Showing top 10 nodes out of 28
      flat  flat%   sum%        cum   cum%
    24.69s 81.94% 81.94%     24.69s 81.94%  runtime._ExternalCode /usr/local/go/src/runtime/proc.go
     0.57s  1.89% 83.84%      0.57s  1.89%  runtime.lock /usr/local/go/src/runtime/lock_futex.go
     0.56s  1.86% 85.70%      0.56s  1.86%  runtime.unlock /usr/local/go/src/runtime/lock_futex.go
     0.26s  0.86% 86.56%      5.37s 17.82%  gateway-w/connect/connect-tcp.tcpStartSession /go/src/gateway-w/connect/connect-tcp/tcp_framework.go
     0.25s  0.83% 87.39%      1.67s  5.54%  net.(*conn).Read /usr/local/go/src/net/net.go
     0.24s   0.8% 88.18%      1.41s  4.68%  net.(*netFD).Read /usr/local/go/src/net/fd_unix.go
     0.23s  0.76% 88.95%      0.23s  0.76%  runtime.nanotime /usr/local/go/src/runtime/sys_linux_amd64.s
     0.22s  0.73% 89.68%      0.22s  0.73%  internal/poll.(*fdMutex).incref /usr/local/go/src/internal/poll/fd_mutex.go
     0.21s   0.7% 90.38%      0.21s   0.7%  internal/poll.(*fdMutex).rwunlock /usr/local/go/src/internal/poll/fd_mutex.go
     0.19s  0.63% 91.01%      0.19s  0.63%  internal/poll.(*fdMutex).rwlock /usr/local/go/src/internal/poll/fd_mutex.go

my tcpHandle code is like this:

func tcpStartSession(conn net.Conn) {

    defer closeTcp(conn)

    var (last, n int
        err     error
        buff    []byte
    )

    last, n, err, buff =
        0, 0, nil,
        make([]byte, MAX_PACKET_LEN)

    for {
        // set read timeout
        conn.SetReadDeadline(time.Now().Add(time.Duration(tcpTimeOutSec) * time.Second))
        n, err = conn.Read(buff[last:])

        if err != nil {
            log.Info("tcp read error maybe timeout , ", err)
            break
        }
        if n == 0 {
            log.Debug("empty packet, continue")
            continue
        }
        log.Debug("read bytes ", n)

        log.Info("get a raw package:", hex.EncodeToString(buff[:last+n]))
        last += n
        ...

        for {
            if last == 0 {
                break
            }

            ret, err := protoHandle.IsWhole(buff[:last])
            if err != nil {
                log.Warn("proto handle check iswhole error", err)
            }

            log.Debug("rest buffer len = %d
", ret)
            if ret < 0 {
                //wait for more tcp fragment.
                break
            }
            packetLen := last - ret
            packetBuf := make([]byte, packetLen)
            copy(packetBuf, buff[:packetLen])

            last = ret

            if last > 0 {
                copy(buff, buff[packetLen:packetLen+last])
            }

            ...

        }
    }
}

I can't understand what runtime._ExternalCode means. This is the function inside golang.

my golang version is :go version go1.9.2 linux/amd64

my program is running on docker

my docker version is : 1.12.6

I hope someone can help me. Thank you very much!


I tried to upgrade the golang version to 1.10.3. After running for more than half a year, there was no problem. Recently, the same problem occurred, but I have not changed the program code. I suspect that there is a problem with this code:

conn.SetReadDeadline(time.Now().Add(time.Duration(tcpTimeOutSec) * time.Second))

Need your help, thank you.

drhzn3911
drhzn3911 谢谢。我试图在docker中更改centos版本和golang版本,至少直到现在它仍在正常运行。
接近 2 年之前 回复
drj14664
drj14664 也许ExternalCode是底层的unix读取syscall。也许代码在热循环中旋转?可能最后一次是MAX_PACKET_LEN吗?
接近 2 年之前 回复
dph6308
dph6308 显然,“ExternalCode意味着花费在本机动态库中的时间。”性能
接近 2 年之前 回复
douzhimao8656
douzhimao8656 谢谢!,但是我的程序未使用CGO_ENABLED=0-_-..编译,并且什么工具可以在C库的各个部分进行概要分析?
接近 2 年之前 回复
dqd3690
dqd3690 程序使用CGO_ENABLED=0编译了吗?如果不是,那么问题可能出在程序的C语言部分。pprof无法剖析C库的各个部分
接近 2 年之前 回复

1个回答



已经确认您的程序不是 </ em>用CGO_ENABLED = 0 </ p>
\构建的 n

问题(可能是在程序的C go部分中)。 pprof无法在C库的各个部分中进行剖析</ p>

我相信还有其他一些东西像“ time”一样算作“ _External”。现在在某些系统上</ p>
</ div >

展开原文

原文

As you have confirmed that your program was not built with CGO_ENABLED=0

The problem is (probably) in the C go parts of the program. pprof can't profile inside sections of C libraries

I believe there are some other things that count as "_External" like time.Now on some systems

Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
立即提问
相关内容推荐