duanhuan7750 2013-10-18 17:35
浏览 30
已采纳

使用mgo驱动程序,Mongo连接计数每10秒爬行一次

We monitor our mongoDB connection count using this:

http://godoc.org/labix.org/v2/mgo#GetStats

However, we have been facing a strange connection leak issue where the connectionCount creeps up consistently by 1 more open connection per 10 seconds. (That's regardless whether there is any requests). I can spin up a server in localhost, leave it there, do nothing, the conectionCount will still creep up. Connection count eventually creeps up to a few thousand and it kills the app/db then and we have to restart the app.

This might not be enough information for you to debug. Does anyone have any ideas, connection leaks that you have dealt with in the past. How did you debug it? What are some of the way that I can debug this.

We have tried a few things, we scanned our code base for any code that could open a connection and put counters/debugging statements there, and so far we have found no leak. It is almost like there is a leak in a library somewhere.

This is a bug in a branch that we have been working on and there have been a few hundred commits into it. We have done a diff between this and master and couldn't find why there is a connection leak in this branch.

As an example, there is the dataset that I am referencing:

Clusters:      1   
MasterConns:   9936      <-- creeps up 1 per second
SlaveConns:    -7359     <-- why is this negative?
SentOps:       42091780   
ReceivedOps:   38684525   
ReceivedDocs:  39466143   
SocketsAlive:  78        <-- what is the difference between the socket count and the master conns count?
SocketsInUse:  1231   
SocketRefs:    1231

MasterConns is the number that creeps up one per 10 second. I am not entirely sure what the other numbers can mean.

  • 写回答

1条回答 默认 最新

  • duanju8431 2013-10-18 18:32
    关注

    MasterConns cannot tell you whether there's a leak or not, because it does not decrease. The field indicates the number of connections made since the last statistics reset, not the number of sockets that are currently in use. The latter is indicated by the SocketsAlive field.

    To give you some additional relief on the subject, every single test in the mgo suite is wrapped around logic that ensures that statistics show sane values after the test finishes, so that potential leaks don't go unnoticed. That's the main reason why such statistics collection system was introduced.

    Then, the reason why you see this number increasing every 10 seconds or so is due to the internal activity that happens to learn the status of the cluster. That said, this behavior was recently changed so that it doesn't establish new connections and instead picks existent sockets from the pool, so I believe you're not using the latest release.

    Having SlaveConns negative looks like a bug. There's a small edge case about statistics collection for connections made, because we cannot tell whether a given server is a master or a slave before we've talked to it, so there might be an uncovered path. If you still see that behavior after you upgrade, please report the issue and I'll be happy to look at it.

    SocketsInUse is the number of sockets that are still being referenced by one or more sessions, whether they are alive (the connection is established) or not. SocketsAlive is, again, the real number of live TCP connections. The delta between the two indicates that a number of sessions were not closed. This may be okay, if they are still being held in memory by the application and will eventually be closed, or it may be a leak if a session.Close operation was missed by the application.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 素材场景中光线烘焙后灯光失效
  • ¥15 请教一下各位,为什么我这个没有实现模拟点击
  • ¥15 执行 virtuoso 命令后,界面没有,cadence 启动不起来
  • ¥50 comfyui下连接animatediff节点生成视频质量非常差的原因
  • ¥20 有关区间dp的问题求解
  • ¥15 多电路系统共用电源的串扰问题
  • ¥15 slam rangenet++配置
  • ¥15 有没有研究水声通信方面的帮我改俩matlab代码
  • ¥15 ubuntu子系统密码忘记
  • ¥15 保护模式-系统加载-段寄存器