2019-01-31 08:19


I have a small daemon written in Go that runs in a loop and does some work. I've discovered that the daemon behaves differently depending on whether it is compiled with CGO_ENABLED=1 or CGO_ENABLED=0. For example, with CGO_ENABLED=1 (the default), the program's VSZ bloats to 1-2 GB within a short period of time (within an hour). With CGO_ENABLED=0, VSZ stays the same over a long period of time (days). Look at the numbers below:

CGO_ENABLED=1 (daemon has been running for 5 minutes)

$ grep -E 'VmSize|VmRSS' /proc/14916/status
VmSize:    1084052 kB
VmRSS:       12524 kB

CGO_ENABLED=0 (daemon has been running for ~30 hours)

$ grep -E 'VmSize|VmRSS' /proc/15160/status
VmSize:    110232 kB
VmRSS:       9756 kB

The daemon does not use any cgo-dependent packages or functions. Other programs written in Go show the same behaviour. I know the difference between VSZ and RSS, and I'm interested in the nature of this behaviour: why does a program compiled with CGO_ENABLED=1 ask the kernel for so much memory?
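For reference, the same two figures can be sampled from inside a process rather than with grep. The following is only a minimal sketch, assuming Linux and the usual /proc/<pid>/status layout; the helper name parseStatus is hypothetical and not part of the daemon.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStatus (hypothetical helper) extracts the VmSize and VmRSS values,
// in kB, from the text of a /proc/<pid>/status file.
func parseStatus(text string) map[string]int64 {
	out := make(map[string]int64)
	for _, line := range strings.Split(text, "\n") {
		key, rest, ok := strings.Cut(line, ":")
		if !ok || (key != "VmSize" && key != "VmRSS") {
			continue
		}
		fields := strings.Fields(rest) // e.g. ["1084052", "kB"]
		if len(fields) == 0 {
			continue
		}
		if n, err := strconv.ParseInt(fields[0], 10, 64); err == nil {
			out[key] = n
		}
	}
	return out
}

func main() {
	// Assumption: running on Linux, where /proc/self/status exists.
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read status:", err)
		return
	}
	m := parseStatus(string(data))
	fmt.Printf("VmSize: %d kB, VmRSS: %d kB\n", m["VmSize"], m["VmRSS"])
}
```

Building this once with CGO_ENABLED=1 and once with CGO_ENABLED=0 gives a small test case for comparing the two modes.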

I would prefer answers that are not in the form "don't worry, VSZ is just virtual memory, and it's not really used by the process".


  • dsdapobp26141, 2 years ago

    I could make an educated guess.

    As you probably know, the compiler of the "reference" Go implementation (historically dubbed "gc"; the one available for download from the main site) by default produces statically-linked binaries. This means such binaries rely only on the so-called "system calls" provided by the OS kernel and do not depend on any shared libraries provided by the OS (or 3rd parties).

    On Linux-based platforms, this is not completely true: in the default setting (building on Linux for Linux, i.e., not cross-compiling) the generated binary is actually linked with libc and with libpthread (indirectly, via libc).

    This "twist" stems from two cases in which the Go standard library needs to interact with the OS:

    1. DNS resolving, which is needed by the net package.
    2. User and group lookup, which is needed by the os/user package.

    The problem here is two-fold:

    • Linux itself (that is, the kernel, not the whole OS) does not provide any means to carry out those tasks.

    • Any typical UNIX-like system, since forever, provides for both those tasks using a special facility called "NSS", which is the "Name-Service Switch"¹.

      The NSS provides for pluggable modules which can serve as the databases offering queries of particular type: DNS, user/group database, and more (such as well-known names for "services" etc). A supposedly rather common example of a non-standard provider for the user/group databases is a local service which contacts an LDAP server.

    On a typical GNU/Linux-based OS the NSS is implemented by libc (on less typical systems it might be provided by a separate shared library but this does not change much).

    Since — again, typically, — the libc is a rather stable library in terms of its API (it even provides versioned symbols to be future-proof), the Go authors rightfully decided that linking against libc to import a minimal subset of symbols (mostly getaddrinfo, getnameinfo, getpwnam_r etc) is OK to be done by default as it's safe for 99% of cases, and when it isn't, those who have to tackle these cases usually know what to do anyway.

    So, by default cgo is enabled and used to implement these lookups using NSS.

    If cgo is disabled, the Go compiler instead links in its own fallback implementations, which try to mimic a subset of what a full-blown NSS implementation does (i.e. parse /etc/resolv.conf and use the information from it to directly query the DNS servers listed there; parse /etc/passwd and /etc/group to serve the user/group database queries).
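The pure-Go DNS fallback can also be requested explicitly in a cgo-enabled build, which makes it easier to compare the two lookup paths. The sketch below uses the PreferGo field of net.Resolver; the helper name newPureGoResolver and the "localhost" lookup are illustrative assumptions only. The netgo and osusergo build tags, and GODEBUG=netdns=go, are related ways to select the Go implementations.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newPureGoResolver (hypothetical helper) returns a resolver that asks the
// net package to use its built-in Go DNS client instead of libc.
func newPureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// "localhost" is only an example name; the Go resolver would normally
	// answer it from /etc/hosts without contacting a DNS server.
	addrs, err := newPureGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```

This changes only which resolver handles the lookup; it does not by itself stop the binary from being linked against libc when cgo is enabled.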

    As you can see, in the default case,

    • The libc gets mapped in, and
    • It is initialized and uses some memory for its own needs — such as obvious caching of the data the NSS calls return.

    Conversely, when cgo is disabled, the above two things do not happen. You have more stdlib code linked in statically, but it looks like the default case still outweighs the latter one in terms of the overall cumulative RSS usage.
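One way to check whether libc really is mapped into a given build is to list the shared objects in the process's own memory map. This is a rough sketch, assuming Linux and the usual six-column /proc/<pid>/maps layout; the helper name sharedObjects is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// sharedObjects (hypothetical helper) returns the distinct file paths in
// /proc/<pid>/maps text that look like shared libraries (contain ".so").
func sharedObjects(maps string) []string {
	seen := make(map[string]bool)
	var libs []string
	for _, line := range strings.Split(maps, "\n") {
		// Columns: address perms offset dev inode [pathname]
		fields := strings.Fields(line)
		if len(fields) < 6 {
			continue // anonymous mapping, no backing file
		}
		path := fields[5]
		if strings.Contains(path, ".so") && !seen[path] {
			seen[path] = true
			libs = append(libs, path)
		}
	}
	return libs
}

func main() {
	// Assumption: running on Linux, where /proc/self/maps exists.
	data, err := os.ReadFile("/proc/self/maps")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read maps:", err)
		return
	}
	for _, lib := range sharedObjects(string(data)) {
		fmt.Println(lib)
	}
}
```

Built with CGO_ENABLED=0, this would be expected to print nothing, since a fully static binary maps no shared libraries. Built with the defaults, any mapped libc would appear in the list.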

    Consider studying the output of this query for additional fun ;-)

    ¹ not to be confused with Mozilla's libnss.
