Why does Go use cgo on Windows for a simple File.Write?

While rewriting a simple program from C# to Go, I found the resulting executable 3 to 4 times slower. In particular, the Go version uses 3 to 4 times more CPU. This is surprising because the code does a lot of I/O and is not supposed to consume a significant amount of CPU.

I made a very simple version that only does sequential writes, and benchmarked it. I ran the same benchmarks on Windows 10 and Linux (Debian Jessie). The times can't be compared (not the same systems, disks, ...) but the result is interesting.

I'm using the same Go version on both platforms: 1.6

On Windows, os.File.Write uses cgo (see runtime.cgocall below), but not on Linux. Why?

Here is the disk.go program:

    package main

    import (
        "crypto/rand"
        "fmt"
        "os"
        "time"
    )

    const (
        // size of the test file
        fullSize = 268435456
        // size of read/write per call
        partSize = 128
        // path of temporary test file
        filePath = "./bigfile.tmp"
    )

    func main() {
        buffer := make([]byte, partSize)

        seqWrite := func() error {
            return sequentialWrite(filePath, fullSize, buffer)
        }

        err := fillBuffer(buffer)
        panicIfError(err)
        duration, err := durationOf(seqWrite)
        panicIfError(err)
        fmt.Printf("Duration : %v\n", duration)
    }

    // It's just a test ;)
    func panicIfError(err error) {
        if err != nil {
            panic(err)
        }
    }

    func durationOf(f func() error) (time.Duration, error) {
        startTime := time.Now()
        err := f()
        return time.Since(startTime), err
    }

    func fillBuffer(buffer []byte) error {
        _, err := rand.Read(buffer)
        return err
    }

    func sequentialWrite(filePath string, fullSize int, buffer []byte) error {
        desc, err := os.OpenFile(filePath, os.O_WRONLY|os.O_CREATE, 0666)
        if err != nil {
            return err
        }
        defer func() {
            desc.Close()
            err := os.Remove(filePath)
            panicIfError(err)
        }()

        var totalWrote int
        for totalWrote < fullSize {
            wrote, err := desc.Write(buffer)
            totalWrote += wrote
            if err != nil {
                return err
            }
        }

        return nil
    }

The benchmark test (disk_test.go):

    package main

    import (
        "testing"
    )

    // go test -bench SequentialWrite -cpuprofile=cpu.out
    // Windows : go tool pprof -text -nodecount=10 ./disk.test.exe cpu.out
    // Linux : go tool pprof -text -nodecount=10 ./disk.test cpu.out
    func BenchmarkSequentialWrite(b *testing.B) {
        buffer := make([]byte, partSize)
        // Repeat the workload b.N times, as the testing package expects.
        for i := 0; i < b.N; i++ {
            err := sequentialWrite(filePath, fullSize, buffer)
            panicIfError(err)
        }
    }

The Windows result (with cgo):

    11.68s of 11.95s total (97.74%)
    Dropped 18 nodes (cum <= 0.06s)
    Showing top 10 nodes out of 26 (cum >= 0.09s)
          flat  flat%   sum%        cum   cum%
        11.08s 92.72% 92.72%     11.20s 93.72%  runtime.cgocall
         0.11s  0.92% 93.64%      0.11s  0.92%  runtime.deferreturn
         0.09s  0.75% 94.39%     11.45s 95.82%  os.(*File).write
         0.08s  0.67% 95.06%      0.16s  1.34%  runtime.deferproc.func1
         0.07s  0.59% 95.65%      0.07s  0.59%  runtime.newdefer
         0.06s   0.5% 96.15%      0.28s  2.34%  runtime.systemstack
         0.06s   0.5% 96.65%     11.25s 94.14%  syscall.Write
         0.05s  0.42% 97.07%      0.07s  0.59%  runtime.deferproc
         0.04s  0.33% 97.41%     11.49s 96.15%  os.(*File).Write
         0.04s  0.33% 97.74%      0.09s  0.75%  syscall.(*LazyProc).Find

The Linux result (without cgo):

    5.04s of 5.10s total (98.82%)
    Dropped 5 nodes (cum <= 0.03s)
    Showing top 10 nodes out of 19 (cum >= 0.06s)
          flat  flat%   sum%        cum   cum%
         4.62s 90.59% 90.59%      4.87s 95.49%  syscall.Syscall
         0.09s  1.76% 92.35%      0.09s  1.76%  runtime/internal/atomic.Cas
         0.08s  1.57% 93.92%      0.19s  3.73%  runtime.exitsyscall
         0.06s  1.18% 95.10%      4.98s 97.65%  os.(*File).write
         0.04s  0.78% 95.88%      5.10s   100%  _/home/sam/Provisoire/go-disk.sequentialWrite
         0.04s  0.78% 96.67%      5.05s 99.02%  os.(*File).Write
         0.04s  0.78% 97.45%      0.04s  0.78%  runtime.memclr
         0.03s  0.59% 98.04%      0.08s  1.57%  runtime.exitsyscallfast
         0.02s  0.39% 98.43%      0.03s  0.59%  os.epipecheck
         0.02s  0.39% 98.82%      0.06s  1.18%  runtime.casgstatus

Answered by douzi2749, 2016-03-04 17:00
Go does not perform file I/O itself; it delegates the task to the operating system. See Go's operating-system-dependent syscall packages.

Linux and Windows are different operating systems with different OS ABIs. For example, Linux uses syscalls via syscall.Syscall and Windows uses Windows DLLs. On Windows, the DLL call is a C call. It doesn't use cgo. It does go through the same dynamic C pointer check used by cgo, runtime.cgocall. There is no runtime.wincall alias.

In summary, different operating systems have different OS call mechanisms.

Command cgo

Passing pointers

Go is a garbage collected language, and the garbage collector needs to know the location of every pointer to Go memory. Because of this, there are restrictions on passing pointers between Go and C.

In this section the term Go pointer means a pointer to memory allocated by Go (such as by using the & operator or calling the predefined new function) and the term C pointer means a pointer to memory allocated by C (such as by a call to C.malloc). Whether a pointer is a Go pointer or a C pointer is a dynamic property determined by how the memory was allocated; it has nothing to do with the type of the pointer.

Go code may pass a Go pointer to C provided the Go memory to which it points does not contain any Go pointers. The C code must preserve this property: it must not store any Go pointers in Go memory, even temporarily. When passing a pointer to a field in a struct, the Go memory in question is the memory occupied by the field, not the entire struct. When passing a pointer to an element in an array or slice, the Go memory in question is the entire array or the entire backing array of the slice.

C code may not keep a copy of a Go pointer after the call returns.

A Go function called by C code may not return a Go pointer. A Go function called by C code may take C pointers as arguments, and it may store non-pointer or C pointer data through those pointers, but it may not store a Go pointer in memory pointed to by a C pointer. A Go function called by C code may take a Go pointer as an argument, but it must preserve the property that the Go memory to which it points does not contain any Go pointers.

Go code may not store a Go pointer in C memory. C code may store Go pointers in C memory, subject to the rule above: it must stop storing the Go pointer when the C function returns.

These rules are checked dynamically at runtime. The checking is controlled by the cgocheck setting of the GODEBUG environment variable. The default setting is GODEBUG=cgocheck=1, which implements reasonably cheap dynamic checks. These checks may be disabled entirely using GODEBUG=cgocheck=0. Complete checking of pointer handling, at some cost in run time, is available via GODEBUG=cgocheck=2.

It is possible to defeat this enforcement by using the unsafe package, and of course there is nothing stopping the C code from doing anything it likes. However, programs that break these rules are likely to fail in unexpected and unpredictable ways.

"These rules are checked dynamically at runtime."

Benchmarks:

To paraphrase, there are lies, damn lies, and benchmarks.

For valid comparisons across operating systems you need to run on identical hardware. Differences in CPU, memory, and disk type (spinning rust versus solid state) all affect I/O results. I dual-boot Linux and Windows on the same machine.

Run benchmarks at least three times back-to-back. Operating systems try to be smart, for example by caching I/O. Languages that use virtual machines need warm-up time. And so on.

Know what you are measuring. If you are doing sequential I/O, you spend almost all your time in the operating system. Have you turned off malware protection? And so on.

And so on.

Here are some results for disk.go from the same machine using dual-boot Windows and Linux.

Windows:

    >go build disk.go
    >/TimeMem disk
    Duration : 18.3300322s
    Elapsed time : 18.38
    Kernel time : 13.71 (74.6%)
    User time : 4.62 (25.1%)

Linux:

    $ go build disk.go
    $ time ./disk
    Duration : 18.54350723s
    real 0m18.547s
    user 0m2.336s
    sys 0m16.236s

Effectively, they are the same: about 18 seconds for disk.go on both. There is just some variation between operating systems in what is counted as user time and what is counted as kernel or system time. Elapsed or real time is the same.

In your tests, kernel or system time was 93.72% runtime.cgocall versus 95.49% syscall.Syscall.