dongwen1909 2016-12-30 18:05

Is binary.Read slow?

I'm rewriting a small, old C project in Go (to learn Go).

The project basically reads some binary data from a file, filters that data, then prints it to stdout.

The main part of the code looks like this (omitting error handling):

type netFlowRow struct {
    Timestamp uint32
    Srcip     [4]byte
    Dstip     [4]byte
    Proto     uint16
    Srcport   uint16
    Dstport   uint16
    Pkt       uint32
    Size      uint64
}

func main() {
    // ...
    file, _ := os.Open(path)
    for j := 0; j < infoRow.Count; j++ {
        netRow := netFlowRow{}
        binary.Read(file, binary.BigEndian, &netRow)

        // ...
        fmt.Printf("%v", netRow)
    }
}

After a naive rewrite, the Go version ran more than 10 times slower than the C version (~40s vs 2-3s). I profiled it with pprof, which showed me this:

(pprof) top10
39.96s of 40.39s total (98.94%)
Dropped 71 nodes (cum <= 0.20s)
Showing top 10 nodes out of 11 (cum >= 39.87s)
      flat  flat%   sum%        cum   cum%
    39.87s 98.71% 98.71%     39.87s 98.71%  syscall.Syscall
     0.09s  0.22% 98.94%     40.03s 99.11%  encoding/binary.Read
         0     0% 98.94%     39.87s 98.71%  io.ReadAtLeast
         0     0% 98.94%     39.87s 98.71%  io.ReadFull
         0     0% 98.94%     40.03s 99.11%  main.main
         0     0% 98.94%     39.87s 98.71%  os.(*File).Read
         0     0% 98.94%     39.87s 98.71%  os.(*File).read
         0     0% 98.94%     40.21s 99.55%  runtime.goexit
         0     0% 98.94%     40.03s 99.11%  runtime.main
         0     0% 98.94%     39.87s 98.71%  syscall.Read

Am I reading this right? Is syscall.Syscall basically the main time consumer? Is it where the reading from file is going on?

Update: I switched to bufio.Reader and got this profile:

(pprof) top10
34.16s of 36s total (94.89%)
Dropped 99 nodes (cum <= 0.18s)
Showing top 10 nodes out of 33 (cum >= 0.56s)
      flat  flat%   sum%        cum   cum%
    31.99s 88.86% 88.86%        32s 88.89%  syscall.Syscall
     0.43s  1.19% 90.06%      0.64s  1.78%  runtime.mallocgc
     0.39s  1.08% 91.14%      1.06s  2.94%  encoding/binary.(*decoder).value
     0.28s  0.78% 91.92%      0.99s  2.75%  reflect.(*structType).Field
     0.28s  0.78% 92.69%      0.28s  0.78%  runtime.duffcopy
     0.24s  0.67% 93.36%      1.64s  4.56%  encoding/binary.sizeof
     0.22s  0.61% 93.97%     34.51s 95.86%  encoding/binary.Read
     0.22s  0.61% 94.58%      0.22s  0.61%  runtime.mach_semaphore_signal
     0.07s  0.19% 94.78%      1.28s  3.56%  reflect.(*rtype).Field
     0.04s  0.11% 94.89%      0.56s  1.56%  runtime.newobject

1 answer

  • duanfanta6741 2016-12-30 18:31

    binary.Read is slower because it uses reflection. I would suggest benchmarking a version that uses bufio.Reader and manually calls the binary.BigEndian methods to decode your struct:

    type netFlowRow struct {
        Timestamp uint32   // 0
        Srcip     [4]byte  // 4
        Dstip     [4]byte  // 8
        Proto     uint16   // 12
        Srcport   uint16   // 14
        Dstport   uint16   // 16
        Pkt       uint32   // 18
        Size      uint64   // 22
    }
    
    func main() {
        // ...
        file, _ := os.Open(path)
        r := bufio.NewReader(file)
        for j := 0; j < infoRow.Count; j++ {
            var buff [4 + 4 + 4 + 2 + 2 + 2 + 4 + 8]byte
            if _, err := io.ReadFull(r, buff[:]); err != nil {
                panic(err)
            }
            netRow := netFlowRow{
                Timestamp: binary.BigEndian.Uint32(buff[:4]),
                // Srcip
                // Dstip
                Proto: binary.BigEndian.Uint16(buff[12:14]),
                Srcport: binary.BigEndian.Uint16(buff[14:16]),
                Dstport: binary.BigEndian.Uint16(buff[16:18]),
                Pkt: binary.BigEndian.Uint32(buff[18:22]),
                Size: binary.BigEndian.Uint64(buff[22:30]),
            }
            copy(netRow.Srcip[:], buff[4:8])
            copy(netRow.Dstip[:], buff[8:12])
    
            // ...
            fmt.Printf("%v", netRow)
        }
    }
    
    This answer was selected as the best answer by the asker.
