drsqpko5286 2014-12-16 12:38

What is the most efficient way to read a zlib-compressed file in Golang?

I'm reading in and at the same time parsing (decoding) a file in a custom format, which is compressed with zlib. My question is how can I efficiently uncompress and then parse the uncompressed content without growing the slice? I would like to parse it whilst reading it into a reusable buffer.

This is for a speed-sensitive application and so I'd like to read it in as efficiently as possible. Normally I would just ioutil.ReadAll and then loop again through the data to parse it. This time I'd like to parse it as it's read, without having to grow the buffer into which it is read, for maximum efficiency.
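For reference, that ReadAll baseline looks roughly like this (a minimal sketch; readAllCompressed is a hypothetical helper name and the parsing loop is omitted):

    package zreader // hypothetical package name for this sketch

    import (
        "compress/zlib"
        "io/ioutil"
        "os"
    )

    // readAllCompressed decompresses the entire file into one slice up front.
    // Simple, but ReadAll keeps growing its result slice until the stream ends -
    // the allocation pattern the question wants to avoid.
    func readAllCompressed(filename string) ([]byte, error) {
        fi, err := os.Open(filename)
        if err != nil {
            return nil, err
        }
        defer fi.Close()

        zr, err := zlib.NewReader(fi)
        if err != nil {
            return nil, err
        }
        defer zr.Close()

        return ioutil.ReadAll(zr)
    }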

Basically I'm thinking that if I can find a buffer of the perfect size then I can read into it, parse it, then overwrite the buffer with the next read, parse that, and so on. The issue here is that the zlib reader appears to read an arbitrary number of bytes each time Read(b) is called; it does not fill the slice. Because of this I don't know what the perfect buffer size would be. I'm concerned that it might break up some of the data that I wrote into two chunks, making it difficult to parse because a single value, say a uint64, could be split across two reads and therefore not appear within the same buffer read - or perhaps that can never happen and the data always comes back in chunks of the same size as originally written?

  1. What is the optimal buffer size, or is there a way to calculate this?
  2. If I have written data into the zlib writer with f.Write(b []byte), is it possible that this same data could be split across two reads when reading back the compressed data (meaning I would have to keep history while parsing), or will it always come back in a single read?
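
To make the concern concrete, here is a minimal sketch of pulling one fixed-size field out of a zlib stream with io.ReadFull, which keeps calling Read until the buffer is completely filled (the 8-byte field and big-endian byte order are illustrative assumptions, not part of the original question):

    package zreader // hypothetical package name for this sketch

    import (
        "encoding/binary"
        "io"
    )

    // readUint64 fills buf[:8] completely before decoding, even if the
    // underlying zlib reader returns fewer than 8 bytes per Read call.
    func readUint64(zr io.Reader, buf []byte) (uint64, error) {
        if _, err := io.ReadFull(zr, buf[:8]); err != nil {
            return 0, err
        }
        return binary.BigEndian.Uint64(buf[:8]), nil // byte order is an assumption
    }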

2 answers

  • douhoujun9304 2014-12-16 17:36

    OK, so I figured this out in the end using my own implementation of a reader.

    Basically the struct looks like this:

    type reader struct {
        at  int           // offset of the next unread byte in buf
        n   int           // number of unread bytes buffered, starting at buf[at]
        f   io.ReadCloser // the underlying zlib reader
        buf []byte        // reusable fixed-size buffer
    }
    

    The zlib reader can then be attached to it:

    // Open file for reading
    fi, err := os.Open(filename)
    if err != nil {
        return nil, err
    }
    defer fi.Close()
    // Attach zlib reader
    r := new(reader)
    r.buf = make([]byte, 2048)
    r.f, err = zlib.NewReader(fi)
    if err != nil {
        return nil, err
    }
    defer r.f.Close()
    

    Then x number of bytes can be read straight out of the zlib reader using a function like this:

    mydata := r.readx(10)
    
    // readx returns the next x bytes from the stream (x must not exceed len(r.buf)).
    func (r *reader) readx(x int) []byte {
        for r.n < x {
            // Not enough buffered: move the unread bytes to the front of buf
            // and top it up from the zlib reader.
            copy(r.buf, r.buf[r.at:r.at+r.n])
            r.at = 0
            m, err := r.f.Read(r.buf[r.n:])
            if err != nil {
                panic(err)
            }
            r.n += m
        }
        tmp := make([]byte, x)
        copy(tmp, r.buf[r.at:r.at+x]) // copy out, since buf is overwritten by later reads
        r.at += x
        r.n -= x
        return tmp
    }
    

    Note that I have no need to check for EOF because my parser should stop itself at the right place.
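
    As a usage sketch, readx composes naturally with encoding/binary for the parsing step. The length-prefixed layout and big-endian order below are assumptions for illustration only, and each requested size must fit within the buffer (2048 bytes here):

    // Assumes "encoding/binary" is imported; the record layout is illustrative only.
    length := binary.BigEndian.Uint64(r.readx(8)) // 8-byte big-endian length prefix
    payload := r.readx(int(length))               // must be <= len(r.buf) (2048 here)
    process(payload)                              // process is a placeholder for the parser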

    This answer was accepted by the asker.