drsqpko5286 2014-12-16 12:38

What is the most efficient way to read a zlib-compressed file in Golang?

I'm reading in and at the same time parsing (decoding) a file in a custom format, which is compressed with zlib. My question is how can I efficiently uncompress and then parse the uncompressed content without growing the slice? I would like to parse it whilst reading it into a reusable buffer.

This is for a speed-sensitive application, so I'd like to read the file in as efficiently as possible. Normally I would just call ioutil.ReadAll and then loop through the data a second time to parse it. This time I'd like to parse the data as it is read, without having to grow the buffer it is read into.

Basically, I'm thinking that if I can find a buffer of the perfect size, I can read into it, parse it, then overwrite the buffer with the next read, parse that, and so on. The issue is that the zlib reader appears to read an arbitrary number of bytes each time Read(b) is called; it does not fill the slice. Because of this I don't know what the perfect buffer size would be. I'm concerned that it might break some of the data I wrote into two chunks, making it difficult to parse: a uint64, say, could be split across two reads and therefore not appear whole in a single buffer. Or perhaps that can never happen and the data is always read out in chunks of the same size as were originally written?

  1. What is the optimal buffer size, or is there a way to calculate this?
  2. If I have written data into the zlib writer with f.Write(b []byte) is it possible that this same data could be split into two reads when reading back the compressed data (meaning I will have to have a history during parsing), or will it always come back in the same read?

2 answers

  • douhoujun9304 2014-12-16 17:36

    OK, so I figured this out in the end using my own implementation of a reader.

    Basically the struct looks like this:

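    type reader struct {
        at  int           // offset of the first unread byte in buf
        n   int           // number of unread bytes in buf
        f   io.ReadCloser // the zlib reader
        buf []byte        // fixed-size buffer, reused for every read
    }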
    

    This can be attached to the zlib reader:

    // Open file for reading
    fi, err := os.Open(filename)
    if err != nil {
        return nil, err
    }
    defer fi.Close()
    // Attach zlib reader
    r := new(reader)
    r.buf = make([]byte, 2048) // must be at least as large as the biggest readx request
    r.f, err = zlib.NewReader(fi)
    if err != nil {
        return nil, err
    }
    defer r.f.Close()
    

    Then x number of bytes can be read straight out of the zlib reader using a function like this:

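    mydata := r.readx(10)
    
    // readx returns the next x bytes of uncompressed data.
    // It panics on a read error, or if x is larger than the buffer.
    func (r *reader) readx(x int) []byte {
        if x > len(r.buf) {
            panic("readx: request larger than buffer")
        }
        for r.n < x {
            // Move the unread bytes to the front to make room for more.
            copy(r.buf, r.buf[r.at:r.at+r.n])
            r.at = 0
            m, err := r.f.Read(r.buf[r.n:])
            r.n += m // Read may return data together with an error such as io.EOF
            if err != nil && r.n < x {
                panic(err)
            }
        }
        tmp := make([]byte, x)
        copy(tmp, r.buf[r.at:r.at+x]) // copy out: r.buf is overwritten by later reads
        r.at += x
        r.n -= x
        return tmp
    }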
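
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. It is not my production code: the helper names `packUint64s` and `unpackUint64s` are made up for the demo, the data is compressed in memory rather than read from a file, and `readx` returns an error instead of panicking. The buffer is deliberately tiny (12 bytes) so that 8-byte values regularly straddle two calls to Read.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/binary"
	"fmt"
	"io"
)

// reader keeps a fixed buffer over any io.Reader (here, a zlib reader).
type reader struct {
	at  int       // offset of the first unread byte in buf
	n   int       // number of unread bytes in buf
	f   io.Reader // underlying stream
	buf []byte    // fixed-size buffer, never grown
}

// readx returns a copy of the next x bytes, refilling buf as needed.
func (r *reader) readx(x int) ([]byte, error) {
	if x > len(r.buf) {
		return nil, fmt.Errorf("readx: want %d bytes, buffer holds %d", x, len(r.buf))
	}
	for r.n < x {
		copy(r.buf, r.buf[r.at:r.at+r.n]) // slide unread bytes to the front
		r.at = 0
		m, err := r.f.Read(r.buf[r.n:])
		r.n += m
		if err != nil && r.n < x {
			return nil, err
		}
	}
	tmp := make([]byte, x)
	copy(tmp, r.buf[r.at:r.at+x])
	r.at += x
	r.n -= x
	return tmp, nil
}

// packUint64s (hypothetical helper) zlib-compresses vals, one Write per value.
func packUint64s(vals []uint64) ([]byte, error) {
	var out bytes.Buffer
	zw := zlib.NewWriter(&out)
	var b [8]byte
	for _, v := range vals {
		binary.LittleEndian.PutUint64(b[:], v)
		if _, err := zw.Write(b[:]); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// unpackUint64s (hypothetical helper) decodes count values through a
// reader whose buffer is only bufSize bytes long.
func unpackUint64s(packed []byte, count, bufSize int) ([]uint64, error) {
	zr, err := zlib.NewReader(bytes.NewReader(packed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	r := &reader{f: zr, buf: make([]byte, bufSize)}
	vals := make([]uint64, 0, count)
	for i := 0; i < count; i++ {
		b, err := r.readx(8)
		if err != nil {
			return nil, err
		}
		vals = append(vals, binary.LittleEndian.Uint64(b))
	}
	return vals, nil
}

func main() {
	packed, err := packUint64s([]uint64{1, 2, 3})
	if err != nil {
		panic(err)
	}
	vals, err := unpackUint64s(packed, 3, 12)
	if err != nil {
		panic(err)
	}
	fmt.Println(vals)
}
```

The point of the small buffer is that the parser never sees a partial value: `readx` keeps calling Read until it has all 8 bytes, however the zlib reader chooses to chunk its output.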
    

    Note that I have no need to check for EOF because my parser should stop itself at the right place.

    This answer was selected as the best answer by the asker.