drsqpko5286
2014-12-16 12:38
Views: 331
Accepted

What is the most efficient way to read zlib-compressed files in Golang?

I'm reading and simultaneously parsing (decoding) a file in a custom format that is compressed with zlib. My question is how I can efficiently decompress and then parse the uncompressed content without growing the slice. I would like to parse it whilst reading it into a reusable buffer.

This is for a speed-sensitive application and so I'd like to read it in as efficiently as possible. Normally I would just ioutil.ReadAll and then loop again through the data to parse it. This time I'd like to parse it as it's read, without having to grow the buffer into which it is read, for maximum efficiency.
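
That baseline looks roughly like this (a sketch for context, assuming imports of os, compress/zlib and io/ioutil; parseFile and parse are placeholder names):

    func parseFile(filename string) error {
        fi, err := os.Open(filename)
        if err != nil {
            return err
        }
        defer fi.Close()
        zr, err := zlib.NewReader(fi)
        if err != nil {
            return err
        }
        defer zr.Close()
        data, err := ioutil.ReadAll(zr) // grows a slice to hold the whole payload
        if err != nil {
            return err
        }
        parse(data) // second pass over all of the decompressed data
        return nil
    }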

Basically I'm thinking that if I can find a buffer of the perfect size, then I can read into it, parse it, overwrite the buffer with the next read, parse that, and so on. The issue is that the zlib reader appears to return an arbitrary number of bytes each time Read(b) is called; it does not fill the slice. Because of this I don't know what the perfect buffer size would be. I'm concerned that it might break some of the data I wrote into two chunks, making it difficult to parse, because a single uint64 could be split across two reads and therefore not appear in the same buffer read - or perhaps that can never happen, and data always comes back in chunks of the same size as it was originally written? (A sketch of the read loop I mean follows the questions below.)

  1. What is the optimal buffer size, or is there a way to calculate this?
  2. If I have written data into the zlib writer with f.Write(b []byte), is it possible that this same data could be split across two reads when reading the compressed data back (meaning I would have to keep a history during parsing), or will it always come back in the same read?
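
To make question 2 concrete, this is the kind of read loop I mean (a sketch; zr is a zlib reader as above and parseChunk is a placeholder):

    buf := make([]byte, 2048) // some hopefully "perfect" size
    for {
        n, err := zr.Read(buf) // n can be anywhere from 0 to len(buf)
        if n > 0 {
            parseChunk(buf[:n]) // may end mid-value, e.g. halfway through a uint64
        }
        if err != nil {
            break // io.EOF or a real error
        }
    }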


2 Answers

  • douhoujun9304 2014-12-16 17:36
    Accepted

    OK, so I figured this out in the end using my own implementation of a reader.

    Basically the struct looks like this:

    import "io"

    type reader struct {
        at  int           // offset of the first unread byte in buf
        n   int           // number of unread bytes currently in buf
        f   io.ReadCloser // the underlying zlib reader
        buf []byte        // fixed-size reusable buffer
    }
    

    This can be attached to the zlib reader:

    // Open the compressed file for reading ("os" and "compress/zlib" imports assumed)
    fi, err := os.Open(filename)
    if err != nil {
        return nil, err
    }
    defer fi.Close()
    // Attach the zlib reader and allocate the reusable buffer
    r := new(reader)
    r.buf = make([]byte, 2048)
    r.f, err = zlib.NewReader(fi)
    if err != nil {
        return nil, err
    }
    // Note: the deferred Closes mean all parsing must happen within this function
    defer r.f.Close()
    

    Then x bytes at a time can be read straight out of the zlib reader using a method like this:

    mydata := r.readx(10)

    // readx returns the next x bytes of the decompressed stream; x must not exceed len(r.buf).
    func (r *reader) readx(x int) []byte {
        for r.n < x {
            // Shift the unread bytes to the front of the buffer, then refill from the zlib reader.
            copy(r.buf, r.buf[r.at:r.at+r.n])
            r.at = 0
            m, err := r.f.Read(r.buf[r.n:])
            if err != nil {
                panic(err)
            }
            r.n += m
        }
        tmp := make([]byte, x)
        copy(tmp, r.buf[r.at:r.at+x]) // copy out so the result doesn't alias the reusable buffer
        r.at += x
        r.n -= x
        return tmp
    }
    

    Note that I have no need to check for EOF, because my parser should stop itself at the right place.
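
    For example, fixed-width values can then be decoded directly from the returned slice (a sketch assuming big-endian encoding and an encoding/binary import; the field names are illustrative):

    id := binary.BigEndian.Uint64(r.readx(8)) // one 8-byte value, never split across reads
    name := r.readx(16)                       // a 16-byte field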

  • doumei8126 2014-12-16 14:05

    You can wrap your zlib reader in a bufio reader, then implement a specialized reader on top that will rebuild your chunks of data by reading from the bufio reader until a full chunk is read. Be aware that bufio.Read calls Read at most once on the underlying Reader, so you need to call ReadByte in a loop. bufio will however take care of the unpredictable size of data returned by the zlib reader for you.

    If you do not want to implement a specialized reader, you can just go with a bufio reader and read as many bytes as needed with ReadByte() to fill a given data type. The optimal buffer size is at least the size of your largest data structure, up to whatever you can shove into memory.
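
    For example (a sketch; zr is the zlib reader, and a big-endian on-disk layout is an assumption):

    br := bufio.NewReader(zr) // smooths out the zlib reader's erratic read sizes
    var v uint64
    if err := binary.Read(br, binary.BigEndian, &v); err != nil {
        panic(err) // fills all 8 bytes or fails (io.ErrUnexpectedEOF on a short read)
    }

    binary.Read keeps reading until the value is filled, so it already does the ReadByte-style looping for you.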

    If you read directly from the zlib reader, there is no guarantee that your data won't be split between two reads.

    Another, maybe cleaner, solution is to implement a writer for your data, then use io.Copy(your_writer, zlib_reader).
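
    For example (a sketch; chunkParser and the fixed 8-byte record size are illustrative assumptions):

    // chunkParser reassembles records from whatever slice sizes io.Copy delivers.
    type chunkParser struct {
        pending []byte // partial record carried over between Write calls
    }

    func (p *chunkParser) Write(b []byte) (int, error) {
        p.pending = append(p.pending, b...)
        for len(p.pending) >= 8 { // consume every complete 8-byte record
            handleRecord(binary.BigEndian.Uint64(p.pending[:8])) // handleRecord is a placeholder
            p.pending = p.pending[8:]
        }
        return len(b), nil
    }

    // Usage: _, err := io.Copy(&chunkParser{}, zlib_reader)

    io.Copy then drives the whole decompress-and-parse loop with a single reusable internal buffer.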

