dream989898 2018-10-11 19:03
Accepted

Compressed output differs between the Go and Ruby implementations

I'm implementing a program that deflates a file into a git blob and stores it appropriately.
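For reference, the storage scheme described in the Git book can be sketched in Go roughly as follows. This is a minimal sketch, not the code from the question: the helper names `blobObject`, `blobID`, `objectPath`, `deflate`, and `inflate` are hypothetical, and the `.git/objects/<first two hex digits>/<remaining digits>` layout is assumed from the Git book's description.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
)

// blobObject builds the uncompressed object: "blob <size>\x00" followed by the content.
func blobObject(content []byte) []byte {
	header := fmt.Sprintf("blob %d\x00", len(content))
	return append([]byte(header), content...)
}

// blobID returns the hex SHA-1 of the uncompressed object (header plus content).
func blobID(content []byte) string {
	sum := sha1.Sum(blobObject(content))
	return hex.EncodeToString(sum[:])
}

// objectPath maps an object ID to its assumed location under .git/objects.
func objectPath(id string) string {
	return ".git/objects/" + id[:2] + "/" + id[2:]
}

// deflate wraps data in a zlib stream using Go's default compression level.
func deflate(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil { // Close flushes the data and writes the checksum
		return nil, err
	}
	return buf.Bytes(), nil
}

// inflate decompresses a zlib stream back to the original bytes.
func inflate(stream []byte) ([]byte, error) {
	r, err := zlib.NewReader(bytes.NewReader(stream))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	content := []byte("what is up, doc?")
	id := blobID(content)
	compressed, err := deflate(blobObject(content))
	if err != nil {
		panic(err)
	}
	fmt.Println("object id:", id)
	fmt.Println("path:", objectPath(id))
	fmt.Println("compressed size:", len(compressed))
}
```

The object ID is computed from the uncompressed bytes, so it does not depend on which deflate implementation produced the stored file.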

I have a Ruby reference implementation that's based on an article from the Git book.

I'm attempting to implement this in Go here.

However, I'm running into an issue where the stored compressed data differs slightly between the two implementations.

vbindiff shows that the first 2 bytes are identical (as run from this test script), if I'm reading this right. Per RFC 1950 (https://tools.ietf.org/html/rfc1950), these bytes are CMF (compression method and info) and FLG (flags), respectively. The difference begins at the third byte, which is either the dictionary ID or the start of the compressed data. The data remains similar until near the end of the file. I'm assuming that last difference is probably the ADLER32 checksum.
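The header and trailer fields mentioned here can be checked programmatically. Below is a minimal sketch based on the field layout in RFC 1950; the type `zlibHeader` and the functions `parseZlibHeader` and `trailerMatches` are hypothetical names, and the sample bytes in `main` are a zlib stream for empty input rather than data from the question. Note that RFC 1950 defines ADLER32 over the uncompressed data, so two streams built from identical input should carry the same trailing four bytes even when the compressed bytes differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/adler32"
)

// zlibHeader holds the fields packed into the two header bytes (CMF, FLG).
type zlibHeader struct {
	Method  uint8 // CM: 8 means deflate
	Info    uint8 // CINFO: log2(window size) minus 8
	HasDict bool  // FDICT: when set, a 4-byte DICTID follows the header
	Level   uint8 // FLEVEL: hint about the compression level used
}

// parseZlibHeader decodes CMF and FLG and validates the FCHECK bits.
func parseZlibHeader(stream []byte) (zlibHeader, error) {
	if len(stream) < 2 {
		return zlibHeader{}, fmt.Errorf("need at least 2 bytes, got %d", len(stream))
	}
	cmf, flg := stream[0], stream[1]
	// RFC 1950: CMF*256 + FLG must be a multiple of 31.
	if (uint16(cmf)<<8|uint16(flg))%31 != 0 {
		return zlibHeader{}, fmt.Errorf("FCHECK failed for header %02x %02x", cmf, flg)
	}
	return zlibHeader{
		Method:  cmf & 0x0f,
		Info:    cmf >> 4,
		HasDict: flg&0x20 != 0,
		Level:   flg >> 6,
	}, nil
}

// trailerMatches reports whether the last four bytes of the stream (big-endian
// Adler-32) equal the checksum of the given uncompressed data.
func trailerMatches(stream, original []byte) (bool, error) {
	if len(stream) < 6 {
		return false, fmt.Errorf("stream too short: %d bytes", len(stream))
	}
	stored := binary.BigEndian.Uint32(stream[len(stream)-4:])
	return stored == adler32.Checksum(original), nil
}

func main() {
	// Sample bytes: a zlib stream for empty input (illustrative only).
	stream := []byte{0x78, 0x9c, 0x03, 0x00, 0x00, 0x00, 0x00, 0x01}
	hdr, err := parseZlibHeader(stream)
	if err != nil {
		panic(err)
	}
	ok, err := trailerMatches(stream, nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("header: %+v\n", hdr)
	fmt.Println("trailer matches Adler-32 of input:", ok)
}
```

If the trailers of the two files really differ, that would suggest the two programs were fed different input rather than that the compressors disagree.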

It seems that neither the Go nor the Ruby implementation passes a dictionary to zlib by default (as per the Go zlib source and the Ruby zlib source).

The data appears identical.

I'm not sure if there's an implementation error in the libraries or if I'm just missing something.

Why are these outputs different?


1 answer

  • doutan3040 2018-10-11 19:56

    The deflate algorithm defined in RFC 1951 (used by the zlib format of RFC 1950 and by gzip, RFC 1952) leaves implementations free to make different encoding choices, so different compressors can produce different output for the same input. All of these outputs still decompress to the same value. This freedom allows a trade-off between compression time and compression ratio, and it also makes programs like zopfli possible, which achieve better compression than the original zlib library at the cost of significantly longer compression time.

    Go uses its own implementation of the deflate algorithm, written in Go, while Ruby uses the zlib library. This is why your examples produce different compressed output for the same input. But if you take the output of either the Go or the Ruby program and decompress it (with Ruby, Go, or any other standard-conforming implementation), you will get exactly the same value back.
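One way to see this using Go alone is to compress the same input at several compression levels and check that every stream decompresses to the original. This is a minimal sketch assuming only the standard `compress/zlib` package; `compressAt` and `decompress` are hypothetical helper names, and the sample input is made up for illustration.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// compressAt produces a zlib stream for data at the given compression level.
func compressAt(level int, data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w, err := zlib.NewWriterLevel(&buf, level)
	if err != nil {
		return nil, fmt.Errorf("create writer: %w", err)
	}
	if _, err := w.Write(data); err != nil {
		return nil, fmt.Errorf("write: %w", err)
	}
	if err := w.Close(); err != nil { // Close flushes the data and writes the checksum
		return nil, fmt.Errorf("close: %w", err)
	}
	return buf.Bytes(), nil
}

// decompress reads a zlib stream back into the original bytes.
func decompress(stream []byte) ([]byte, error) {
	r, err := zlib.NewReader(bytes.NewReader(stream))
	if err != nil {
		return nil, fmt.Errorf("create reader: %w", err)
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	input := bytes.Repeat([]byte("what is up, doc? "), 50)
	levels := []int{zlib.BestSpeed, zlib.DefaultCompression, zlib.BestCompression}
	for _, level := range levels {
		stream, err := compressAt(level, input)
		if err != nil {
			panic(err)
		}
		output, err := decompress(stream)
		if err != nil {
			panic(err)
		}
		// The streams may differ in length and content, but each must round-trip.
		fmt.Printf("level %2d: %d compressed bytes, round-trips: %v\n",
			level, len(stream), bytes.Equal(output, input))
	}
}
```

For a cross-language test, comparing the decompressed bytes (or their hash) is therefore a more reliable check than comparing the compressed files byte for byte.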

    This answer was selected by the asker as the best answer.
