duanchangnie7996 2019-05-23 06:25
Views: 91
Accepted

How to perform Git-compatible hex SHA packing in Go

I am going through the book Building Git by James Coglan, where James walks you through implementing a basic version of Git in Ruby. I decided to make things more complicated for myself by doing my implementation in Go.

I've gotten to the part where I need to store compressed hashes of file contents into a tree to write to disk, but I am having trouble doing this kind of hex compression/packing that Git is looking for.

Here is the Ruby code I'm working from:

ENTRY_FORMAT = "A7Z*H40"
MODE = "100644"
FILE_NAME = "tree.rb"
SHA = "baae99010b237a699ff0aba02fd5310c18903b1b"
[MODE, FILE_NAME, SHA].pack(ENTRY_FORMAT)

According to the book, the Ruby pack method works like this:

The Array#pack method takes an array of various kinds of values and returns a string that represents those values. Exactly how each value gets represented in the string is determined by the format string we pass to pack.

I think I have the encoding for MODE and FILE_NAME figured out. It's the last part, which encodes the SHA, that I am struggling with.

• H40: this encodes a string of forty hexadecimal digits, entry.oid, by packing each pair of digits into a single byte

It's the "packing each pair of digits into a single byte" part that I can't get my head around. This is my current attempt:

mode := 100644
fileName := "tree.go"
sha := "baae99010b237a699ff0aba02fd5310c18903b1b"
// slice of strings for constructing the packed sha
var eid []string

// iterate through each character in id
for i := 0; i < len(sha); i += 2 {
    // gathering them in pairs of two
    one, two := sha[i], sha[i+1]
    // compress two digits into one byte
    // using bitwise or?? addition?? bit shifting?? not sure.
    eid = append(eid, string(one|two))
}
// concat the new packed id with the mode and file name.
stringRep := fmt.Sprintf("%-7d", mode) + fileName + "\x00" + strings.Join(eid, "")

Go playground for above code

For some reason I can't figure out, the string representation of a tree entry that this code produces isn't compatible with how Git stores trees on disk. I've tried shifting the bits before OR-ing them, and I've tried simply adding the bytes together, but nothing works. I basically need to replicate the behavior of Ruby's Array#pack method in a way that Git will accept.

Any guidance or advice is greatly appreciated. I'd be happy to explain more or post more code samples if necessary. Thank you so much for your time!

P.S. Here is more context on the packing Git performs, from Building Git:

Git is storing the ID of each entry in a packed format, using twenty bytes for each one. Each hexadecimal digit represents a number from zero to fifteen, where ten is represented by a, eleven by b, and so on up to f for fifteen. In a forty-digit object ID, each digit stands for four bits of a 160-bit number. Instead of splitting those bits into forty chunks of four bits each, we can split it into twenty blocks of eight bits—and eight bits is one byte. So all that’s happening here is that the 160-bit object ID is being stored in binary as twenty bytes, rather than as forty characters standing for hexadecimal digits.

1 answer

  • doulachan8217 2019-05-23 07:51

    The functions to convert between binary data and hexadecimal strings can be found in the encoding/hex package.

    For example, the function that turns a hex string into a slice of bytes (where each byte holds two digits of the original hex string) is hex.DecodeString, or hex.Decode if your input is a []byte instead of a string.
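As a minimal sketch of how hex.DecodeString could slot into the tree entry from the question: the helper name packEntry is hypothetical, and the layout (mode padded to seven bytes, file name, NUL byte, twenty raw ID bytes) is taken from the question's "A7Z*H40" format string rather than from Git's documentation.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// packEntry is a hypothetical helper that mirrors the Ruby format "A7Z*H40":
// the mode left-aligned and space-padded to 7 bytes, the file name,
// a NUL terminator, then the object ID as 20 raw bytes.
func packEntry(mode, name, sha string) ([]byte, error) {
	// Each pair of hex digits becomes one byte.
	oid, err := hex.DecodeString(sha)
	if err != nil {
		return nil, err
	}
	entry := []byte(fmt.Sprintf("%-7s%s\x00", mode, name))
	// Stay in []byte so the raw ID bytes are appended unchanged.
	return append(entry, oid...), nil
}

func main() {
	entry, err := packEntry("100644", "tree.go", "baae99010b237a699ff0aba02fd5310c18903b1b")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes: %q\n", len(entry), entry)
}
```

One likely reason to keep the result as a []byte rather than joining strings: in Go, converting a single byte value to a string with string(b) produces the UTF-8 encoding of that code point, so any value of 0x80 or above turns into two bytes instead of one.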


    If you want to re-implement this function yourself:

    • each character of the input string should be converted to its numerical value (0 to 15),
    • each pair of values should be combined as a two-digit number in base 16: var newByte byte = 16*one + two
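The two steps above can be sketched by hand as follows. The names hexVal and packHex are hypothetical, and this is only an illustration of the idea, not a replacement for encoding/hex. For the first pair in the question's ID, "ba", the digits are 11 and 10, so the byte is 16*11 + 10 = 186 (0xba).

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// hexVal converts one ASCII hex digit to its numerical value (0 to 15).
// The second result is false if c is not a hex digit.
func hexVal(c byte) (byte, bool) {
	switch {
	case c >= '0' && c <= '9':
		return c - '0', true
	case c >= 'a' && c <= 'f':
		return c - 'a' + 10, true
	case c >= 'A' && c <= 'F':
		return c - 'A' + 10, true
	}
	return 0, false
}

// packHex (hypothetical name) packs each pair of hex digits into one byte.
func packHex(s string) ([]byte, error) {
	if len(s)%2 != 0 {
		return nil, fmt.Errorf("odd-length hex string: %d digits", len(s))
	}
	out := make([]byte, 0, len(s)/2)
	for i := 0; i < len(s); i += 2 {
		hi, okHi := hexVal(s[i])
		lo, okLo := hexVal(s[i+1])
		if !okHi || !okLo {
			return nil, fmt.Errorf("invalid hex digit in pair at offset %d", i)
		}
		// hi<<4 | lo is the same as 16*hi + lo.
		out = append(out, hi<<4|lo)
	}
	return out, nil
}

func main() {
	packed, err := packHex("baae99010b237a699ff0aba02fd5310c18903b1b")
	if err != nil {
		panic(err)
	}
	// Round-trip through the standard library as a sanity check.
	fmt.Println(len(packed), hex.EncodeToString(packed))
}
```

The attempt in the question ORs the two ASCII character codes directly ('b' is 98, 'a' is 97), which skips the first step: the characters have to be mapped to 11 and 10 before they are combined.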
    This answer was selected by the asker as the best answer.