dongzhoulong1797 2017-11-07 11:31
浏览 146
已采纳

在Golang中散列多个值

I'm currently working on an application in go that needs to cache different resources. Different types of resources have handlers that will know what data is relevant to determine, if we have to rebuild a resource or if we can fetch it from cache. For this purpose handlers shall generate a hash of all relevant data for caching. Depending on the context that data can be primitives (int, float, ...), strings, slices, structs and maps. So almost everything. The number of objects used for hashing might also vary.

To calculate that hash in handlers, I've created a hashing function with variadic parameters of type interface{}.

My current approach is this:

func Hash(objs ...interface{})([]byte) {
    // Use MD5 because it's fast and is reasonably enough protected against accidental collisions.
    // There is no scenario here where intentional created collisions could do harm.
    digester := crypto.MD5.New()

    encoder := gob.NewEncoder(digester)
    encoder.Encode(objs) // In real life one would handle that error

    return digester.Sum(make([]byte, 0))
}

This works. But there are a few things that give me headaches with this implementation. Because I'm not sure that gob will always behave deterministic, for the current version that seems to be the case, but as the referenced answer points out, there might be changes between versions. According to the documentation for gob, default values (0 for ints, empty strings, nil, ...) will be omitted when transporting structs. Also all int values will be transmitted as generic numbers. So unit64 and int will be the same. I can't think of an actual issue with this for my usecase, but this smells like a source for trouble.

Now if I would write that function from scratch I would properly play it safe, traverse the structure with reflection and create a hashing tree. But I don't want to do that.

I'm pretty sure I'm not the first one out there with these requirements, but I wasn't able to find any well tested go code on the web, that addresses this issue.

Appendix

Also see: https://crypto.stackexchange.com/questions/10058/how-to-hash-a-list-of-multiple-items

This is not as trivial as it may seem. Simply concatenating data as pointed out by Adrian won't work, because then Hash("12", "3") and Hash("123") will generate the same hash. One possible solution is to prepend the length of data (and the number of elements of lists) before hashing or to create a tree of hashes, but I can't think of a reliable way to do either of that with complex data structures, without writing a lot of reflection code that handles all the different cases (integers, floats, strings, structs, slices, ...) and walks through the entire structure. I want to avoid that, because there are a lot of special cases that can be overseen doing so and it seems unnecessary complicated to me. This is why I tried to solve the problem using serialization, which will take care of most of the problems described above. I'm just not sure if there are not some drawbacks of using gob in this scenario and if there isn't a smarter solution to the problem.

  • 写回答

2条回答 默认 最新

  • doucongmishang2385 2017-11-07 21:42
    关注

    Perhaps to add to add to Adrian's answer. If you add fmt.Fprintf(digester, reflect.TypeOf(ob)) before each object. That would make Hash("12", "3") and Hash("123") different.

    package main

    import (
        "crypto"
        _ "crypto/md5"
        "fmt"
        "reflect"
    )
    
    func main() {
        fmt.Printf("%x
    ", Hash("12", "3"))
        fmt.Printf("%x
    ", Hash("123"))
    }
    
    func Hash(objs ...interface{}) []byte {
        digester := crypto.MD5.New()
        for _, ob := range objs {
            fmt.Fprint(digester, reflect.TypeOf(ob))
            fmt.Fprint(digester, ob)
        }
        return digester.Sum(nil)
    }
    

    Output:

    c3d5dcf1d7540d3e46e7d7b5a8c6e8ae
    787ca7e12a2fa991cea5051a64b49d0c
    

    https://play.golang.org/p/nufD3wTJkb

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 想问一下树莓派接上显示屏后出现如图所示画面,是什么问题导致的
  • ¥100 嵌入式系统基于PIC16F882和热敏电阻的数字温度计
  • ¥15 cmd cl 0x000007b
  • ¥20 BAPI_PR_CHANGE how to add account assignment information for service line
  • ¥500 火焰左右视图、视差(基于双目相机)
  • ¥100 set_link_state
  • ¥15 虚幻5 UE美术毛发渲染
  • ¥15 CVRP 图论 物流运输优化
  • ¥15 Tableau online 嵌入ppt失败
  • ¥100 支付宝网页转账系统不识别账号