drqvsx1228
drqvsx1228
2014-08-25 23:38

在Go中实现Merkle-tree数据结构

已采纳

I'm currently attempting to implement a merkle-tree data structure in Go. Basically, my end goal is to store a small set of structured data (10MB max) and allow this "database" to be easily synchronised with other nodes distributed over the network (see related ). I've implemented this reasonably effectively in Node as there are no type-checks. Herein lies the problem with Go, I'd like to make use of Go's compile-time type checks, though I also want to have one library which works with any provided tree.

In short, I'd like to use structs as merkle nodes and I'd like to have one Merkle.Update() method which is embedded in all types. I'm trying to avoid writing an Update() for every struct (though I'm aware this might be the only/best way).

My idea was to use embedded types:

//library
type Merkle struct {
    Initialised bool
    Container interface{} //in example this references foo
    Fields []reflect.Type
    //... other merkle state
}
//Merkle methods... Update()... etc...

//userland
type Foo struct {
    Merkle
    A int
    B bool
    C string
    D map[string]*Bazz
    E []*Bar
}

type Bazz struct {
    Merkle
    S int
    T int
    U int
}

type Bar struct {
    Merkle
    X int
    Y int
    Z int
}

In this example, Foo will be the root, which will contain Bazzs and Bars. This relationship could be inferred by reflecting on the types. The problem is the usage:

foo := &Foo{
    A: 42,
    B: true,
    C: "foo",
    D: map[string]*Bazz{
        "b1": &Bazz{},
        "b2": &Bazz{},
    },
    E: []*Bar{
        &Bar{},
        &Bar{},
        &Bar{},
    },
}

merkle.Init(foo)
foo.Hash //Initial hash => abc...

foo.A = 35
foo.E = append(foo.E, &Bar{})

foo.Update()
foo.Hash //Updated hash => def...

I think we need to merkle.Init(foo) since foo.Init() would actually be foo.Merkle.Init() and would not be able to reflect on foo. The uninitialised Bars and Bazzs could be detected and initialised by the parent foo.Update(). Some reflection is acceptable as correctness is more important than performance at the moment. Another problem is, when we Update() a node, all struct fields (child nodes) would need to be Update()d as well (rehashed) since we aren't sure what was changed. We could do foo.SetInt("A", 35) to implement an auto-update, though then we lose compile time type-checks.

Would this be considered idiomatic Go? If not, how could this be improved? Can anyone think of an alternative way to store a dataset in memory (for fast reads) with concise dataset comparison (for efficient delta transfers over the network)? Edit: And also a meta-question: Where is the best place to ask this kind of question, StackOverflow, Reddit or go-nuts? Originally posted on reddit with no answer :(

  • 点赞
  • 写回答
  • 关注问题
  • 收藏
  • 复制链接分享
  • 邀请回答

3条回答

  • dongyuqie4322 dongyuqie4322 7年前

    Some goals seem like:

    • Hash anything -- make it easy to use by hashing lots of things out of the box
    • Cache hashes -- make updates just rehash what they need to
    • Be idiomatic -- fit in well among other Go code

    I think you can attack hashing anything roughly the way that serialization tools like the built-in encoding/gob or encoding/json do, which is three-pronged: use a special method if the type implements it (for JSON that's MarshalJSON), use a type switch for basic types, and fall back to a nasty default case using reflection. Here's an API sketch that provides a helper for hash caching and lets types either implement Hash or not:

    package merkle
    
    type HashVal uint64
    
    const MissingHash HashVal = 0
    
    // Hasher provides a custom hash implementation for a type. Not
    // everything needs to implement it, but doing so can speed
    // updates.
    type Hasher interface {
        Hash() HashVal
    }
    
    // HashCacher is the interface for items that cache a hash value.
    // Normally implemented by embedding HashCache.
    type HashCacher interface {
        CachedHash() *HashVal
    }
    
    // HashCache implements HashCacher; it's meant to be embedded in your
    // structs to make updating hash trees more efficient.
    type HashCache struct {
        h HashVal
    }
    
    // CachedHash implements HashCacher.
    func (h *HashCache) CachedHash() *HashVal {
        return &h.h
    }
    
    // Hash returns something's hash, using a cached hash or Hash() method if
    // available.
    func Hash(i interface{}) HashVal {
        if hashCacher, ok := i.(HashCacher); ok {
            if cached := *hashCacher.CachedHash(); cached != MissingHash {
                return cached
            }
        }
    
        switch i := i.(type) {
        case Hasher:
            return i.Hash()
        case uint64:
            return HashVal(i * 8675309) // or, you know, use a real hash
        case []byte:
            // CRC the bytes, say
            return 0xdeadbeef
        default:
            return 0xdeadbeef
            // terrible slow recursive case using reflection
            // like: iterate fields using reflect, then hash each
        }
    
        // instead of panic()ing here, you could live a little
        // dangerously and declare that changes to unhashable
        // types don't invalidate the tree
        panic("unhashable type passed to Hash()")
    }
    
    // Item is a node in the Merkle tree, which must know how to find its
    // parent Item (the root node should return nil) and should usually
    // embed HashCache for efficient updates. To avoid using reflection,
    // Items might benefit from being Hashers as well.
    type Item interface {
        Parent() Item
        HashCacher
    }
    
    // Update updates the chain of items between i and the root, given the
    // leaf node that may have been changed.
    func Update(i Item) {
        for i != nil {
            cached := i.CachedHash()
            *cached = MissingHash // invalidate
            *cached = Hash(i)
            i = i.Parent()
        }
    }
    
    点赞 评论 复制链接分享
  • doutuichan2681 doutuichan2681 7年前

    Have you seen https://github.com/xsleonard/go-merkle which will allow you to create a binary merkle tree. You could append a type byte to the end of your data to identify it.

    点赞 评论 复制链接分享
  • drcigvoy48900 drcigvoy48900 7年前

    Go doesn't have inheritance in the same way other languages do.

    The "parent" can't modify items in the child, you'd have to implement Update on each struct then do your business in it then have it call the parent's Update.

    func (b *Bar) Update() {
        b.Merkle.Update()
        //do stuff related to b and b.Merkle
        //stuff
    }
    
    func (f *Foo) Update() {
        f.Merkle.Update()
        for _, b := range f.E {
            b.Update()
        }
        //etc
    }
    

    I think you will have to re-implement your tree in a different way.

    Also please provide a testable case the next time.

    点赞 评论 复制链接分享