Convert an int array to a byte array, compress it, and then reverse the process

I have a large int array that I want to persist on the filesystem. My understanding is that the best way to store something like this is to use the gob package to convert it to a byte array and then compress it with gzip. When I need it again, I reverse the process. I am pretty sure I am storing it correctly, but recovering it fails with EOF. Long story short, the example code below demonstrates the issue (playground link: https://play.golang.org/p/v4rGGeVkLNh). I am not convinced gob is needed; from what I've read it seems more efficient to store it as a byte array than as an int array, but that may not be true. Thanks!

package main

import (
    "bufio"
    "bytes"
    "compress/gzip"
    "encoding/gob"
    "fmt"
)

func main() {
    arry := []int{1, 2, 3, 4, 5}
    //now gob this
    var indexBuffer bytes.Buffer
    writer := bufio.NewWriter(&indexBuffer)
    encoder := gob.NewEncoder(writer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    //now compress it
    var compressionBuffer bytes.Buffer
    compressor := gzip.NewWriter(&compressionBuffer)
    compressor.Write(indexBuffer.Bytes())
    defer compressor.Close()
    //<--- I think all is good until here

    //now decompress it
    buf := bytes.NewBuffer(compressionBuffer.Bytes())
    fmt.Println("byte array before unzipping: ", buf.Bytes())
    if reader, err := gzip.NewReader(buf); err != nil {
        fmt.Println("gzip failed ", err)
        panic(err)
    } else {
        //now ungob it...
        var intArray []int
        decoder := gob.NewDecoder(reader)
        defer reader.Close()
        if err := decoder.Decode(&intArray); err != nil {
            fmt.Println("gob failed ", err)
            panic(err)
        }
        fmt.Println("final int Array content: ", intArray)
    }
}
douzao5487: When I try to recover the initial int array after gob -> compress -> decompress -> gob, I get EOF, i.e. it does not come back as the original int array.

doujizhong8352: So is your question whether gob is appropriate or more efficient here, or is there a specific problem with your code?

1 Answer

You are using a bufio.Writer which, as its name implies, buffers the bytes written to it. This means that if you use it, you have to flush it to make sure the buffered data makes its way to the underlying writer:

writer := bufio.NewWriter(&indexBuffer)
encoder := gob.NewEncoder(writer)
if err := encoder.Encode(arry); err != nil {
    panic(err)
}
if err := writer.Flush(); err != nil {
    panic(err)
}

That said, the bufio.Writer is completely unnecessary here: you are already writing to an in-memory buffer (bytes.Buffer), so just skip it and encode directly into the bytes.Buffer (then there is nothing to flush):

var indexBuffer bytes.Buffer
encoder := gob.NewEncoder(&indexBuffer)
if err := encoder.Encode(arry); err != nil {
    panic(err)
}

The next error is how you close the gzip stream:

defer compressor.Close()

This deferred close only runs when the enclosing function (here main()) returns, not a second earlier. But by that time you have already tried to read the zipped data, which may still be sitting in an internal buffer of the gzip.Writer rather than in compressionBuffer, so you obviously can't read the complete compressed data from compressionBuffer. Close the gzip stream without using defer:

if err := compressor.Close(); err != nil {
    panic(err)
}

With these changes, your program runs and outputs (try it on the Go Playground):

byte array before unzipping:  [31 139 8 0 0 0 0 0 0 255 226 249 223 200 196 200 244 191 137 129 145 133 129 129 243 127 19 3 43 19 11 27 7 23 32 0 0 255 255 110 125 126 12 23 0 0 0]
final int Array content:  [1 2 3 4 5]
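
For reference, here is a sketch of the full program with the two fixes above applied (no bufio.Writer, and the gzip stream closed before reading); it is only meant to assemble the changes described above:

package main

import (
    "bytes"
    "compress/gzip"
    "encoding/gob"
    "fmt"
)

func main() {
    arry := []int{1, 2, 3, 4, 5}

    // gob-encode directly into an in-memory buffer; no bufio.Writer needed.
    var indexBuffer bytes.Buffer
    if err := gob.NewEncoder(&indexBuffer).Encode(arry); err != nil {
        panic(err)
    }

    // Compress the gob output; Close() flushes any internally buffered data.
    var compressionBuffer bytes.Buffer
    compressor := gzip.NewWriter(&compressionBuffer)
    if _, err := compressor.Write(indexBuffer.Bytes()); err != nil {
        panic(err)
    }
    if err := compressor.Close(); err != nil {
        panic(err)
    }

    // Decompress and gob-decode.
    buf := bytes.NewBuffer(compressionBuffer.Bytes())
    fmt.Println("byte array before unzipping: ", buf.Bytes())
    reader, err := gzip.NewReader(buf)
    if err != nil {
        panic(err)
    }
    defer reader.Close()

    var intArray []int
    if err := gob.NewDecoder(reader).Decode(&intArray); err != nil {
        panic(err)
    }
    fmt.Println("final int Array content: ", intArray)
}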

As a side note, buf := bytes.NewBuffer(compressionBuffer.Bytes()) is also completely unnecessary: you can decode from compressionBuffer itself, since you can read back the data that was previously written to it; see the sketch below.
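
A minimal sketch of that simplification, decoding straight from compressionBuffer (this fragment assumes the encode and compress steps from the program above have already run):

// Decompress and decode directly from compressionBuffer; no intermediate copy.
reader, err := gzip.NewReader(&compressionBuffer)
if err != nil {
    panic(err)
}
defer reader.Close()

var intArray []int
if err := gob.NewDecoder(reader).Decode(&intArray); err != nil {
    panic(err)
}
fmt.Println("final int Array content: ", intArray)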

As you might have noticed, the compressed data is much larger than the initial, uncompressed data. There are several reasons for this: both encoding/gob and compress/gzip streams have significant overhead, and they (may) only make the input smaller at larger scales (5 int values don't qualify).

See this related question: Efficient Go serialization of struct to disk

For small arrays, you may also consider variable-length encoding, see binary.PutVarint().
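
A minimal, self-contained sketch of that approach, using int64 values since varints are defined on fixed-size integers; the element count isn't stored here because decoding simply reads until the buffer is exhausted:

package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
)

func main() {
    values := []int64{1, 2, 3, 4, 5}

    // Encode: each value takes 1 to 10 bytes depending on its magnitude.
    encoded := make([]byte, 0, len(values)*binary.MaxVarintLen64)
    tmp := make([]byte, binary.MaxVarintLen64)
    for _, v := range values {
        n := binary.PutVarint(tmp, v)
        encoded = append(encoded, tmp[:n]...)
    }
    fmt.Println("varint-encoded bytes:", encoded)

    // Decode: read varints until the buffer is exhausted.
    r := bytes.NewReader(encoded)
    var decoded []int64
    for r.Len() > 0 {
        v, err := binary.ReadVarint(r)
        if err != nil {
            panic(err)
        }
        decoded = append(decoded, v)
    }
    fmt.Println("decoded values:", decoded)
}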

dqmdlo9674: That's the ticket. Save the overhead by not using gob and just persist the length. Thanks for your help.

dsjuimtq920056: The length of a []int64 value is not fixed, so you have to store that length yourself. Gob takes care of that for you.

drfm55597: Hmm, maybe I need to know the size of the array up front? Does gob handle that, and encoding/binary doesn't? -- Edit -- yes, that seems to be it. Hmm. Does that mean I need to persist the length of the original []int64, or can it be worked out on the fly? play.golang.org/p/cayXw7fCcK2

doumei1908: Cool, so dropping gob sounds like a good idea. I think I'm close; it's converting back to []int64 with encoding/binary after decompressing that's failing for me - after reading into the []int64 I get an empty array. Any chance you could take a look? play.golang.org/p/NRimlw4Udss

dongrouyuan5685: As for when it becomes "worth it": measure. It depends on the input; some inputs compress better than others. Gob is not required, it really just adds computation and space overhead. If you're compressing anyway, just convert the integers to bytes with encoding/binary (you can write them straight into the gzip stream). Also note that you shouldn't use int, since its size is architecture dependent; use a fixed-size integer such as int32 or int64 instead.

drci47425: Thanks so much. Reading up on flushing now, but I'll drop the writer from the process. Edit: is there a way to know when compression becomes worthwhile? And is gob needed to save a []int that isn't worth compressing?