drex88669 2019-07-09 10:18
浏览 100
已采纳

将int数组转换为字节数组,将其压缩然后反转

I have a large int array that I want to persist on the filesystem. My understanding is the best way to store something like this is to use the gob package to convert it to a byte array and then to compress it with gzip. When I need it again, I reverse the process. I am pretty sure I am storing it correctly, however recovering it is failing with EOF. Long story short, I have some example code below that demonstrates the issue. (playground link here https://play.golang.org/p/v4rGGeVkLNh). I am not convinced gob is needed, however reading around it seems that its more efficient to store it as a byte array than an int array, but that may not be true. Thanks!

package main

import (
    "bufio"
    "bytes"
    "compress/gzip"
    "encoding/gob"
    "fmt"
)

func main() {
    arry := []int{1, 2, 3, 4, 5}
    //now gob this
    var indexBuffer bytes.Buffer
    writer := bufio.NewWriter(&indexBuffer)
    encoder := gob.NewEncoder(writer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    //now compress it
    var compressionBuffer bytes.Buffer
    compressor := gzip.NewWriter(&compressionBuffer)
    compressor.Write(indexBuffer.Bytes())
    defer compressor.Close()
    //<--- I think all is good until here

    //now decompress it
    buf := bytes.NewBuffer(compressionBuffer.Bytes())
    fmt.Println("byte array before unzipping: ", buf.Bytes())
    if reader, err := gzip.NewReader(buf); err != nil {
        fmt.Println("gzip failed ", err)
        panic(err)
    } else {
        //now ungob it...
        var intArray []int
        decoder := gob.NewDecoder(reader)
        defer reader.Close()
        if err := decoder.Decode(&intArray); err != nil {
            fmt.Println("gob failed ", err)
            panic(err)
        }
        fmt.Println("final int Array content: ", intArray)
    }
}
  • 写回答

1条回答 默认 最新

  • dongtuhe0506 2019-07-09 10:56
    关注

    You are using bufio.Writer which–as its name implies–buffers bytes written to it. This means if you're using it, you have to flush it to make sure buffered data makes its way to the underlying writer:

    writer := bufio.NewWriter(&indexBuffer)
    encoder := gob.NewEncoder(writer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    if err := writer.Flush(); err != nil {
        panic(err)
    }
    

    Although the use of bufio.Writer is completely unnecessary as you're already writing to an in-memory buffer (bytes.Buffer), so just skip that, and write directly to bytes.Buffer (and so you don't even have to flush):

    var indexBuffer bytes.Buffer
    encoder := gob.NewEncoder(&indexBuffer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    

    The next error is how you close the gzip stream:

    defer compressor.Close()
    

    This deferred closing will only happen when the enclosing function (the main() function) returns, not a second earlier. But by that time you already wanted to read the zipped data, but that might still sit in an internal cache of gzip.Writer, and not in compressionBuffer, so you obviously can't read the compressed data from compressionBuffer. Close the gzip stream without using defer:

    if err := compressor.Close(); err != nil {
        panic(err)
    }
    

    With these changes, you program runs and outputs (try it on the Go Playground):

    byte array before unzipping:  [31 139 8 0 0 0 0 0 0 255 226 249 223 200 196 200 244 191 137 129 145 133 129 129 243 127 19 3 43 19 11 27 7 23 32 0 0 255 255 110 125 126 12 23 0 0 0]
    final int Array content:  [1 2 3 4 5]
    

    As a side note: buf := bytes.NewBuffer(compressionBuffer.Bytes()) – this buf is also completely unnecessary, you can just start decoding compressionBuffer itself, you can read data from it that was previously written to it.

    As you might have noticed, the compressed data is much larger than the initial, compressed data. There are several reasons: both encoding/gob and compress/gzip streams have significant overhead, and they (may) only make input smaller on a larger scale (5 int numbers don't qualify to this).

    Please check related question: Efficient Go serialization of struct to disk

    For small arrays, you may also consider variable-length encoding, see binary.PutVarint().

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 有人懂怎么做大模型的客服系统吗?卡住了卡住了
  • ¥20 firefly-rk3399上启动卡住了
  • ¥15 如何删除这个虚拟音频
  • ¥50 hyper默认的default switch
  • ¥15 网站打不开,提示502 Bad Gateway
  • ¥20 基于MATLAB的绝热压缩空气储能系统代码咨询
  • ¥15 R语言建立随机森林模型出现的问题
  • ¥15 关于#wpf#的问题:怎么更改LayoutGroup组件的标签页的字体颜色
  • ¥15 中级微观经济学,生产可能性边界问题
  • ¥15 TCP传输时不同网卡传输用时差异过大