drex88669 2019-07-09 10:18

Convert an int array to a byte array, compress it, then reverse the process

I have a large int array that I want to persist on the filesystem. My understanding is that the best way to store something like this is to use the gob package to convert it to a byte array and then compress it with gzip. When I need it again, I reverse the process. I am fairly sure I am storing it correctly, but recovering it fails with EOF. The example code below demonstrates the issue (playground link: https://play.golang.org/p/v4rGGeVkLNh). I am not convinced gob is needed; from what I have read, it seems more efficient to store the data as a byte array than as an int array, but that may not be true. Thanks!

package main

import (
    "bufio"
    "bytes"
    "compress/gzip"
    "encoding/gob"
    "fmt"
)

func main() {
    arry := []int{1, 2, 3, 4, 5}
    //now gob this
    var indexBuffer bytes.Buffer
    writer := bufio.NewWriter(&indexBuffer)
    encoder := gob.NewEncoder(writer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    //now compress it
    var compressionBuffer bytes.Buffer
    compressor := gzip.NewWriter(&compressionBuffer)
    compressor.Write(indexBuffer.Bytes())
    defer compressor.Close()
    //<--- I think all is good until here

    //now decompress it
    buf := bytes.NewBuffer(compressionBuffer.Bytes())
    fmt.Println("byte array before unzipping: ", buf.Bytes())
    if reader, err := gzip.NewReader(buf); err != nil {
        fmt.Println("gzip failed ", err)
        panic(err)
    } else {
        //now ungob it...
        var intArray []int
        decoder := gob.NewDecoder(reader)
        defer reader.Close()
        if err := decoder.Decode(&intArray); err != nil {
            fmt.Println("gob failed ", err)
            panic(err)
        }
        fmt.Println("final int Array content: ", intArray)
    }
}

1 answer

  • dongtuhe0506 2019-07-09 10:56

    You are using bufio.Writer, which, as its name implies, buffers the bytes written to it. This means that if you use it, you have to flush it to make sure the buffered data makes its way to the underlying writer:

    writer := bufio.NewWriter(&indexBuffer)
    encoder := gob.NewEncoder(writer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    if err := writer.Flush(); err != nil {
        panic(err)
    }
    

    That said, bufio.Writer is completely unnecessary here, because you are already writing to an in-memory buffer (bytes.Buffer). Skip it and write directly to the bytes.Buffer, so you don't have to flush at all:

    var indexBuffer bytes.Buffer
    encoder := gob.NewEncoder(&indexBuffer)
    if err := encoder.Encode(arry); err != nil {
        panic(err)
    }
    

    The next error is how you close the gzip stream:

    defer compressor.Close()
    

    This deferred call only runs when the enclosing function (main()) returns, not a moment earlier. By then you have already tried to read the compressed data, which may still be sitting in gzip.Writer's internal buffer rather than in compressionBuffer, so you can't read the complete compressed stream from compressionBuffer. Close the gzip stream without using defer:

    if err := compressor.Close(); err != nil {
        panic(err)
    }
    

    With these changes, your program runs and outputs (try it on the Go Playground):

    byte array before unzipping:  [31 139 8 0 0 0 0 0 0 255 226 249 223 200 196 200 244 191 137 129 145 133 129 129 243 127 19 3 43 19 11 27 7 23 32 0 0 255 255 110 125 126 12 23 0 0 0]
    final int Array content:  [1 2 3 4 5]
    

    As a side note: buf := bytes.NewBuffer(compressionBuffer.Bytes()) is also completely unnecessary. You can decode from compressionBuffer itself, because a bytes.Buffer lets you read back the data that was previously written to it.

    As you might have noticed, the compressed data is much larger than the initial, uncompressed data. There are several reasons: both encoding/gob and compress/gzip streams carry significant fixed overhead, and they (may) only make the input smaller at a larger scale (5 int numbers don't qualify).

    Please check related question: Efficient Go serialization of struct to disk

    For small arrays, you may also consider variable-length encoding, see binary.PutVarint().

    This answer was selected as the best answer by the asker.