Decompress gzip from a byte array in Golang

I have a bunch of files that come from some web requests, and some of them are gzipped. I need to unpack them and print them as a string. This is my first time using Go; I tried some examples I found online but can't get them working.

Here's the last test I was trying:

package main

import (
    "bytes"
    "compress/gzip"
    "fmt"
    "io/ioutil"
)

func main() {
    content := []byte{72,84,84,80,47,49,46,49,32,50,48,48,32,79,75,13,10,84,114,97,110,115,102,101,114,45,69,110,99,111,100,105,110,103,58,32,99,104,117,110,107,101,100,13,10,67,111,110,110,101,99,116,105,111,110,58,32,75,101,101,112,45,65,108,105,118,101,13,10,67,111,110,116,101,110,116,45,69,110,99,111,100,105,110,103,58,32,103,122,105,112,13,10,67,111,110,116,101,110,116,45,84,121,112,101,58,32,116,101,120,116,47,104,116,109,108,13,10,68,97,116,101,58,32,83,117,110,44,32,49,52,32,65,112,114,32,50,48,49,57,32,48,53,58,49,50,58,50,51,32,71,77,84,13,10,75,101,101,112,45,65,108,105,118,101,58,32,116,105,109,101,111,117,116,61,53,44,32,109,97,120,61,49,48,48,13,10,83,101,114,118,101,114,58,32,65,112,97,99,104,101,47,50,46,52,46,49,48,32,40,68,101,98,105,97,110,41,13,10,86,97,114,121,58,32,65,99,99,101,112,116,45,69,110,99,111,100,105,110,103,13,10,13,10,54,98,100,13,10,31,139,8,0,0,0,0,0,0,3,236,89,235,111,218,72,16,255,156,254,21,83,127,1,36,108,243,200,163,215,0,82,66,104,19,53,205,85,133,94,85,157,78,104,177,7,188,141,189,246,237,174,33,180,234,255,126,179,182,73,8,143,132,180,167,83,117,106,68,204,122,119,231,249,155,89,123,134,103,123,173,231,103,191,119,7,159,222,245,32,208,81,216,121,214,202,191,0,90,1,50,223,12,104,168,185,14,177,211,239,95,2,116,121,18,160,132,126,202,53,42,250,74,146,88,106,244,225,116,14,159,226,84,194,169,140,103,10,101,203,205,105,114,250,231,182,221,138,80,51,16,44,194,182,53,229,56,51,100,22,120,177,208,40,116,219,154,113,95,7,109,31,167,220,67,59,187,169,2,23,92,115,22,218,202,99,33,182,235,78,205,234,216,118,193,241,135,185,85,33,98,55,60,74,163,197,132,117,167,43,156,198,177,86,90,178,4,110,5,134,92,92,67,32,113,220,182,60,165,220,209,98,135,19,113,225,208,140,5,18,195,182,165,244,60,68,21,32,146,58,17,250,156,209,148,39,17,133,181,204,39,219,202,73,93,171,96,25,104,157,168,151,174,235,121,142,239,41,244,156,84,112,59,96,66,196,83,148,142,143,46,143,38,238,152,77,29,34,178,64,207,19,178,155,71,108,130,238,141,157,241,201,16,115,23,14
4,181,70,177,63,47,4,250,124,10,220,39,159,144,178,214,210,156,23,50,165,200,24,242,25,227,2,101,177,182,186,158,144,16,219,240,189,183,99,111,121,11,33,190,180,68,139,247,86,85,194,68,157,160,107,5,245,44,130,150,3,8,206,8,70,30,42,136,199,43,209,67,155,91,46,113,121,152,111,131,216,178,220,133,165,133,11,103,179,153,179,193,131,165,78,139,124,8,74,122,237,146,113,230,89,183,143,158,125,25,79,226,161,23,205,175,157,68,76,104,139,203,54,136,93,153,89,189,189,211,169,68,218,151,54,59,169,148,187,161,212,25,4,92,193,12,71,42,203,159,9,159,210,117,30,167,20,158,227,88,70,76,243,88,0,125,116,128,96,188,229,229,222,82,121,186,205,141,143,70,185,143,104,46,203,61,5,68,8,100,113,42,185,152,192,249,96,240,174,111,50,65,160,103,152,41,103,69,223,117,173,72,229,189,189,91,115,216,74,72,234,25,215,154,92,232,197,145,171,2,38,209,90,32,80,172,216,217,172,61,74,181,54,1,237,51,205,236,84,134,59,4,181,23,160,119,93,80,104,188,161,196,237,154,25,136,83,189,226,128,45,166,59,5,177,226,95,40,33,66,38,39,88,204,4,76,5,154,77,140,150,161,202,121,88,157,193,140,242,210,96,76,246,82,48,81,98,242,68,119,158,143,83,145,185,170,236,87,85,149,251,149,175,83,38,225,179,170,142,63,171,182,239,76,80,247,66,140,232,92,81,167,243,1,155,92,209,161,83,86,149,63,107,127,29,243,113,249,249,242,134,211,249,133,95,38,6,149,175,25,37,229,61,211,88,172,17,201,241,103,229,80,38,114,223,12,76,32,90,174,155,132,76,27,224,157,101,55,211,137,69,76,149,243,89,89,199,164,132,147,144,127,133,190,138,125,116,184,32,251,245,41,18,13,150,115,29,43,199,223,190,149,253,216,75,141,152,170,149,91,101,85,111,225,153,17,155,202,113,203,45,204,93,10,132,173,113,221,26,73,183,211,74,22,81,18,82,250,151,204,33,82,34,103,14,77,164,150,58,219,158,3,163,251,207,1,40,199,146,142,142,124,33,161,192,162,177,240,176,242,178,229,38,157,181,212,217,16,154,45,58,101,99,49,233,244,19,244,200,134,252,38,87,117,117,247,254,221,238,123,135,140,65,236,17,210,37,65,111,112,14,125,138,167,71,40,14,239,40,206,48,119,45,133,208,42,209,26,233
,54,51,203,94,173,218,24,85,182,26,214,235,158,157,247,108,186,246,79,236,147,94,191,222,120,97,191,238,190,181,251,231,39,141,131,195,173,54,209,54,56,229,122,171,5,198,86,188,241,40,39,39,248,18,214,142,82,20,206,140,95,243,196,60,199,156,88,78,92,115,231,246,194,208,24,235,13,233,196,153,226,240,140,143,199,28,237,115,12,195,136,137,18,104,147,133,186,93,26,142,66,38,174,115,205,77,206,85,129,128,151,243,204,77,59,203,58,241,167,140,194,197,31,246,110,105,135,125,205,132,207,164,191,46,138,28,147,75,122,123,210,221,89,4,185,208,38,31,174,115,91,248,150,117,156,239,66,243,232,49,52,223,223,97,249,11,199,159,22,199,102,109,71,28,73,200,34,39,155,47,246,183,98,73,219,126,97,185,3,150,228,195,141,88,102,190,253,78,44,235,251,79,192,146,100,253,2,241,7,65,172,111,132,176,254,125,0,214,106,213,223,252,237,0,254,4,105,72,42,12,203,44,156,196,146,235,32,170,172,27,79,27,126,118,200,254,245,188,35,216,154,7,59,193,246,159,103,220,255,1,175,157,83,108,219,59,253,56,149,84,108,201,226,189,254,85,126,183,92,143,238,246,174,222,44,117,62,208,11,191,125,50,161,50,228,229,102,156,94,148,58,111,227,47,60,12,153,123,224,212,160,252,145,11,159,202,4,184,26,64,189,230,212,142,129,38,14,247,143,225,230,112,191,2,39,73,18,226,71,28,189,225,218,61,104,30,57,205,67,40,191,57,31,188,189,172,66,200,175,17,94,83,169,24,87,160,27,200,56,66,247,168,233,212,104,203,139,166,83,175,53,161,207,198,76,242,130,236,73,17,75,102,188,203,202,20,83,179,80,1,234,14,46,251,64,245,170,202,161,223,102,22,237,154,214,157,198,83,69,237,138,51,210,3,65,14,77,33,51,188,16,62,247,50,92,54,0,127,117,97,112,191,7,222,118,157,183,213,229,79,53,194,148,233,74,51,42,220,233,193,101,42,85,208,60,194,135,156,69,181,189,113,107,78,20,143,239,23,247,62,247,65,196,26,20,10,31,88,198,11,166,44,76,113,67,56,155,81,214,81,185,107,120,64,170,168,32,45,240,0,170,129,3,216,242,186,13,44,19,80,188,92,195,53,157,32,166,149,114,151,234,206,35,153,35,25,57,228,61,155,237,150,32,141,82,231,143,71,227,168,233,52,225,73,254,111,44,
74,113,245,16,210,181,198,168,74,151,35,186,52,107,116,169,239,211,163,148,142,229,39,203,234,221,104,20,198,134,135,196,213,232,175,74,255,7,230,194,204,101,100,46,126,117,60,174,213,105,84,111,60,89,238,123,140,98,42,229,7,255,81,96,61,91,52,71,158,45,53,249,76,23,53,73,85,96,45,247,8,139,225,222,237,142,113,76,138,222,245,74,31,237,181,46,226,202,138,82,211,63,241,232,220,225,218,202,163,154,180,54,45,108,160,97,34,227,41,247,243,62,138,233,141,173,245,233,182,244,61,173,181,67,226,172,219,239,117,179,99,66,162,66,38,189,0,38,50,78,19,96,122,51,215,213,182,221,58,203,75,228,35,193,191,192,7,193,179,163,82,207,225,188,32,200,158,67,134,111,65,99,229,52,171,141,239,45,218,187,60,74,72,75,149,70,142,249,93,194,234,92,208,61,23,89,19,47,239,82,193,69,134,50,4,108,138,148,208,115,58,9,162,172,89,7,148,202,127,167,168,178,38,40,208,163,132,41,204,126,33,96,222,138,157,15,136,31,179,32,44,36,247,153,162,183,11,120,69,51,121,155,56,185,133,120,61,26,104,144,183,224,41,142,242,31,84,246,254,1,0,0,255,255,3,0,8,20,73,242,107,25,0,0,13,10,48,13,10,13,10}
    buf := bytes.NewBuffer(content)
    reader, err := gzip.NewReader(buf)
    if err != nil {
        panic(err)
    }
    defer reader.Close()


    s, err := ioutil.ReadAll(reader)
    if err != nil {
        panic(err)
    }

    fmt.Println("decompressed:\t", string(s))
}

But it fails with the error panic: gzip: invalid header, the same as a few other examples I tried.

How can I ungzip the byte array content?

1 answer




content := []byte{72,84,84,80,47,49,46,49,32,50,48,48,32,79,75, ...

This is not gzip data at all. Proper gzip data starts with the magic sequence 0x1f 0x8b, i.e. []byte{31,139,...}. So gzip.NewReader rightly complains about gzip: invalid header.
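As a quick sanity check before calling gzip.NewReader, the magic sequence can be tested by hand. This is only a minimal sketch; looksLikeGzip is a hypothetical helper name, not something from the standard library:

```go
package main

import "fmt"

// looksLikeGzip (hypothetical helper) reports whether data starts with
// the gzip magic bytes 0x1f 0x8b. It does not validate the rest of the stream.
func looksLikeGzip(data []byte) bool {
	return len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b
}

func main() {
	// First bytes of a real gzip stream.
	fmt.Println(looksLikeGzip([]byte{31, 139, 8, 0})) // true
	// First bytes of the content from the question ("HTTP/").
	fmt.Println(looksLikeGzip([]byte{72, 84, 84, 80, 47})) // false
}
```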

So let's take a closer look at what this byte sequence actually is. Printed as a string, it gives:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: Keep-Alive
Content-Encoding: gzip
Content-Type: text/html
Date: Sun, 14 Apr 2019 05:12:23 GMT
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)
Vary: Accept-Encoding

6bd
... binary data ..
0

Thus, this is an HTTP response whose body was first compressed with gzip and then encoded with chunked transfer encoding. To extract the data you need to first remove the HTTP header, then decode the chunked transfer encoding, and finally decompress the result with gzip.
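One possible way to do these three steps with the standard library is to let http.ReadResponse parse the header and undo the chunked encoding, and then wrap the body in gzip.NewReader. The following is a sketch under assumptions, not the only way to do it: decodeRawResponse and buildSample are hypothetical names, the sample response is generated in the program instead of using the bytes from the question, and io.ReadAll needs Go 1.16 or later (use ioutil.ReadAll on older versions).

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
)

// decodeRawResponse (hypothetical helper) parses a raw HTTP response.
// http.ReadResponse strips the header and transparently decodes the
// chunked transfer encoding; the gzip layer is removed here by hand.
func decodeRawResponse(raw []byte) (string, error) {
	resp, err := http.ReadResponse(bufio.NewReader(bytes.NewReader(raw)), nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var body io.Reader = resp.Body
	if resp.Header.Get("Content-Encoding") == "gzip" {
		zr, err := gzip.NewReader(body)
		if err != nil {
			return "", err
		}
		defer zr.Close()
		body = zr
	}

	out, err := io.ReadAll(body)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// buildSample (hypothetical helper) creates a small gzip-compressed,
// single-chunk HTTP response so that the demo is self-contained.
func buildSample(text string) []byte {
	var gz bytes.Buffer
	zw := gzip.NewWriter(&gz)
	zw.Write([]byte(text))
	zw.Close()

	var raw bytes.Buffer
	raw.WriteString("HTTP/1.1 200 OK\r\n")
	raw.WriteString("Transfer-Encoding: chunked\r\n")
	raw.WriteString("Content-Encoding: gzip\r\n\r\n")
	fmt.Fprintf(&raw, "%x\r\n", gz.Len()) // chunk size in hex
	raw.Write(gz.Bytes())
	raw.WriteString("\r\n0\r\n\r\n") // end of chunk, then terminating chunk
	return raw.Bytes()
}

func main() {
	s, err := decodeRawResponse(buildSample("hello, gzip"))
	if err != nil {
		panic(err)
	}
	fmt.Println("decompressed:\t", s)
}
```

In the program from the question, the same idea would amount to passing content to such a helper instead of handing it straight to gzip.NewReader.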

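dousi1994 (over a year ago): httputil.NewChunkedReader (in net/http/httputil)
douyigua5381 (over a year ago): This is unrelated to how to decompress from a byte array, so it should be asked as a separate question; nobody would expect it to be discussed in the comments of an unrelated question. In that new question you should probably also explain first why you have the HTTP response in this form at all, since when requesting data with net/http it is better to access the body directly.
dongxu0690 (over a year ago): Any hints on how to decode chunked transfer encoding in Go?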