duanli4146 2019-08-16 17:55

Go: problems after appending to []byte, writing to a file, and reading it back

I'm trying to parse a lot of IPs (~20 MB, or 4 million IPs), store them as bytes in a file, and read them back later.

The issue I'm having is that I expect them to be stored in sorted order, but I'm seeing random byte slices which look like mangled IPs when reading them back.

// Let this be called generator.go

var buf []byte


// So this is where we build up `buf`, which we later write to a file.
func writeOut(record RecordStruct) {
    // This line is never hit. All slices have a length of 4, as expected
    if len(record.IPEnd.Bytes()) != 4 {
        fmt.Println(len(record.IPEnd.Bytes()), record.IPEnd.Bytes())
    }

    // Append the IP to the byte slice, followed by a separator of 10 null bytes that we will later call bytes.Split on.
    buf = append(buf, append(record.IPEnd.Bytes(), bytes.Repeat([]byte{0}, 10)...)...)
}

func main() {
    // Called many times. For brevity I won't include all of that logic.
    // There are no goroutines in the code, and running with -race reports no problems.

    writeOut(...)

    if err := ioutil.WriteFile("bin/test", buf, 0644); err != nil {
        fmt.Println(err)
    }
}

reader.go

func main() {
    bytez, err := ioutil.ReadFile("bin/test")

    if err != nil {
        fmt.Println("Asset was not found.")
        return
    }

    haystack := bytes.Split(bytez, bytes.Repeat([]byte{0}, 10))

    for _, needle := range haystack {
        // Gets hit maybe 10% of the time. The logs are below.
        if len(needle) != 4 {
            fmt.Println(fmt.Println(needle))
        }
    }
}
[188 114 235]
14 <nil>
[120 188 114 235 121]
22 <nil>
[188 148 98]
13 <nil>
[120 188 148 98 121]
21 <nil>

As you can see, the slices have either too few or too many bytes to be IPs.

When I change the log to better illustrate the issue, it looks like the last octet overflows?

Fine: [46 36 202 235]
Fine: [46 36 202 239]
Fine: [46 36 202 255]
Weird: [46 36 203]
Weird: [0 46 36 203 1]
Fine: [46 36 203 3]
Fine: [46 36 203 5]
Fine: [46 36 203 7]
Fine: [46 36 203 9]
2 answers

  • dongwen7813 2019-08-16 18:26

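    The code does not split the bytes correctly when an IP address ends with a zero byte. The trailing zero is indistinguishable from the separator's zero bytes, so bytes.Split matches the separator one byte early: the record loses its last octet, and the leftover separator byte ends up at the front of the next record. Fix this by converting each address to its 16-byte representation and storing fixed-size 16-byte records with no delimiters.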

    You can efficiently append a mix of v4 and v6 addresses to the buffer using the following:

    switch len(p) {
    case net.IPv6len: 
        buf = append(buf, p...)
    case net.IPv4len:
        buf = append(buf, v4InV6Prefix...)
        buf = append(buf, p...)
    default:
        // handle error
    }
    

    where v4InV6Prefix is a package-level variable with the value []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff}.

    Read the file as v6 addresses:

    buf, err := ioutil.ReadFile(xxx)
    if err != nil {
        // handle error
    }
    for i := 0; i < len(buf); i += 16 {
        addr := net.IP(buf[i:i+16])
        // do something with addr
    }
    

    Note that it's also possible to read and write the file incrementally using an io.Reader and io.Writer. The code in this answer matches the code in the question, where the application reads and writes the file in one go.

    This answer was selected as the best answer by the asker.