drjyvoi734793 2017-04-16 10:16
Accepted

How do I write the raw 16 bytes of a GUID to CSV in Go?

I have the following code, which tries to save the raw 16 bytes of a UUID (with 0x0A inside) to a CSV file:

package main

import (
    "encoding/csv"
    "github.com/satori/go.uuid"
    "log"
    "os"
)

func main() {
    u, err := uuid.FromString("e1393c62-877a-4adc-8ffb-f1bf0a337c5f")
    if err != nil {
        log.Fatal(err)
    }
    csv_file, err := os.OpenFile("csv_wtf.csv", os.O_WRONLY|os.O_CREATE, 0644)
    if err != nil {
        log.Fatal(err)
    }
    // Reinterpret the 16 raw UUID bytes as a string (not valid UTF-8).
    s := string(u.Bytes())
    log.Printf("len(s)=%d", len(s))
    csv_writer := csv.NewWriter(csv_file)
    csv_writer.UseCRLF = false
    // Write a single record containing the raw bytes as its only field.
    if err := csv_writer.Write([]string{s}); err != nil {
        log.Fatal(err)
    }
    csv_writer.Flush()
    // Compare the resulting file size with the 16 bytes that went in.
    finfo, err := csv_file.Stat()
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("size csv_wtf.csv = %d", finfo.Size())
    csv_file.Close()
}

This code writes the data to the CSV file with extra bytes added:

2017/04/16 12:37:14 len(s)=16
2017/04/16 12:37:14 size csv_wtf.csv = 29

Why does encoding/csv add extra bytes when it ranges over my string (see https://golang.org/src/encoding/csv/writer.go#L38, https://golang.org/src/encoding/csv/writer.go#L50 and https://golang.org/src/encoding/csv/writer.go#L76)?

Could somebody help me find a CSV package that doesn't do this strange conversion?


1 answer

  • douzhi3667 2017-04-16 11:07

    This is because the CSV format is not suitable for storing raw binary data, which is unlikely to be a valid UTF-8 sequence.

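    What happens is that when csv_writer.Write iterates over a string with a range loop, every time it encounters an invalid UTF-8 sequence the rune r1 is set to 65533 (U+FFFD, the Unicode replacement character), which is then encoded as 3 bytes: 0xef, 0xbf, 0xbd.

    In the question's UUID, five of the 16 bytes (0xe1, 0x87, 0xfb, 0xf1, 0xbf) are not part of a valid UTF-8 sequence, so each grows from 1 byte to 3 (+10). The field contains 0x0A, so it is also wrapped in two quote characters, and the record ends with a newline (+3). That gives 16 + 10 + 3 = 29.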

    Illustrative example:

    package main
    
    import (
        "bytes"
        "fmt"
    )
    
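    func main() {
        // Three bytes, none of which can start a valid UTF-8 sequence.
        invalidString := string([]byte{0xff, 0xfe, 0xfd})
        var b bytes.Buffer
        for _, r := range invalidString {
            // range yields U+FFFD (65533) for each invalid byte.
            fmt.Printf("current rune: %v\n", r)
            // WriteRune encodes U+FFFD as the 3 bytes 0xef 0xbf 0xbd.
            b.WriteRune(r)
        }
    
        fmt.Printf("total data: %v\n", b.Bytes())
    }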
    

    The output is:

    current rune: 65533
    current rune: 65533
    current rune: 65533
    total data: [239 191 189 239 191 189 239 191 189]
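
The same loop can be applied to the UUID from the question to see where the growth comes from. This is only a sketch of the rune-by-rune re-encoding described above, not the actual encoding/csv code; the names `reencode` and `questionUUID` are made up for illustration, and the bytes are written out by hand so that no third-party package is needed.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// questionUUID holds the raw bytes of e1393c62-877a-4adc-8ffb-f1bf0a337c5f.
var questionUUID = []byte{
	0xe1, 0x39, 0x3c, 0x62, 0x87, 0x7a, 0x4a, 0xdc,
	0x8f, 0xfb, 0xf1, 0xbf, 0x0a, 0x33, 0x7c, 0x5f,
}

// reencode ranges over s and writes every rune back out, the way a
// rune-based writer would. It returns the re-encoded bytes and the
// number of input bytes that were not valid UTF-8.
func reencode(s string) ([]byte, int) {
	var b bytes.Buffer
	invalid := 0
	for i, r := range s {
		if r == utf8.RuneError {
			// Size 1 means a decoding error, not a genuine U+FFFD in the input.
			if _, size := utf8.DecodeRuneInString(s[i:]); size == 1 {
				invalid++
			}
		}
		b.WriteRune(r)
	}
	return b.Bytes(), invalid
}

func main() {
	out, invalid := reencode(string(questionUUID))
	fmt.Printf("in=%d out=%d invalid=%d\n", len(questionUUID), len(out), invalid)
}
```

Each invalid byte costs two extra bytes, so the field alone grows from 16 to 26 bytes before any quoting or line ending is added.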
    

    So you should either abandon CSV in favour of some other format (suitable for storing binary data), or store UUIDs in their string form.

    This answer was selected by the asker as the best answer.