doulian4762 2018-06-15 17:20
Why is JSON treated as []byte rather than string? [closed]

RFC 7159 says

JavaScript Object Notation (JSON) is a text format for the serialization of structured data.

But Go treats JSON as []byte

func Marshal(v interface{}) ([]byte, error)
func Unmarshal(data []byte, v interface{}) error

Why don't these functions take and return a string?

I could not find an explanation in the package documentation (https://golang.org/pkg/encoding/json/) or in the blog post (https://blog.golang.org/json-and-go).

1 answer

  • drrkgbm6851 2018-06-15 17:42

    Go does not go by "strings are for text, byte types are for other stuff" like some other languages (e.g. Python 3) do. "In Go, a string is in effect a read-only slice of bytes." The string type has a few behaviors attached that are handy for dealing with UTF-8 text, but it'll hold whatever bytes you put in it. Text-handling stuff in the standard library is often written to work with []bytes too, e.g. package bytes mirrors package strings and regexp deals in either.
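
To make the point above concrete, here is a minimal sketch (the helper names upperBoth and holdsInvalidUTF8 are made up for illustration) showing that package bytes mirrors package strings, and that a string will happily hold bytes that are not valid UTF-8:

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"unicode/utf8"
)

// upperBoth upper-cases the same text once via package strings and once
// via package bytes; the two packages expose parallel functions.
func upperBoth(text string) (string, []byte) {
	return strings.ToUpper(text), bytes.ToUpper([]byte(text))
}

// holdsInvalidUTF8 reports whether a string can carry bytes that are not
// valid UTF-8. The string type does not validate its contents.
func holdsInvalidUTF8() bool {
	s := string([]byte{0xff, 0xfe, 0x00}) // arbitrary, non-text bytes
	return len(s) == 3 && !utf8.ValidString(s)
}

func main() {
	s, b := upperBoth("json")
	fmt.Println(s, string(b))
	fmt.Println(holdsInvalidUTF8())
}
```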

    Given that there's no rule about text/binary semantically belonging in one type or the other, the choice to use []byte was probably made for practical reasons. Since strings are read-only slices of bytes, almost all operations changing strings have to copy bytes to a new string instead of modifying the existing one. (String slicing is a key exception; it just makes a new string header that can point into the old string's bytes.)

    Copying the string's contents on every operation leads to a quadratic slowdown, because both the string length and the number of copies grow with input size. On top of the direct cost of the copies, allocating the space for them makes garbage collection happen more often. For those reasons, almost everything that builds up content via a lot of small operations in Go uses a []byte internally. That includes Go's JSON-marshalling code, and the strings.Builder type added in Go 1.10.
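
As a rough illustration of the difference, the sketch below builds the same JSON-like array three ways. The function names are hypothetical and this is not how encoding/json is implemented; it only contrasts repeated string concatenation with appending to a []byte and with strings.Builder.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinConcat builds "[0,1,2,...]" with +=. Each += allocates a new string
// and copies everything built so far, which is where the quadratic cost
// comes from.
func joinConcat(n int) string {
	s := "["
	for i := 0; i < n; i++ {
		if i > 0 {
			s += ","
		}
		s += strconv.Itoa(i)
	}
	return s + "]"
}

// joinAppend builds the same output in a []byte that is extended in place
// (with amortized reallocation as it grows).
func joinAppend(n int) []byte {
	buf := []byte{'['}
	for i := 0; i < n; i++ {
		if i > 0 {
			buf = append(buf, ',')
		}
		buf = strconv.AppendInt(buf, int64(i), 10)
	}
	return append(buf, ']')
}

// joinBuilder does the same with strings.Builder, which wraps a []byte
// and hands it back as a string at the end.
func joinBuilder(n int) string {
	var sb strings.Builder
	sb.WriteByte('[')
	for i := 0; i < n; i++ {
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(strconv.Itoa(i))
	}
	sb.WriteByte(']')
	return sb.String()
}

func main() {
	fmt.Println(joinConcat(3), string(joinAppend(3)), joinBuilder(3))
}
```

All three return the same text; the difference only shows up in allocations and time as n grows, which a benchmark with testing.B would make visible.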

    (For similar reasons, Java and C# offer string-builder types as well and modern JavaScript VMs have clever tricks to defer copying bytes until after a long series of concat operations, such as V8's cons strings and SpiderMonkey's ropes.)

    Because []bytes are read-write and strings are read-only, converting one to the other also has to copy bytes. If json.Marshal returned a string, that would require making another copy of the content (and the associated load on the GC). Also, if you're ultimately going to do I/O with the result, Write() takes a byte slice, so you'd have to convert back, creating yet another copy. (To slightly mitigate that, some I/O types including *os.File support WriteString() as well. But not all do!)
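
A small sketch of both points, with hypothetical helper names: the first function shows that string(b) is an independent copy, and the second writes a []byte with Write and a string with io.WriteString, which uses the writer's WriteString method when it has one and otherwise falls back to converting.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// convertCopies shows that string(b) is independent of b: changing the
// slice afterwards does not change the string made from it.
func convertCopies() (before string, after string) {
	b := []byte(`{"ok":true}`)
	s := string(b)
	b[2] = 'O' // mutate the slice only
	return s, string(b)
}

// writeBoth writes raw with Write and text with io.WriteString.
func writeBoth(w io.Writer, raw []byte, text string) error {
	if _, err := w.Write(raw); err != nil {
		return err
	}
	_, err := io.WriteString(w, text)
	return err
}

// render runs writeBoth against an in-memory buffer and returns the result.
func render(raw []byte, text string) (string, error) {
	var buf bytes.Buffer
	if err := writeBoth(&buf, raw, text); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	before, after := convertCopies()
	fmt.Println(before, after)

	out, err := render([]byte(`{"a":1}`), "\n")
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Print(out)
}
```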

    So it makes more sense for json.Marshal to return the []byte it built up internally; you can of course call string(bytes) on the result if you need a string and the copying isn't a problem.
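
If a string really is what the caller needs, the conversion can be wrapped once. A minimal sketch, where marshalString is a hypothetical helper and not part of encoding/json:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalString marshals v and converts the result to a string.
// The string(b) conversion copies the bytes, which is the extra cost
// discussed above.
func marshalString(v interface{}) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := marshalString(map[string]int{"a": 1, "b": 2})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s)
}
```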

    A bit out of the original question's scope, but often the best performing option is just to write the output directly to an io.Writer using a json.Encoder. You don't have to hold the result in a []byte of your own (the encoder still buffers each value internally before writing it, but it reuses those buffers), and it can make your code simpler as well since there's no temp variable and you can handle marshalling and I/O errors in one place.
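
A minimal sketch of that approach, assuming a made-up event type and helper names; the only library calls are json.NewEncoder and Encoder.Encode. Note that Encode appends a newline after each value.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// event is a hypothetical payload type used only for this example.
type event struct {
	Name  string `json:"name"`
	Count int    `json:"count"`
}

// writeJSON encodes v straight to w. Marshalling errors and I/O errors
// both come back through the single returned error.
func writeJSON(w io.Writer, v interface{}) error {
	return json.NewEncoder(w).Encode(v)
}

// encodeToString runs writeJSON against an in-memory buffer so the
// output can be inspected.
func encodeToString(v interface{}) (string, error) {
	var buf bytes.Buffer
	if err := writeJSON(&buf, v); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Any io.Writer works here: a file, a network connection, an
	// http.ResponseWriter, and so on.
	if err := writeJSON(os.Stdout, event{Name: "login", Count: 3}); err != nil {
		fmt.Fprintln(os.Stderr, "encode failed:", err)
	}
}
```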
