douwa0280 2017-11-02 21:26
Views: 83
Answer accepted

How to skip "noise" in a JSON object stream?

I'm trying to get the following code to skip parse-error noise in a stream of JSON objects. Basically, I want it to skip the ERROR: ... lines and continue with the next parseable record.

json.Decoder has a limited set of methods - so it's unclear how to move the decoder's index forward (say a byte at a time) to move past the noise.

The underlying io.Reader can be advanced to, say, the end of the line (or at least one character at a time), but such operations (understandably) do not affect the json.Decoder's seek state.

Is there a clean way to do this?

https://play.golang.org/p/riIDh9g1Rx

package main

import (
        "encoding/json"
        "fmt"
        "strings"
        "time"
)

type event struct {
        T    time.Time
        Desc string
}

var jsonStream = ` 
{"T":"2017-11-02T16:00:00-04:00","Desc":"window opened"}
{"T":"2017-11-02T16:30:00-04:00","Desc":"window closed"}
{"T":"2017-11-02T16:41:34-04:00","Desc":"front door opened"}
ERROR: retrieving event 1234
{"T":"2017-11-02T16:41:40-04:00","Desc":"front door closed"}
`

func main() {
        jsonReader := strings.NewReader(jsonStream)
        decodeStream := json.NewDecoder(jsonReader)

        i := 0
        for decodeStream.More() {
                i++ 
                var ev event
                if err := decodeStream.Decode(&ev); err != nil {
                        fmt.Printf("parse error: %s\n", err)
                        break
                }
                fmt.Printf("%3d: %+v\n", i, ev)
        }
}

got:

  1: {T:2017-11-02 16:00:00 -0400 -0400 Desc:window opened}
  2: {T:2017-11-02 16:30:00 -0400 -0400 Desc:window closed}
  3: {T:2017-11-02 16:41:34 -0400 -0400 Desc:front door opened}
parse error: invalid character 'E' looking for beginning of value

want:

  1: {T:2017-11-02 16:00:00 -0400 -0400 Desc:window opened}
  2: {T:2017-11-02 16:30:00 -0400 -0400 Desc:window closed}
  3: {T:2017-11-02 16:41:34 -0400 -0400 Desc:front door opened}
  4: {T:2017-11-02 16:41:40 -0400 -0400 Desc:front door closed}

2 answers

  • douyan8267 2017-11-02 21:31

    Since the stream itself is not valid JSON, I think the "correct" way to do this is to pre-parse it into individual, valid JSON documents and unmarshal each one separately. (Even without the error lines, a JSON document must have a single root value, and this is a series of root objects.) Read the stream line by line using e.g. bufio.Scanner, discard the non-JSON lines, and Unmarshal the others as normal.

    See working example here: https://play.golang.org/p/DZrAVmzwr-

    Selected by the asker as the best answer.