dongshuogai2343 2018-10-04 02:36
Accepted

Parsing JSON from Kinesis Firehose

Hi, I'm trying to use Kinesis Firehose with S3, and I'm reading the resulting S3 files with Go.

However, I can't parse the JSON because the objects are simply appended to each other without any delimiter.

Here's an example of the file (note that in the original input the objects are appended directly to each other; I split them with newlines for formatting purposes):

{"ticker_symbol":"PLM","sector":"FINANCIAL","change":-0.16,"price":19.99}
{"ticker_symbol":"AZL","sector":"HEALTHCARE","change":-0.78,"price":16.51}
{"ticker_symbol":"IOP","sector":"TECHNOLOGY","change":-1.98,"price":121.88}
{"ticker_symbol":"VVY","sector":"HEALTHCARE","change":-0.56,"price":47.62}
{"ticker_symbol":"BFH","sector":"RETAIL","change":0.74,"price":16.61}
{"ticker_symbol":"WAS","sector":"RETAIL","change":-0.6,"price":16.72}

My question is: how can I parse this in Go? One solution I can think of is to split the input on }{ and reassemble the pieces, but that is pretty hackish.

Or does Kinesis Firehose provide a delimiter?

------UPDATE------

Currently I have implemented a solution that replaces every }{ with },{, adds [ at the beginning and ] at the end, and then parses the result.

However, I'm still looking for alternatives, as this solution breaks if }{ ever appears inside the content of a JSON object.


1 answer

  • dongzhonggua4229 2018-10-04 03:08

    Create a simple struct to unmarshal the JSON that arrives in batches. Unmarshal each JSON object into that struct, then append it to a slice of structs, so that the slice ends up holding all of the parsed records.

    package main
    
    import (
        "encoding/json"
        "fmt"
    )
    
    type Ticker struct {
        TickerSymbol string  `json:"ticker_symbol"`
        Sector       string  `json:"sector"`
        Change       float64 `json:"change"`
        Price        float64 `json:"price"`
    }
    
    var jsonBytes = []byte(`{"ticker_symbol":"PLM","sector":"FINANCIAL","change":-0.16,"price":19.99}`)
    
    func main() {
        var singleResult Ticker
        var result []Ticker
        // Unmarshal one JSON object; stop if it is malformed.
        if err := json.Unmarshal(jsonBytes, &singleResult); err != nil {
            fmt.Println(err)
            return
        }
    
        // Collect the parsed object in the slice of results.
        result = append(result, singleResult)
        fmt.Printf("%+v\n", result)
    }
    

    Edited:

    If the data comes in a batch that contains JSON objects appended to each other, then you can use a regular expression to replace } with }, then trim the rightmost , and wrap the result in [ ] to make a valid JSON array of objects:

    package main
    
    import (
        "fmt"
        "regexp"
        "strings"
    )
    
    type Ticker struct {
        TickerSymbol string  `json:"ticker_symbol"`
        Sector       string  `json:"sector"`
        Change       float64 `json:"change"`
        Price        float64 `json:"price"`
    }
    
    var str = `{"ticker_symbol":"PLM","sector":"FINANCIAL","change":-0.16,"price":19.99}
    {"ticker_symbol":"AZL","sector":"HEALTHCARE","change":-0.78,"price":16.51}
    {"ticker_symbol":"IOP","sector":"TECHNOLOGY","change":-1.98,"price":121.88}
    {"ticker_symbol":"VVY","sector":"HEALTHCARE","change":-0.56,"price":47.62}
    {"ticker_symbol":"BFH","sector":"RETAIL","change":0.74,"price":16.61}
    {"ticker_symbol":"WAS","sector":"RETAIL","change":-0.6,"price":16.72}`
    
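    func main() {
        // Append a comma after every closing brace, then drop the trailing one.
        r := regexp.MustCompile("}")
        output := strings.TrimRight(r.ReplaceAllString(str, "},"), ",")
        // Wrap the comma-separated objects in brackets to form a JSON array.
        output = fmt.Sprintf("[%s]", output)
        fmt.Println(output)
    }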
    

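    Using r := regexp.MustCompile("}") means you do not have to worry about whitespace between } and {, which would otherwise get in the way of the replacement. So just replace } with }, and then trim the trailing comma. Note that this only works for flat objects like the ones above: a } inside a nested object or a string value would also get a comma appended.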

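    Also, the reason I am using MustCompile is:

    When a regular expression is stored in a package-level variable, you can use the MustCompile variation of Compile. MustCompile returns a single value and panics if the pattern is invalid, whereas a plain Compile returns two values (the compiled expression and an error), so it does not fit a single-value initializer.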
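
    To make the difference concrete, here is a small sketch of both calls. The names betweenObjects and joinObjects are made up for this illustration, and the pattern is a whitespace-tolerant variant of the }{ replacement from the question, so it shares that approach's limitation when }{ appears inside a string value.

```go
package main

import (
    "fmt"
    "regexp"
)

// betweenObjects is a package-level variable. MustCompile returns only the
// compiled expression and panics at start-up if the pattern is invalid.
var betweenObjects = regexp.MustCompile(`\}\s*\{`)

// joinObjects (hypothetical helper) turns appended objects into a JSON array
// by inserting commas between them. It is not safe if a string value
// contains "}{".
func joinObjects(s string) string {
    return "[" + betweenObjects.ReplaceAllString(s, "},{") + "]"
}

func main() {
    // Compile returns (*regexp.Regexp, error), so the error must be handled.
    re, err := regexp.Compile(`\}\s*\{`)
    if err != nil {
        fmt.Println("bad pattern:", err)
        return
    }
    fmt.Println(re.MatchString("{}{}"))
    fmt.Println(joinObjects("{\"a\":1}\n{\"a\":2}"))
}
```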

    Full working code with JSON parsing on the Go Playground

    This answer was selected as the best answer by the asker.
