dongnao2048
2018-09-25 06:22
浏览 85
已采纳

从键值对中“过滤” JSON对象的最有效方法是什么?

I am reading in a .json file. It's an array of objects in valid JSON format, example:

    [
        {
                "Id": 13,
                "Location": "Australia",
                "Content": "Another string"
        },
        {
                "Id": 145,
                "Location": "England",
                "Content": "SomeString"
        },
        {
                "Id": 12,
                "Location": "England",
                "Content": "SomeString"
        },
        {
                "Id": 12331,
                "Location": "Sweden",
                "Content": "SomeString"
        },
        {
                "Id": 213123,
                "Location": "England",
                "Content": "SomeString"
        }
     ]

I want to filter these objects out - say, removing anything where "Location"doesn't equal "England".

What I've tried so far is creating a custom UnmarshalJSON function. It does unmarshal it, but the objects it produces are empty - and as many as the input.

Sample code:

type languageStruct struct {
    ID                  int     `json:"Id"`
    Location            string  `json:"Location"` 
    Content             string  `json:"Content"`
}

func filterJSON(file []byte) ([]byte, error) {
    var x []*languageStruct

    err := json.Unmarshal(file, &x)
    check(err)

    return json.MarshalIndent(x, "", " ")
}


func (s *languageStruct) UnmarshalJSON(p []byte) error {

    var result struct {
        ID              int     `json:"Id"`
        Location        string  `json:"Location"` 
        Content         string  `json:"Content"`
    }

    err := json.Unmarshal(p, &result)
    check(err)

    // slice of locations we'd like to filter the objects on
    locations := []string{"England"} // Can be more 

    if sliceContains(s.Location, locations) {
        s.ID = result.ID
        s.Location= result.Location
        s.Content = result.Content
    }

    return nil
}

// helper func to check if a given string, f.e. a value of a key-value pair in a json object, is in a provided list
func sliceContains(a string, list []string) bool {
    for _, b := range list {
        if b == a {
            fmt.Println("it's a match!")
            return true
        }
    }
    return false
}

While this runs - the output is wrong. It creates as many objects as comes in - however, the new ones are empty, f.e.:

// ...
 [
 {
  "Id": 0,
  "Location": "",
  "Content": ""
 },
 {
  "Id": 0,
  "Location": "",
  "Content": ""
 }
 ]
//...

Whereas my desired output, from the first given input, would be:

[
    {
            "Id": 145,
            "Location": "England",
            "Content": "SomeString"
    },
    {
            "Id": 12,
            "Location": "England",
            "Content": "SomeString"
    },
    {
            "Id": 213123,
            "Location": "England",
            "Content": "SomeString"
    }
 ]
  • 写回答
  • 好问题 提建议
  • 追加酬金
  • 关注问题
  • 邀请回答

1条回答 默认 最新

  • duanseci1039 2018-09-25 06:39
    最佳回答

    When languageStruct.UnmarshalJSON() is called, there is already a languageStruct prepared that will be appended to the slice, no matter if you fill its content (fields) or not.

    The easiest and my suggested solution is to just unmarshal normally, and post-process the slice: remove elements according to your requirements. This results in clean code, which you can easily adjust / alter in the future. Although it could be implemented as custom marshaling logic on a custom slice type []languageStruct, I would still not create custom marshaling logic for this but implement it as a separate filtering logic.

    Here's a simple code unmarshaling, filtering and marshaling it again (note: no custom marshaling is defined / used for this):

    var x []*languageStruct
    
    err := json.Unmarshal(file, &x)
    if err != nil {
        panic(err)
    }
    
    var x2 []*languageStruct
    for _, v := range x {
        if v.Location == "England" {
            x2 = append(x2, v)
        }
    }
    
    data, err := json.MarshalIndent(x2, "", " ")
    fmt.Println(string(data), err)
    

    This will result in your desired output. Try it on the Go Playground.

    The fastest and most complex solution would be to use event-driven parsing and building a state machine, but the complexity would increase by large. The idea would be to process the JSON by tokens, track where you're at currently in the object tree, and when an object is detected that must be excluded, don't process / add it to your slice. For details and ideas how this can be written, check out this anwser: Go - Decode JSON as it is still streaming in via net/http

    评论
    解决 无用
    打赏 举报

相关推荐 更多相似问题