duancoubeng5909 2018-02-20 17:27
Accepted

CSV to struct advice

I have a CSV file that I read into a struct. How can I manipulate the input so the struct is built the way I want? I keep going in circles following various tutorials; this is the closest I have come.

Essentially, I want to open a CSV file, read selected columns, and make sure the values for each column are taken from the same row. The resulting data should then be in a format that can be put into a database.

Example CSV:

Ignore,Customer,Fruit,Number
123,A,Apple,1
123,A,Apple,3
123,B,Orange,4
123,C,Melon,5

Example Code:

package main
import (
    "bufio"
    "encoding/csv"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "os"
)

type Account struct {
    Customer string `json:"Customer"`
    LineItem *LineItem  `json:"LineItem"`
}

type LineItem struct {
    ProductName string `json:"ProductName"`
    Count string `json:"Count"`
}


func main() {
    csvFile, _ := os.Open("/home/frank/gocode/src/local/billing/fruit.csv")

    reader := csv.NewReader(bufio.NewReader(csvFile))
    var billData []Account
    for {
        line, error := reader.Read()
        if error == io.EOF {
            break
        } else if error != nil {
            log.Fatal(error)
        }
        billData = append(billData, Account{
            Customer: line[1],
            LineItem: &LineItem{
                ProductName:   line[2],
                Count: line[3],
            },
        })
    }

    billingJson, _ := json.Marshal(billData)
    fmt.Println(string(billingJson))
}

The current output is:

[{"Customer":"Customer","LineItem":{"ProductName":"Fruit","Count":"Number"}},{"Customer":"A","LineItem":{"ProductName":"Apple","Count":"1"}},{"Customer":"A","LineItem":{"ProductName":"Apple","Count":"3"}},{"Customer":"B","LineItem":{"ProductName":"Orange","Count":"4"}},{"Customer":"C","LineItem":{"ProductName":"Melon","Count":"5"}}]

I would like to get rid of the first record so the headers are not kept, e.g.:

[{"Customer":"A","LineItem":{"ProductName":"Apple","Count":"1"}},{"Customer":"A","LineItem":{"ProductName":"Apple","Count":"3"}},{"Customer":"B","LineItem":{"ProductName":"Orange","Count":"4"}},{"Customer":"C","LineItem":{"ProductName":"Melon","Count":"5"}}]

I would also like to consolidate the records so Customer A is a single record with both LineItems, e.g.:

[{"Customer":"A","LineItem":{"ProductName":"Apple","Count":"1"},"LineItem":{"ProductName":"Apple","Count":"3"}},{"Customer":"B","LineItem":{"ProductName":"Orange","Count":"4"}},{"Customer":"C","LineItem":{"ProductName":"Melon","Count":"5"}}]

Any best practices or alternative methods are welcome (I am not sure whether a map is better here). Hopefully this is enough information for someone to give me a hand.

1 answer

  • dongyinting3179 2018-02-20 18:20

    Getting rid of the first entry is as easy as billData = billData[1:] once the loop has finished. Alternatively, do an initial read before the loop to pull off the column names.
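
The "initial read" option can also be used to look columns up by name instead of by position. The following is a minimal sketch, not part of the original answer; customerColumn is a hypothetical helper name, and the input is passed as a string only to keep the example self-contained.

```go
package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "strings"
)

// customerColumn is a hypothetical helper: it reads the header record first,
// builds a name-to-index map from it, and then returns the "Customer" value
// of every remaining record.
func customerColumn(data string) ([]string, error) {
    reader := csv.NewReader(strings.NewReader(data))

    // The first Read consumes the header, so it never reaches the result.
    header, err := reader.Read()
    if err != nil {
        return nil, err
    }
    col := make(map[string]int, len(header))
    for i, name := range header {
        col[name] = i
    }
    idx, ok := col["Customer"]
    if !ok {
        return nil, fmt.Errorf("no Customer column in header %v", header)
    }

    // ReadAll returns the remaining records and reports end of input as a nil error.
    records, err := reader.ReadAll()
    if err != nil {
        return nil, err
    }
    customers := make([]string, 0, len(records))
    for _, rec := range records {
        customers = append(customers, rec[idx])
    }
    return customers, nil
}

func main() {
    customers, err := customerColumn("Ignore,Customer,Fruit,Number\n123,A,Apple,1\n123,B,Orange,4\n")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(customers) // prints [A B]
}
```

Looking columns up by name means the code keeps working if the column order in the file changes, at the cost of one extra map per file.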

    On the second part, your current data structure does not support a one-to-many relationship (each Account has one and only one LineItem), so you'll need to do some processing on the list. CSV files are necessarily 1:1, as each line is a single independent record. The easiest way to make it one-to-many is to use a map, but you can also simply loop over a slice (which stays closer to your existing code):

    https://play.golang.org/p/3uevo0taKR5

    package main
    
    import (
        "bytes"
        "encoding/csv"
        "encoding/json"
        "fmt"
        "io"
        "log"
    )
    
    var data = `Ignore,Customer,Fruit,Number
    123,A,Apple,1
    123,A,Apple,3
    123,B,Orange,4
    123,C,Melon,5`
    
    type Account struct {
        Customer  string     `json:"Customer"`
        LineItems []LineItem `json:"LineItems"`
    }
    
    type LineItem struct {
        ProductName string `json:"ProductName"`
        Count       string `json:"Count"`
    }
    
    func main() {
        reader := csv.NewReader(bytes.NewBufferString(data))
    
        // Read column label data and discard
        if _, err := reader.Read(); err != nil {
            log.Fatal(err)
        }
    
        var billData []Account
        for {
            line, err := reader.Read()
            if err == io.EOF {
                break
            }
            if err != nil {
                log.Fatal(err)
            }
            found := false
            for i := range billData {
                if billData[i].Customer == line[1] {
                    found = true
                    billData[i].LineItems = append(billData[i].LineItems, LineItem{
                        ProductName: line[2],
                        Count:       line[3],
                    })
                    break
                }
            }
            if !found {
                billData = append(billData, Account{
                    Customer: line[1],
                    LineItems: []LineItem{
                        {
                            ProductName: line[2],
                            Count:       line[3],
                        },
                    },
                })
            }
        }
    
        billingJson, err := json.MarshalIndent(billData, "", "  ")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(billingJson))
    }
    

    Output:

    [
        {
            "Customer": "A",
            "LineItems": [
                {
                    "ProductName": "Apple",
                    "Count": "1"
                },
                {
                    "ProductName": "Apple",
                    "Count": "3"
                }
            ]
        },
        {
            "Customer": "B",
            "LineItems": [
                {
                    "ProductName": "Orange",
                    "Count": "4"
                }
            ]
        },
        {
            "Customer": "C",
            "LineItems": [
                {
                    "ProductName": "Melon",
                    "Count": "5"
                }
            ]
        }
    ]
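
The map-based alternative mentioned above could look roughly like this. It is a sketch under assumptions rather than the answer's own code: groupByCustomer is a hypothetical name, and the map stores each customer's position in the slice so that the output keeps first-seen order (ranging over a Go map directly would not).

```go
package main

import (
    "encoding/csv"
    "encoding/json"
    "fmt"
    "log"
    "strings"
)

type Account struct {
    Customer  string     `json:"Customer"`
    LineItems []LineItem `json:"LineItems"`
}

type LineItem struct {
    ProductName string `json:"ProductName"`
    Count       string `json:"Count"`
}

// groupByCustomer (hypothetical name) parses the CSV text, skips the header
// record, and collects all line items of a customer into one Account.
func groupByCustomer(data string) ([]Account, error) {
    records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
    if err != nil {
        return nil, err
    }
    if len(records) == 0 {
        return nil, nil
    }

    index := make(map[string]int) // customer name -> position in accounts
    var accounts []Account
    for _, rec := range records[1:] { // records[0] is the header
        if len(rec) < 4 {
            return nil, fmt.Errorf("expected at least 4 fields, got %d", len(rec))
        }
        i, seen := index[rec[1]]
        if !seen {
            i = len(accounts)
            index[rec[1]] = i
            accounts = append(accounts, Account{Customer: rec[1]})
        }
        accounts[i].LineItems = append(accounts[i].LineItems, LineItem{
            ProductName: rec[2],
            Count:       rec[3],
        })
    }
    return accounts, nil
}

func main() {
    accounts, err := groupByCustomer("Ignore,Customer,Fruit,Number\n123,A,Apple,1\n123,A,Apple,3\n123,B,Orange,4\n123,C,Melon,5\n")
    if err != nil {
        log.Fatal(err)
    }
    out, err := json.MarshalIndent(accounts, "", "  ")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(out))
}
```

Compared with the nested loop, the map lookup avoids rescanning the slice for every row, which only starts to matter when there are many distinct customers.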
    

    Lastly, I recommend using err or similar for your error variable. error is the name of the built-in error type, so naming a variable error shadows the type and makes it impossible to declare a variable of that type within the same scope. While this doesn't affect your current code, it's bad practice and liable to get you into trouble eventually.
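
To illustrate the shadowing problem, here is a small sketch; parse and describe are hypothetical functions invented for this example. The problematic form is only described in a comment, because writing it out would stop the program from compiling.

```go
package main

import (
    "errors"
    "fmt"
)

// parse is a hypothetical stand-in for any call that returns an error.
func parse(s string) (string, error) {
    if s == "" {
        return "", errors.New("empty input")
    }
    return s, nil
}

// describe reports whether parse succeeded for s.
func describe(s string) string {
    // If this line were `_, error := parse(s)`, the identifier error would
    // name a variable for the rest of this function, and the declaration of
    // last below would no longer compile.
    _, err := parse(s)

    // Legal only because error still refers to the built-in type here.
    var last error = err
    if last != nil {
        return "failed: " + last.Error()
    }
    return "ok"
}

func main() {
    fmt.Println(describe(""))      // prints failed: empty input
    fmt.Println(describe("apple")) // prints ok
}
```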

    This answer was accepted by the asker as the best answer.
