draxu26480 2018-08-16 12:07

Summarizing the contents of a CSV

Context: I'm working on a little program that summarizes the contents of an absolute mess of a bill, which is in CSV form.

The bill has three columns I'm interested in:

  1. Event type. Here, I'm only interested in the rows where this column reads CHARGE.
  2. The cost. Self-explanatory.
  3. Resource name, containing the server and cluster names. The format is servername.clustername.

The idea is to select the rows labeled CHARGE, group them first by cluster and then by server name, and sum the costs for each.

I can't help but feel like this should be easy, but I've been scratching my head over this for a while now and just can't seem to figure it out. At this point I ought to state that I am fairly new to programming and entirely new to Go.

Here's what I have so far:

package main

import (
    "encoding/csv"
    "log"
    "os"
    "sort"
    "strings"
)



func main() {
    rows := readBill("bill-2018-April.csv")
    rows = calculateSummary(rows)
    writeSummary("bill-2018-April-output", rows)

}

func readBill(name string) [][]string {

    f, err := os.Open(name)

    if err != nil {
        log.Fatalf("Cannot open '%s': %s\n", name, err.Error())
    }

    defer f.Close()

    r := csv.NewReader(f)

    rows, err := r.ReadAll()

    if err != nil {
        log.Fatalln("Cannot read CSV data:", err.Error())
    }

    return rows
}

type charges struct {
    impactType string
    cost       float64
    resName    string
}
func createCharges(rows [][]string){
    charges:= []charges{}
    for i,r:=range rows {
        var c charges
        c.impactType :=r [i][10]
        c.cost := r [i][15]
        c.resName := r [i][20]
        charges = append()
    }
    return charges
} 

So, as far as I can tell, I should now have isolated the columns I am interested in (i.e. columns 10, 15 and 20). Is what I have so far even correct?

How would I go about singling out the rows reading "CHARGE" and slicing everything up by cluster and server?

Summing things up shouldn't be too tricky, but for whatever reason, this is really stumping me.

1 answer

  • dpkt31779 2018-08-16 12:42

    Just use two maps to store the sums per server and per cluster. And since you're not interested in the whole CSV but only some rows, reading everything is kind of wasteful. Just skip the rows you don't care about:

    package main
    
    import (
        "encoding/csv"
        "fmt"
        "io"
        "log"
        "strconv"
        "strings"
    )
    
    func main() {
        b := `
    ,,,,,,,,,,CHARGE,,,,,100.00,,,,,s1.c1
    ,,,,,,,,,,IGNORE,,,,,,,,,,
    ,,,,,,,,,,CHARGE,,,,,200.00,,,,,s2.c1
    ,,,,,,,,,,CHARGE,,,,,300.00,,,,,s3.c2
    `
    
        r := csv.NewReader(strings.NewReader(b))
    
        byServer := make(map[string]float64)
        byCluster := make(map[string]float64)
    
        for i := 0; ; i++ {
            row, err := r.Read()
            if err == io.EOF {
                break
            }
            if err != nil {
                log.Fatal(err)
            }
    
            if row[10] != "CHARGE" {
                continue
            }
    
            cost, err := strconv.ParseFloat(row[15], 64)
            if err != nil {
                log.Fatalf("row %d: malformed cost: %v", i, err)
            }
    
            xs := strings.SplitN(row[20], ".", 2)
            if len(xs) != 2 {
                log.Fatalf("row %d: malformed resource name", i)
            }
    
            server, cluster := xs[0], xs[1]
    
            byServer[server] += cost
            byCluster[cluster] += cost
        }
    
        fmt.Printf("byServer: %+v\n", byServer)
        fmt.Printf("byCluster: %+v\n", byCluster)
    }
    
    // Output (key order may differ on older releases; Go 1.12+ prints map keys sorted):
    // byServer: map[s1:100 s2:200 s3:300]
    // byCluster: map[c1:300 c2:300]
    

    Try it on the playground: https://play.golang.org/p/1e9mJf4LyYE
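
    One caveat with two flat maps: if the same server name shows up in more than one cluster, its costs are merged into a single byServer entry. Since the question asks to split by cluster first and then by server, a nested map is one way to keep them apart. What follows is a minimal sketch under that assumption, not a drop-in implementation: sumByClusterAndServer and the col* constants are hypothetical names, the column indexes 10, 15 and 20 are taken from the question, and the sample rows are made up.

    ```go
    package main

    import (
        "encoding/csv"
        "fmt"
        "io"
        "log"
        "sort"
        "strconv"
        "strings"
    )

    // Column indexes as stated in the question (assumed to be correct).
    const (
        colEvent    = 10
        colCost     = 15
        colResource = 20
    )

    // sumByClusterAndServer (hypothetical helper) reads records one at a time
    // and returns totals[cluster][server] for every CHARGE row.
    func sumByClusterAndServer(r *csv.Reader) (map[string]map[string]float64, error) {
        totals := make(map[string]map[string]float64)
        for n := 1; ; n++ {
            row, err := r.Read()
            if err == io.EOF {
                return totals, nil
            }
            if err != nil {
                return nil, err
            }
            // Skip rows that are too short or are not charges.
            if len(row) <= colResource || row[colEvent] != "CHARGE" {
                continue
            }
            cost, err := strconv.ParseFloat(row[colCost], 64)
            if err != nil {
                return nil, fmt.Errorf("record %d: malformed cost: %v", n, err)
            }
            xs := strings.SplitN(row[colResource], ".", 2)
            if len(xs) != 2 {
                return nil, fmt.Errorf("record %d: malformed resource name %q", n, row[colResource])
            }
            server, cluster := xs[0], xs[1]
            // Create the inner map the first time a cluster is seen.
            if totals[cluster] == nil {
                totals[cluster] = make(map[string]float64)
            }
            totals[cluster][server] += cost
        }
    }

    func main() {
        // Made-up sample data; note that s1 exists in both clusters.
        data := strings.Join([]string{
            ",,,,,,,,,,CHARGE,,,,,100.00,,,,,s1.c1",
            ",,,,,,,,,,IGNORE,,,,,,,,,,",
            ",,,,,,,,,,CHARGE,,,,,200.00,,,,,s2.c1",
            ",,,,,,,,,,CHARGE,,,,,300.00,,,,,s1.c2",
        }, "\n")

        totals, err := sumByClusterAndServer(csv.NewReader(strings.NewReader(data)))
        if err != nil {
            log.Fatal(err)
        }

        // Sort the keys so the report is stable from run to run.
        clusters := make([]string, 0, len(totals))
        for c := range totals {
            clusters = append(clusters, c)
        }
        sort.Strings(clusters)
        for _, c := range clusters {
            fmt.Println(c)
            servers := make([]string, 0, len(totals[c]))
            for s := range totals[c] {
                servers = append(servers, s)
            }
            sort.Strings(servers)
            for _, s := range servers {
                fmt.Printf("  %s: %.2f\n", s, totals[c][s])
            }
        }
    }
    ```

    The keys are sorted before printing because Go does not define the iteration order of a map. To run this against the real bill, the strings.NewReader could be swapped for the file returned by os.Open, as in the question's readBill.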

    This answer was accepted by the asker as the best answer.
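
    On the question of whether createCharges from the question is correct: as written it would not compile. Inside the loop, r is already a single row, so r[i][10] indexes a byte of one field rather than column 10; := cannot assign to struct fields; the cost column is a string and has to be parsed before it fits a float64 field; append is called without arguments; the function declares no return type; and the local variable charges shadows the type of the same name. A minimal sketch of one possible fix follows, assuming the column indexes from the question are right. The type is renamed to charge here purely to avoid the shadowing, and the sample rows are made up.

    ```go
    package main

    import (
        "fmt"
        "strconv"
    )

    // charge holds the three columns of interest from one bill row.
    type charge struct {
        impactType string
        cost       float64
        resName    string
    }

    // createCharges converts raw CSV rows (for example the result of
    // csv.Reader.ReadAll) into charge values, keeping only CHARGE rows.
    func createCharges(rows [][]string) ([]charge, error) {
        var charges []charge
        for i, r := range rows {
            // r is one row; r[10] is its event type column.
            if len(r) <= 20 || r[10] != "CHARGE" {
                continue
            }
            cost, err := strconv.ParseFloat(r[15], 64)
            if err != nil {
                return nil, fmt.Errorf("row %d: malformed cost: %v", i, err)
            }
            charges = append(charges, charge{
                impactType: r[10],
                cost:       cost,
                resName:    r[20],
            })
        }
        return charges, nil
    }

    func main() {
        // row builds a 21-column record with only the interesting fields set.
        row := func(event, cost, res string) []string {
            r := make([]string, 21)
            r[10], r[15], r[20] = event, cost, res
            return r
        }
        rows := [][]string{
            row("CHARGE", "100.00", "s1.c1"),
            row("IGNORE", "", ""),
            row("CHARGE", "200.00", "s2.c1"),
        }

        charges, err := createCharges(rows)
        if err != nil {
            fmt.Println(err)
            return
        }
        for _, c := range charges {
            fmt.Printf("%s %.2f %s\n", c.impactType, c.cost, c.resName)
        }
    }
    ```

    The resulting slice can then be summed into maps in the same way as in the answer's loop.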