du656637962 2015-08-15 17:53
Views: 461
Accepted

Efficient read and write CSV in Go

The Go code below reads a 10,000-record CSV (timestamps and float values), runs some operations on the data, and then writes the original values to another CSV along with an additional score column. However, it is terribly slow (hours, though most of that is calculateStuff()), and I'm curious whether there are any inefficiencies in the CSV reading/writing that I can fix.

package main

import (
  "encoding/csv"
  "log"
  "os"
  "strconv"
)

func ReadCSV(filepath string) ([][]string, error) {
  csvfile, err := os.Open(filepath)

  if err != nil {
    return nil, err
  }
  defer csvfile.Close()

  reader := csv.NewReader(csvfile)
  fields, err := reader.ReadAll()
  if err != nil {
    return nil, err
  }

  return fields, nil
}

func main() {
  // load data csv
  records, err := ReadCSV("./path/to/datafile.csv")
  if err != nil {
    log.Fatal(err)
  }

  // write results to a new csv
  outfile, err := os.Create("./where/to/write/resultsfile.csv")
  if err != nil {
    log.Fatalf("Unable to open output: %v", err)
  }
  defer outfile.Close()
  writer := csv.NewWriter(outfile)

  for i, record := range records {
    time := record[0]
    value := record[1]

    // skip header row
    if i == 0 {
      writer.Write([]string{time, value, "score"})
      continue
    }

    // get float values
    floatValue, err := strconv.ParseFloat(value, 64)
    if err != nil {
      log.Fatalf("Record: %v, Error: %v", value, err)
    }

    // calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
    score := calculateStuff(floatValue)

    valueString := strconv.FormatFloat(floatValue, 'f', 8, 64)
    scoreString := strconv.FormatFloat(score, 'f', 8, 64)
    //fmt.Printf("Result: %v\n", []string{time, valueString, scoreString})

    writer.Write([]string{time, valueString, scoreString})
  }

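  // flush buffered rows and report any write error
  writer.Flush()
  if err := writer.Error(); err != nil {
    log.Fatal(err)
  }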
}

I'm looking for help making this CSV read/write template code as fast as possible. For the scope of this question we need not worry about the calculateStuff method.

3 answers

  • drt41563 2015-08-15 19:05
    You're loading the whole file into memory first and then processing it, which can be slow with a big file.

    Instead, loop, call .Read, and process one record at a time.

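    // processCSV reads records from rc in a goroutine and sends them on the
    // returned channel, which is closed at EOF. It needs the encoding/csv, io,
    // and log packages.
    func processCSV(rc io.Reader) (ch chan []string) {
        ch = make(chan []string, 10)
        go func() {
            r := csv.NewReader(rc)
            if _, err := r.Read(); err != nil { // read (and discard) the header
                log.Fatal(err)
            }
            defer close(ch)
            for {
                rec, err := r.Read()
                if err != nil {
                    if err == io.EOF {
                        break
                    }
                    log.Fatal(err)
                }
                ch <- rec
            }
        }()
        return
    }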
    

    Note: this is roughly based on DaveC's comment.
