doumie6223 2017-05-16 18:55

Golang: Reformat an individual line while processing CSV?

My Go CSV-processing routine is copied almost verbatim from the encoding/csv package example:

func processCSV(path string) {
    file := utils.OpenFile(path)
    reader := csv.NewReader(file)
    reader.LazyQuotes = true

    cs := []*Collision{} // defined elsewhere

    for {
        line, err := reader.Read()

        // Stop processing if we're at EOF
        if err == io.EOF {
            break
        }

        c := get(line) // defined elsewhere
        cs = append(cs, c)
    }

    // Do other stuff...
}

The code works great until it encounters a malformed (?) line of CSV, which generally looks something like this:

item1,item2,"item3,"has odd quoting"","item4",item5

Setting reader.LazyQuotes = true doesn't seem to offer enough tolerance to read this line the way I need.

My question is this: can I ask the CSV reader for the original line so that I can "massage" it to pull out what I need? The files I'm working with are moderately large (~150 MB), and I'm not sure I want to redo them, especially as only a few lines per file have such problems.

Thanks for any tips!


  • douliexing2195 2017-05-16 19:36

    As far as I can tell, encoding/csv doesn't provide any such functionality, so you can either look for a third-party CSV package that does, or implement a solution yourself.

    If you want to go the DIY route, I can offer a tip; whether it's a good one that you should implement is up to you.

    You could implement an io.Reader that wraps your file and tracks the last line read. Then, every time you encounter an error because of malformed CSV, you can use your reader to reread that line, massage it, add it to the results, and have the loop continue as if nothing happened.

    Here's an example of how your processCSV would change:

    func processCSV(path string) {
        file := utils.OpenFile(path)
        // NewMyReader, CurrentLine and fixcsv are not defined here;
        // they stand for the line-tracking reader and the repair
        // function you would write yourself.
        myreader := NewMyReader(file)
        reader := csv.NewReader(myreader)
        reader.LazyQuotes = true

        cs := []*Collision{} // defined elsewhere

        for {
            line, err := reader.Read()

            // Stop processing if we're at EOF
            if err == io.EOF {
                break
            }

            // malformed csv
            if err != nil {
                // Just reread the last line; on the next iteration of
                // the loop myreader.Read should continue returning bytes
                // that come after this malformed line to the csv.Reader.
                l, err := myreader.CurrentLine()
                if err != nil {
                    panic(err)
                }

                // massage the malformed csv line
                line = fixcsv(l)
            }

            c := get(line) // defined elsewhere
            cs = append(cs, c)
        }

        // Do other stuff...
    }
    
    Accepted as the best answer by the asker.