doumie6223 2017-05-16 18:55
Views: 185
Accepted

Golang: While processing CSV, can a single line be reformatted?

My Go CSV processing routine is copied almost verbatim from the encoding/csv package example:

func processCSV(path string) {
    file := utils.OpenFile(path)
    reader := csv.NewReader(file)
    reader.LazyQuotes = true

    cs := []*Collision{} // defined elsewhere

    for {
        line, err := reader.Read()

        // Stop processing at EOF.
        if err == io.EOF {
            break
        }

        c := get(line) // defined elsewhere
        cs = append(cs, c)
    }

    // Do other stuff...
}

The code works great until it encounters a malformed (?) line of CSV, which generally looks something like this:

item1,item2,"item3,"has odd quoting"","item4",item5

Setting reader.LazyQuotes = true doesn't seem to offer enough tolerance to read this line the way I need it.

My question is this: can I ask the csv reader for the original line so that I can "massage" it to pull out what I need? The files I'm working with are moderately large (~150 MB) and I'm not sure I want to re-do them, especially as only a few lines per file have such problems.

Thanks for any tips!

3 answers

  • douliexing2195 2017-05-16 19:36

    As far as I can tell, encoding/csv doesn't provide any such functionality, so you can either look for a third-party CSV package that does, or implement a solution yourself.

    If you want to go the DIY route, I can offer you a tip; whether it's a good tip that you should implement is up to you.

    You could implement an io.Reader that wraps your file and tracks the last line read. Then, every time you encounter an error because of malformed CSV, you can use your reader to reread that line, massage it, add it to the results, and have the loop continue as if nothing happened.

    Here's an example of how your processCSV would change:

    func processCSV(path string) {
        file := utils.OpenFile(path)
        myreader := NewMyReader(file)
        reader := csv.NewReader(myreader)
        reader.LazyQuotes = true

        cs := []*Collision{} // defined elsewhere

        for {
            line, err := reader.Read()

            // Stop processing at EOF.
            if err == io.EOF {
                break
            }

            // Malformed CSV.
            if err != nil {
                // Just reread the last line; on the next iteration of
                // the loop, myreader.Read should continue returning bytes
                // that come after this malformed line to the csv.Reader.
                l, err := myreader.CurrentLine()
                if err != nil {
                    panic(err)
                }

                // Massage the malformed CSV line.
                line = fixcsv(l)
            }

            c := get(line) // defined elsewhere
            cs = append(cs, c)
        }

        // Do other stuff...
    }
    
    Selected by the asker as the best answer.