doumie6223 2017-05-16 18:55

Golang: Reformat an individual line when processing CSV?

My Go CSV processing routine is copied almost exactly from the encoding/csv package example:

func processCSV(path string) {
    file := utils.OpenFile(path)
    reader := csv.NewReader(file)
    reader.LazyQuotes = true

    cs := []*Collision{} // defined elsewhere

    for {
        line, err := reader.Read()

        // Kill processing if we're at EOF
        if err == io.EOF {
            break
        }

        c := get(line) // defined elsewhere
        cs = append(cs, c)
    }

    // Do other stuff...
}

The code works great until it encounters a malformed (?) line of CSV, which generally looks something like this:

item1,item2,"item3,"has odd quoting"","item4",item5

The reader.LazyQuotes = true option doesn't seem to offer enough tolerance to read this line the way I need it.

My question is this: can I ask the CSV reader for the original line so that I can "massage" it to pull out what I need? The files I'm working with are moderately large (~150 MB) and I'm not sure I want to redo them, especially as only a few lines per file have such problems.

Thanks for any tips!

3 answers

  • douliexing2195 2017-05-16 19:36

    As far as I can tell, encoding/csv doesn't provide any such functionality, so you can either look for a third-party CSV package that does, or implement a solution yourself.

    If you want to go the DIY route, I can offer a tip; whether it's a good one that you should implement is up to you.

    You could implement an io.Reader that wraps your file and tracks the last line read. Then, every time you encounter an error caused by malformed CSV, you can use your reader to reread that line, massage it, add it to the results, and let the loop continue as if nothing happened.

    Here's an example of how your processCSV would change:

    func processCSV(path string) {
        file := utils.OpenFile(path)
        // NewMyReader is the line-tracking io.Reader wrapper described
        // above; it is not shown here.
        myreader := NewMyReader(file)
        reader := csv.NewReader(myreader)
        reader.LazyQuotes = true

        cs := []*Collision{} // defined elsewhere

        for {
            line, err := reader.Read()

            // Kill processing if we're at EOF
            if err == io.EOF {
                break
            }

            // malformed csv
            if err != nil {
                // Just reread the last line; on the next iteration of
                // the loop, myreader.Read should continue returning bytes
                // that come after this malformed line to the csv.Reader.
                l, err := myreader.CurrentLine()
                if err != nil {
                    panic(err)
                }

                // massage the malformed csv line (fixcsv is yours to write)
                line = fixcsv(l)
            }

            c := get(line) // defined elsewhere
            cs = append(cs, c)
        }

        // Do other stuff...
    }
    
    This answer was selected by the asker as the best answer.