dow46218
2017-01-30 14:15
Views: 396
Accepted

Read a CSV with Golang, reorder the columns, then write the result to a new CSV using concurrency

Here's my starting point.

It is a Golang script to read in a csv with 3 columns, re-order the columns and write the result to a new csv file.

package main

import (
   "fmt"
   "encoding/csv"
   "io"
   "os"
   "math/rand"
   "time"
)

func main() {
  start_time := time.Now()

  // Loading csv file
  rFile, err := os.Open("data/small.csv") //3 columns
  if err != nil {
    fmt.Println("Error:", err)
    return
  }
  defer rFile.Close()

  // Creating csv reader
  reader := csv.NewReader(rFile)

  lines, err := reader.ReadAll()
  if err == io.EOF {
    fmt.Println("Error:", err)
    return
  }

  // Creating csv writer
  wFile, err := os.Create("data/result.csv")
  if err != nil {
    fmt.Println("Error:", err)
    return
  }
  defer wFile.Close()
  writer := csv.NewWriter(wFile)

  // Read data, randomize columns and write new lines to results.csv
  rand.Seed(int64(time.Now().Nanosecond()))
  var col_index []int
  for i, line := range lines {
    if i == 0 {
      //randomize column index based on the number of columns recorded in the 1st line
      col_index = rand.Perm(len(line))
    }
    writer.Write([]string{line[col_index[0]], line[col_index[1]], line[col_index[2]]}) //3 columns
    writer.Flush()
  }

  //print report
  fmt.Println("No. of lines: ", len(lines))
  fmt.Println("Time taken: ", time.Since(start_time))

}

Question:

  1. Is my code idiomatic for Golang?

  2. How can I add concurrency to this code?


2 answers

  • dps123456789 2017-01-30 19:06
    Accepted

    Your code is OK. There is not much of a case for concurrency here. But you can at least reduce memory consumption by reordering on the fly: use Read() instead of ReadAll() to avoid allocating a slice for the whole input file.

    for line, err := reader.Read(); err == nil; line, err = reader.Read(){
        if err = writer.Write([]string{line[col_index[0]], line[col_index[1]], line[col_index[2]]}); err != nil {
                fmt.Println("Error:", err)
                break
        }
        writer.Flush()
    }
    
  • dongque1462 2017-02-01 15:16

    Move the col_index initialisation outside the write loop:

    if len(lines) > 0 {
        // Randomize the column order based on the number of columns in the first line.
        col_index := rand.Perm(len(lines[0]))
        newLine := make([]string, len(col_index))

        // Range over every line, including the first, so that no row is dropped.
        for _, line := range lines {
            for from, to := range col_index {
                newLine[to] = line[from]
            }
            writer.Write(newLine)
        }
        // Flush once after the loop rather than after every row.
        writer.Flush()
    }
    

    To use concurrency, you must not use reader.ReadAll. Instead, start a goroutine that calls reader.Read and sends each record on a channel, which replaces the lines slice. The main goroutine then receives from the channel, reorders the columns, and writes the output.
