dongyao1915 2015-08-06 12:59

How to edit a Reader in Go

I'm trying to work out what the best practice is for changing some data in a stream without using ioutil.ReadAll.

I need to remove lines beginning with a certain character and strip all instances of another.

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "os"

    "gopkg.in/pg.v3"
)

func main() {
    fieldSep := "\x01"
    badChar := "\x02"
    comment := "#"
    dbName := "foo"
    db := pg.Connect(&pg.Options{})

    file, err := os.Open("/path/to/file")
    if err != nil {
        fmt.Fprintf(os.Stderr, "ERROR: %s\n", err)
    }
    defer file.Close()

    // I need to iterate over my file Reader here,
    // find all lines that begin with comment and remove them
    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        file := bytes.TrimRight(file, comment)
    }
    // all instances of badChar should be dropped
    file := bytes.Trim(file, badChar)

    _, err = db.CopyFrom(file, fmt.Sprintf("COPY %s FROM STDIN WITH DELIMITER e'%s'", dbName, fieldSep))
    if err != nil {
        fmt.Fprintf(os.Stderr, "ERROR: %s\n", err)
    }

    err = db.Close()
    if err != nil {
        fmt.Fprintf(os.Stderr, "ERROR: %s\n", err)
    }
    fmt.Println("Import Done")
}

Context:

I'm importing a large amount (>10GB) of data into a database; it's spread across several files.

My database interface accepts a reader to load the data.

The data has non-standard line endings and I need to strip comments (because PG's COPY FROM is no fun).

I know the code I've got to edit the stream is woeful; I just can't find a good reference. Thanks!


1 answer

  • dongqiuwei8667 2015-08-06 13:38

    If I were in your position, I'd write my own Reader and insert it between the source and the destination. That's what consistent interfaces are for. Your reader can work on the small chunks of data as they flow past, so the whole input never has to be read into memory at once.

    Source (io.Reader)   ==>  Your filter (io.Reader) ==>  Destination (expects an io.Reader)
    provides the data         does the transformations       rock'n'rolls
    

    One library example of a reader designed to sit between a source reader and its client is bufio.Reader. It can speed up many kinds of readers by issuing larger, buffered reads to the source while letting the client consume the data in small pieces if it prefers. You can check out its source: http://golang.org/src/bufio/bufio.go
