doufuxi7093
2018-05-20 16:06

How to read from os.Stdin repeatedly without sharing a bufio.Scanner

In Go, can a single line of input be read from stdin in a simple way, which also meets the following requirements?

  • can be called by disparate parts of a larger interactive application without having to create coupling between these different parts of the application (e.g. by passing a global bufio.Scanner between them)
  • works whether users are running an interactive terminal or using pre-scripted input

I'd like to modify an existing large Go application which currently creates a bufio.Scanner instance every time it asks users for a line of input. Multiple instances work fine when standard input is from a terminal, but when standard input is piped from another process, calls to Scan only succeed on the first instance of bufio.Scanner. Calls from all other instances fail.

Here's some toy code that demonstrates the problem:

package main
import (
    "bufio"
    "fmt"
    "os"
)

func main() {
    // read with 1st scanner -> works for both piped stdin and terminal
    scanner1 := readStdinLine(1)
    // read with 2nd scanner -> fails for piped stdin, works for terminal
    readStdinLine(2)
    // read with 1st scanner -> prints line 2 for piped stdin, line 3 for terminal
    readLine(scanner1, 3)
}

func readStdinLine(lineNum int64) (scanner *bufio.Scanner) {
    scanner = readLine(bufio.NewScanner(os.Stdin), lineNum)
    return
}

func readLine(scannerIn *bufio.Scanner, lineNum int64) (scanner *bufio.Scanner) {
    scanner = scannerIn
    scanned := scanner.Scan()
    fmt.Printf("%d: ", lineNum)
    if scanned {
        fmt.Printf("Text=%s\n", scanner.Text())
        return
    }
    if scanErr := scanner.Err(); scanErr != nil {
        fmt.Printf("Error=%s\n", scanErr)
        return
    }
    fmt.Println("EOF")
    return
}

I build this as print_stdin and run it interactively from a bash shell:

~$ ./print_stdin
ab
1: Text=ab
cd
2: Text=cd
ef
3: Text=ef

But if I pipe in the text, the second bufio.Scanner fails:

~$ echo "ab
> cd
> ef" | ./print_stdin
1: Text=ab
2: EOF
3: Text=cd

2 answers

  • doujun5009 2018-05-22 18:01
    Accepted answer

    The suggestion in the comment by ThunderCat works.

    The alternative to a buffered read is to read one byte at a time: read single bytes until \n or some other terminator is found, and return the data up to that point.

    Here's my implementation, heavily inspired by Scanner.Scan:

    package lineio
    import (
        "errors"
        "io"
    )
    
    const startBufSize = 4 * 1024
    const maxBufSize = 64 * 1024
    const maxConsecutiveEmptyReads = 100
    
    var ErrTooLong = errors.New("lineio: line too long")
    
    func ReadLine(r io.Reader) (string, error) {
        lb := &lineBuf{r: r, buf: make([]byte, startBufSize)}
        for {
            lb.ReadByte()
            if lb.err != nil || lb.TrimCrlf() {
                return lb.GetResult()
            }
        }
    }
    
    type lineBuf struct {
        r       io.Reader
        buf     []byte
        end     int
        err     error
    }
    
    func (lb *lineBuf) ReadByte() {
        if lb.EnsureBufSpace(); lb.err != nil {
            return
        }
        for empties := 0; ; {
            n := 0
            if n, lb.err = lb.r.Read(lb.buf[lb.end:lb.end+1]); lb.err != nil {
                return
            }
            if n > 0 {
                lb.end++
                return
            }
            empties++
            if empties > maxConsecutiveEmptyReads {
                lb.err = io.ErrNoProgress
                return
            }
        }
    }
    
    func (lb *lineBuf) TrimCrlf() bool {
        if !lb.EndsLf() {
            return false
        }
        lb.end--
        if lb.end > 0 && lb.buf[lb.end-1] == '\r' {
            lb.end--
        }
        return true
    }
    
    func (lb *lineBuf) GetResult() (string, error) {
        if lb.err != nil && lb.err != io.EOF {
            return "", lb.err
        }
        return string(lb.buf[0:lb.end]), nil
    }
    
    func (lb *lineBuf) EndsLf() bool {
        return lb.err == nil && lb.end > 0 && (lb.buf[lb.end-1] == '\n')
    }
    
    func (lb *lineBuf) EnsureBufSpace() {
        if lb.end < len(lb.buf) {
            return
        }
        newSize := len(lb.buf) * 2
        if newSize > maxBufSize {
            lb.err = ErrTooLong
            return
        }
        newBuf := make([]byte, newSize)
        copy(newBuf, lb.buf[0:lb.end])
        lb.buf = newBuf
        return
    }
    

    TESTING

    Compiled lineio with go install and main (see below) with go build -o read_each_byte.

    Tested scripted input:

    $ seq 12 22 78 | ./read_each_byte
    1: Text: "12"
    2: Text: "34"
    3: Text: "56"
    

    Tested input from an interactive terminal:

    $ ./read_each_byte
    abc
    1: Text: "abc"
    123
    2: Text: "123"
    x\y"z
    3: Text: "x\\y\"z"
    

    Here's main:

    package main
    import (
        "fmt"
        "lineio"
        "os"
    )
    
    func main() {
        for i := 1; i <= 3; i++ {
            text, _ := lineio.ReadLine(os.Stdin)
            fmt.Printf("%d: Text: %q\n", i, text)
        }
    }
    
  • dongshenghe1833 2018-05-20 18:45

    Your sequence is:

    1. create a scanner
    2. wait for a line from the terminal
    3. print the result
    4. repeat steps 1 to 3 (creating a new scanner on stdin)
    5. repeat steps 2 to 3
    6. exit the program

    When you run echo in a pipeline, there is only a single stdin/stdout stream being written and read, but you are trying to use two.

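    UPDATE: The flow of execution for echo is:

    1. read the arguments
    2. process the arguments
    3. write the arguments to stdout
    4. the terminal reads stdout and prints it

    Note that this happens when you press the ENTER key: the whole argument is passed to the echo program at once, not line by line. As a result, all three lines reach the pipe together and the first scanner reads them into its buffer in one go.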

    The echo utility writes its arguments to standard output, followed by a <newline>. If there are no arguments, only the <newline> is written.

    More here: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html.

    See in the source code how echo works:

    while (argc > 0) 
    {
      fputs (argv[0], stdout);//<-- send args to the same stdout
      argc--;
      argv++;
      if (argc > 0)
        putchar (' ');
    }
    

    So your code will work fine with this:

    $ (n=1; while sleep 1; do echo a$n; n=$((n+1)); done) | ./print_stdin 
    $ 1: Text=a1
    $ 2: Text=a2
    $ 3: Text=a3
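
Why the timing matters can be sketched in Go as well. This is a minimal illustration, assuming io.Pipe as a stand-in for the shell pipe and a hypothetical helper freshScanLines that creates a new bufio.Scanner for every line, as print_stdin does. It compares one large write (like echo) with one write per line (like the sleep loop).

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// freshScanLines sends each element of writes through an in-memory pipe as
// a separate Write call, then reads n lines from the other end, creating a
// new bufio.Scanner for every line. A failed Scan is recorded as "EOF".
// Hypothetical helper, for illustration only.
func freshScanLines(writes []string, n int) []string {
	pr, pw := io.Pipe()
	defer pr.Close() // unblocks the writer if we stop reading early

	go func() {
		defer pw.Close()
		for _, w := range writes {
			if _, err := pw.Write([]byte(w)); err != nil {
				return
			}
		}
	}()

	var got []string
	for i := 0; i < n; i++ {
		s := bufio.NewScanner(pr)
		if s.Scan() {
			got = append(got, s.Text())
		} else {
			got = append(got, "EOF")
		}
	}
	return got
}

func main() {
	// One big write, like echo: the first scanner buffers all three lines.
	fmt.Println(freshScanLines([]string{"ab\ncd\nef\n"}, 3))

	// One write per line, like the sleep loop: each scanner sees one line.
	fmt.Println(freshScanLines([]string{"a1\n", "a2\n", "a3\n"}, 3))
}
```

io.Pipe is unbuffered, so each read sees at most one write and the result is deterministic. A real shell pipe has a kernel buffer, so there the separation depends on the writer pausing between lines, which is what the sleep 1 provides.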
    

    If you need the arguments repeated on stdout, use the yes program or an alternative. The yes program repeatedly writes its arguments to stdout. More at: https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/yes.c

    Example:

    $ yes a | ./print_stdin 
    $ 1: Text=a
    $ 2: Text=a
    $ 3: Text=a
    