dqy27359 2017-11-22 11:31
Views: 325
Answer accepted

Functionality and performance of bufio.Reader vs. bufio.Scanner

I have seen several posts online that loosely discuss why one should use bufio.Scanner instead of bufio.Reader.

I don't know if my test case is relevant, but I decided to test one vs the other when it comes to reading 1,000,000 lines from a text file:

package main

import (
    "fmt"
    "strconv"
    "bufio"
    "time"
    "os"
    //"bytes"
)

func main() {

    fileName := "testfile.txt"

    // Create 1,000,000 integers as strings
    numItems := 1000000
    startInitStringArray := time.Now()

    var input [1000000]string
    //var input []string

    for i:=0; i < numItems; i++ {
        input[i] = strconv.Itoa(i)
        //input = append(input,strconv.Itoa(i))
    }

    elapsedInitStringArray := time.Since(startInitStringArray)
    fmt.Printf("Took %s to populate string array.\n", elapsedInitStringArray)

    // Write to a file
    fo, _ := os.Create(fileName)
    for i:=0; i < numItems; i++ {
        fo.WriteString(input[i] + "\n")
    }

    fo.Close()

    // Use reader
    openedFile, _ := os.Open(fileName)

    startReader := time.Now()
    reader := bufio.NewReader(openedFile)

    for i:=0; i < numItems; i++ {
        reader.ReadLine()
    }
    elapsedReader := time.Since(startReader)
    fmt.Printf("Took %s to read file using reader.\n", elapsedReader)
    openedFile.Close()

    // Use scanner
    openedFile, _ = os.Open(fileName)

    startScanner := time.Now()
    scanner := bufio.NewScanner(openedFile)

    for i:=0; i < numItems; i++ {
        scanner.Scan()
        scanner.Text()
    }

    elapsedScanner := time.Since(startScanner)
    fmt.Printf("Took %s to read file using scanner.\n", elapsedScanner)
    openedFile.Close()
}

Typical timings on my machine look like this:

Took 44.1165ms to populate string array.
Took 17.0465ms to read file using reader.
Took 23.0613ms to read file using scanner.

When is it better to use a reader rather than a scanner, and is the choice based on performance or on functionality?

1 answer

  • douping4436 2017-11-22 12:09

    It's a flawed benchmark. They are not doing the same thing.

    func (b *Reader) ReadLine() (line []byte, isPrefix bool, err error)
    

    returns []byte.

    func (s *Scanner) Text() string
    

    returns a string, that is, string([]byte): a conversion that copies the bytes into a new string.
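To make the difference between the two return types concrete, here is a minimal sketch. The helper name textIsCopy is hypothetical and the input is an in-memory string. It suggests that Text() hands back an independent copy of the token, while Bytes() hands back a view into the scanner's buffer, so Text() does extra work on every line.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// textIsCopy (hypothetical helper) scans the first line of data and reports
// whether the string from Text() is unaffected when the slice from Bytes()
// is modified. It assumes the line does not already start with '#'.
func textIsCopy(data string) bool {
	sc := bufio.NewScanner(strings.NewReader(data))
	if !sc.Scan() || len(sc.Bytes()) == 0 {
		return false
	}
	text := sc.Text()  // new string: the token bytes are copied
	view := sc.Bytes() // slice that shares the scanner's internal buffer
	view[0] = '#'      // demonstration only: overwrite the buffered byte
	return text[0] != '#' && sc.Bytes()[0] == '#'
}

func main() {
	fmt.Println(textIsCopy("hello\nworld\n")) // true
}
```

Writing into the slice returned by Bytes() is done here only to expose the sharing; ordinary code would treat that slice as read-only and valid until the next call to Scan.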

    For a like-for-like comparison, use

    func (s *Scanner) Bytes() []byte
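As a sketch of what a like-for-like comparison could look like, the two loops below do the same work: each one counts lines and line bytes from []byte values, with no string conversion on either side. The function names countReadLine and countScanner are hypothetical, and the data is an in-memory string rather than a file to keep the example self-contained.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countReadLine counts lines and line bytes (terminators excluded)
// using Reader.ReadLine, which returns []byte.
func countReadLine(data string) (lines, size int) {
	r := bufio.NewReader(strings.NewReader(data))
	for {
		line, isPrefix, err := r.ReadLine()
		if err != nil { // io.EOF here; a strings.Reader yields no other error
			return lines, size
		}
		size += len(line)
		if !isPrefix { // isPrefix marks a fragment of a line longer than the buffer
			lines++
		}
	}
}

// countScanner does the same work with Scanner.Bytes, so neither loop
// pays for a []byte-to-string conversion.
func countScanner(data string) (lines, size int) {
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		size += len(sc.Bytes())
		lines++
	}
	return lines, size
}

func main() {
	data := "0\n1\n22\n333\n"
	fmt.Println(countReadLine(data)) // 4 7
	fmt.Println(countScanner(data))  // 4 7
}
```

Timing these two functions over the same input, rather than ReadLine() against Text(), would compare the reading strategies themselves instead of the cost of building strings.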
    

    It's a flawed benchmark. It reads short strings, the integers from "0\n" to "999999\n". What real-world data set looks like that?

    In the real world we read Shakespeare: http://www.gutenberg.org/ebooks/100: Plain Text UTF-8: pg100.txt.

    Took 2.973307ms to read file using reader.   size: 5340315 lines: 124787
    Took 2.940388ms to read file using scanner.  size: 5340315 lines: 124787
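To put a number on how different the two data sets are, the sketch below works out the average line length of each. The helper names syntheticSize and avgLineLen are hypothetical, and the Shakespeare figures are the size and line count reported above, taking "size" to be a byte count.

```go
package main

import (
	"fmt"
	"strconv"
)

// syntheticSize returns the total bytes of a file holding the integers
// 0..n-1, one per line, each followed by a single '\n'.
func syntheticSize(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		total += len(strconv.Itoa(i)) + 1 // digits plus the newline
	}
	return total
}

// avgLineLen returns the mean number of bytes per line.
func avgLineLen(size, lines int) float64 {
	return float64(size) / float64(lines)
}

func main() {
	n := 1000000
	fmt.Printf("synthetic:   %.1f bytes/line\n", avgLineLen(syntheticSize(n), n))
	// Figures reported in the answer for pg100.txt (assumed to be bytes).
	fmt.Printf("shakespeare: %.1f bytes/line\n", avgLineLen(5340315, 124787))
}
```

Under those assumptions the synthetic file averages about 6.9 bytes per line, against roughly 42.8 for the Shakespeare text, so the original benchmark is dominated by per-line overhead rather than by the volume of data read.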
    