dougao9864 2015-08-06 14:04

How to read a file character by character in Go

I have some large JSON files I want to parse, and I want to avoid loading all of the data into memory at once. I'd like a function or loop that returns the characters one at a time.

I found this example for iterating over words in a string, and the ScanRunes function in the bufio package looks like it could return a character at a time. I also had the ReadRune function from bufio mostly working, but that felt like a pretty heavy approach.

EDIT

I compared 3 approaches. All used a loop to pull content from either a bufio.Reader or a bufio.Scanner.

  1. Read runes in a loop using .ReadRune on a bufio.Reader. Checked for errors from the call to .ReadRune.
  2. Read bytes from a bufio.Scanner after calling .Split(bufio.ScanRunes) on the scanner. Called .Scan and .Bytes on each iteration, checking the result of each .Scan call.
  3. Same as #2, but read text from the bufio.Scanner using .Text instead of bytes. Instead of converting a slice of runes with string([]rune), I joined a slice of strings with strings.Join([]string, "") to form the final blobs of text.

The timing for 10 runs of each on a 23 MB JSON file was:

  1. 0.65 s
  2. 2.40 s
  3. 0.97 s

So it looks like ReadRune is not too bad after all. It also makes for shorter, less verbose code, because each rune is fetched in one call (.ReadRune) instead of two (.Scan and .Bytes).

3 answers

  • dongtun1683 2015-08-06 15:46

    Just read each rune one by one in the loop... See example

    EDIT: Adding code for posterity, in case the link ever dies:

    package main
    
    import (
        "bufio"
        "fmt"
        "io"
        "log"
        "strings"
    )
    
    var text = `
    The quick brown fox jumps over the lazy dog #1.
    Быстрая коричневая лиса перепрыгнула через ленивую собаку.
    `
    
    func main() {
        r := bufio.NewReader(strings.NewReader(text))
        for {
            if c, sz, err := r.ReadRune(); err != nil {
                if err == io.EOF {
                    break
                } else {
                    log.Fatal(err)
                }
            } else {
                fmt.Printf("%q [%d]\n", string(c), sz)
            }
        }
    }
    
    This answer was selected as the best answer by the asker.