dongyue0263 2015-04-04 02:13
浏览 477

在Golang中读取非常大的文件

I'm currently trying to read in a file with 200+ columns and 1000+ rows. I use the following code:

var result []string

file, err := os.Open("t8.txt")
if (err != nil) {
  fmt.Println(err)
}
defer file.Close()
scan := bufio.NewScanner(file)
for scan.Scan() {
  result = append(result, scan.Text())

}


fmt.Println(scan.Err()) //token too long

However when I print out the results, all I get is the first line because it says the token is too long. When I try it on smaller files, it works fine. Is there a way in Golang that I could scan in large files?

  • 写回答

1条回答 默认 最新

  • dongwen3410 2015-04-04 07:53
    关注

    As already pointed out by @Dave C in the comments you are running into MaxScanTokenSize = 64 * 1024

    To get around that limitation, use bufio.Reader which has a ReadString(delim byte) method which seems appropriate for your case.

    From the Scanner go doc (specifically the last sentence):

    Scanner provides a convenient interface for reading data such as a file of newline-delimited lines of text. Successive calls to the Scan method will step through the 'tokens' of a file, skipping the bytes between the tokens. The specification of a token is defined by a split function of type SplitFunc; the default split function breaks the input into lines with line termination stripped. Split functions are defined in this package for scanning a file into lines, bytes, UTF-8-encoded runes, and space-delimited words. The client may instead provide a custom split function.

    Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.

    评论

报告相同问题?

悬赏问题

  • ¥30 python代码,帮调试
  • ¥15 #MATLAB仿真#车辆换道路径规划
  • ¥15 java 操作 elasticsearch 8.1 实现 索引的重建
  • ¥15 数据可视化Python
  • ¥15 要给毕业设计添加扫码登录的功能!!有偿
  • ¥15 kafka 分区副本增加会导致消息丢失或者不可用吗?
  • ¥15 微信公众号自制会员卡没有收款渠道啊
  • ¥100 Jenkins自动化部署—悬赏100元
  • ¥15 关于#python#的问题:求帮写python代码
  • ¥20 MATLAB画图图形出现上下震荡的线条