dongyue0263 2015-04-04 02:13

Reading a very large file in Golang

I'm currently trying to read in a file with 200+ columns and 1000+ rows. I use the following code:

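package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	var result []string

	file, err := os.Open("t8.txt")
	if err != nil {
		fmt.Println(err)
		return // nothing to scan if the file could not be opened
	}
	defer file.Close()

	// Scan the file line by line with the default Scanner buffer.
	scan := bufio.NewScanner(file)
	for scan.Scan() {
		result = append(result, scan.Text())
	}

	fmt.Println(scan.Err()) // token too long
}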

However, when I print out the results, all I get is the first line, because the scanner reports that the token is too long. When I try it on smaller files, it works fine. Is there a way to scan large files in Go?


1 answer

  • dongwen3410 2015-04-04 07:53

    As @Dave C already pointed out in the comments, you are running into bufio.MaxScanTokenSize = 64 * 1024, the default limit on the size of a single token (here, one line) that a Scanner will buffer.

    To get around that limitation, use bufio.Reader, whose ReadString(delim byte) method seems appropriate for your case.
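A minimal sketch of the ReadString approach, not a drop-in replacement for the code in the question. The names readLines and sampleInput are hypothetical, and the in-memory input is only an assumed stand-in for t8.txt whose first line is longer than 64 KiB.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines (hypothetical helper) collects every line from r using
// bufio.Reader.ReadString, which grows its result as needed instead of
// failing on lines longer than bufio.MaxScanTokenSize.
func readLines(r io.Reader) ([]string, error) {
	var lines []string
	reader := bufio.NewReader(r)
	for {
		line, err := reader.ReadString('\n')
		if len(line) > 0 {
			// ReadString keeps the delimiter, so strip the line ending.
			lines = append(lines, strings.TrimRight(line, "\r\n"))
		}
		if err == io.EOF {
			return lines, nil // end of input is not an error here
		}
		if err != nil {
			return lines, err
		}
	}
}

// sampleInput (hypothetical) stands in for t8.txt: one line of the given
// width followed by a short second line.
func sampleInput(width int) io.Reader {
	return strings.NewReader(strings.Repeat("x", width) + "\nsecond line\n")
}

func main() {
	lines, err := readLines(sampleInput(100000))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(lines), len(lines[0]), lines[1])
}
```

For a real file, the value returned by os.Open can be passed to readLines directly, since *os.File satisfies io.Reader.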

    From the Go documentation for Scanner (see the last sentence in particular):

    Scanner provides a convenient interface for reading data such as a file of newline-delimited lines of text. Successive calls to the Scan method will step through the 'tokens' of a file, skipping the bytes between the tokens. The specification of a token is defined by a split function of type SplitFunc; the default split function breaks the input into lines with line termination stripped. Split functions are defined in this package for scanning a file into lines, bytes, UTF-8-encoded runes, and space-delimited words. The client may instead provide a custom split function.

    Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.
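The stopping behavior described in the quoted documentation can be reproduced with a short sketch. It also shows an alternative that is not part of the answer above: Go versions from 1.6 onward provide Scanner.Buffer, which raises the token limit. The names scanLines and sampleInput and the 1 MiB limit are assumptions chosen for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// scanLines (hypothetical helper) scans r line by line. If maxToken > 0 the
// Scanner may grow its buffer up to maxToken bytes; otherwise the default
// limit of bufio.MaxScanTokenSize applies.
func scanLines(r io.Reader, maxToken int) ([]string, error) {
	var lines []string
	scanner := bufio.NewScanner(r)
	if maxToken > 0 {
		// Start with a 64 KiB buffer and allow growth up to maxToken.
		scanner.Buffer(make([]byte, 0, 64*1024), maxToken)
	}
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	// Err reports bufio.ErrTooLong if a line exceeded the limit.
	return lines, scanner.Err()
}

// sampleInput (hypothetical) builds a short header line, one line of the
// given width, and a short last line.
func sampleInput(width int) io.Reader {
	return strings.NewReader("header\n" + strings.Repeat("x", width) + "\nlast line\n")
}

func main() {
	// Default limit: scanning stops at the oversized second line.
	lines, err := scanLines(sampleInput(100000), 0)
	fmt.Println(len(lines), err)

	// Raised limit (1 MiB, an arbitrary choice): all lines are read.
	lines, err = scanLines(sampleInput(100000), 1<<20)
	fmt.Println(len(lines), err)
}
```

With the default limit, only the header line is returned before the error, which matches the symptom in the question. Raising the limit still requires knowing an upper bound on line length, so bufio.Reader remains the more general choice when that bound is unknown.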
