dso407787736 2013-04-05 18:44

How can I determine how much whitespace fmt.Fscanf consumed?

I am trying to implement a PPM decoder in Go. PPM is an image format that consists of a plaintext header and then some binary image data. The header looks like this (from the spec):

Each PPM image consists of the following:

  1. A "magic number" for identifying the file type. A ppm image's magic number is the two characters "P6".
  2. Whitespace (blanks, TABs, CRs, LFs).
  3. A width, formatted as ASCII characters in decimal.
  4. Whitespace.
  5. A height, again in ASCII decimal.
  6. Whitespace.
  7. The maximum color value (Maxval), again in ASCII decimal. Must be less than 65536 and more than zero.
  8. A single whitespace character (usually a newline).

I am trying to decode this header with fmt.Fscanf. The following call parses the header, ignoring the caveat explained below:

var magic string
var width, height, maxVal uint

fmt.Fscanf(input, "%2s %d %d %d", &magic, &width, &height, &maxVal)

The documentation of fmt states:

Note: Fscan etc. can read one character (rune) past the input they return, which means that a loop calling a scan routine may skip some of the input. This is usually a problem only when there is no space between input values. If the reader provided to Fscan implements ReadRune, that method will be used to read characters. If the reader also implements UnreadRune, that method will be used to save the character and successive calls will not lose data. To attach ReadRune and UnreadRune methods to a reader without that capability, use bufio.NewReader.

Since the very next byte after the final whitespace is already the beginning of the image data, I have to know exactly how much whitespace fmt.Fscanf consumed after reading Maxval. My code must work with whatever reader the caller provides, and parts of it must not read past the end of the header. Wrapping the input in a buffered reader is therefore not an option, because the buffered reader might read more from the input than I actually want to consume.

Some testing suggests that parsing a dummy character at the end solves the issue:

var magic string
var width, height, maxVal uint
var dummy byte

fmt.Fscanf(input, "%2s %d %d %d%c", &magic, &width, &height, &maxVal, &dummy)

Is that guaranteed to work according to the specification?


1 answer

  • doujian7132 2013-04-05 19:46

    No, I would not consider that safe. While it works now, the documentation states that the function reserves the right to read one character past the value unless the reader has an UnreadRune() method.

    By wrapping your reader in a bufio.Reader, you can ensure the reader has an UnreadRune() method. You will then need to read the final whitespace yourself.

    buf := bufio.NewReader(input)
    fmt.Fscanf(buf, "%2s %d %d %d", &magic, &width, &height, &maxVal)
    buf.ReadRune() // remove next rune (the whitespace) from the buffer.
    


    Edit:

    As we discussed in the chat, you can assume the dummy char method works and then write a test so you know when it stops working. The test can be something like:

    func TestFmtBehavior(t *testing.T) {
        // use multireader to prevent r from implementing io.RuneScanner
        r := io.MultiReader(bytes.NewReader([]byte("data  ")))
    
        n, err := fmt.Fscanf(r, "%s%c", new(string), new(byte))
        if n != 2 || err != nil {
            t.Error("failed scan", n, err)
        }
    
        // the dummy char read 1 extra char past "data".
        // one byte should still remain
        if n, err := r.Read(make([]byte, 5)); n != 1 {
            t.Error("assertion failed", n, err)
        }
    }
    
    This answer was selected by the asker as the best answer.
