dongxi1943 2011-12-05 13:03

PHP preg_match_all gives an offset of -1 in addition to the correct matches

This appears to be strange behavior, or perhaps I don't understand regular expressions so well...

I'm using this to find all the xref and trailer objects in a PDF file:

preg_match_all('@(
xref?
)|(\strailer\s)@',$pdfcontent,$matches,PREG_OFFSET_CAPTURE);

print_r gives me this:

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] =>
xref
                    [1] => 13235519
                )

            [1] => Array
                (
                    [0] =>
trailer
                    [1] => 13299371
                )
        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] =>
xref
                    [1] => 13235519
                )

            [1] => Array
                (
                    [0] =>
                    [1] => -1
                )
        )

    [2] => Array
        (
            [0] =>
            [1] => Array
                (
                    [0] =>
trailer
                    [1] => 13299371
                )
        )
)

Why is there a position of -1 for xref?


  • dongshi9407 2011-12-05 13:42

    It seems this is the normal behaviour, mostly undocumented though. The -1 offset is also used for absent matches.

    To answer your title: the -1 offset is returned instead of a match, not in addition to one. Your pattern is an alternation of two capture groups, (a)|(b), so on any single match only one of the groups participates. It can therefore very well return an offset and match for the xref group, but a non-match for the trailer group.

    This is not mentioned explicitly in the PHP manual page. But PCRE documents it cursorily with:

    [...] When this happens, both values in the offset pairs corresponding to unused subpatterns are set to -1.

    You can reproduce it with a simpler example:

    preg_match_all('/(a)|(b)|(c)/', "abc", $m, PREG_OFFSET_CAPTURE)
    and print_r($m);
    

    The behaviour is a bit confusing. It seems -1 is used as the offset for groups that come before the one that actually matched, while groups that come after it get no offset pair at all in the result array. For this example, the offsets for the three groups come out as [0,-1,-1], [undef,1,-1] and [undef,undef,2]. I would conclude it's some hazy behaviour in the PHP wrapper.
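
If you only care about the groups that really matched, one option is to filter the result yourself. The following is a minimal sketch, not an official recipe: `participating_matches` is a hypothetical helper name, and it assumes an unmatched group appears either as a pair with offset -1 or as a bare empty string (which of the two you see for trailing groups may depend on the PHP version).

```php
<?php
// Hypothetical helper (not part of PHP): keep only the entries of one
// capture group that actually took part in a match.
function participating_matches(array $groupMatches): array
{
    $found = [];
    foreach ($groupMatches as $entry) {
        // With PREG_OFFSET_CAPTURE a real match is a [text, offset] pair
        // with offset >= 0. Unmatched groups show up as ["", -1] or,
        // for trailing groups, possibly as a plain empty string.
        if (is_array($entry) && $entry[1] >= 0) {
            $found[] = $entry;
        }
    }
    return $found;
}

preg_match_all('/(a)|(b)|(c)/', 'abc', $m, PREG_OFFSET_CAPTURE);

foreach ([1, 2, 3] as $group) {
    echo $group, ': ', json_encode(participating_matches($m[$group])), "\n";
}
// prints:
// 1: [["a",0]]
// 2: [["b",1]]
// 3: [["c",2]]
```

Applied to the pattern in the question, `participating_matches($matches[1])` would leave only the xref hits and `participating_matches($matches[2])` only the trailer hits, each with its real offset.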

    This answer was selected as the best answer by the asker.