drbe16008 2011-02-09 04:16

PCRE regular expressions using named subpattern subroutines

I am experimenting with the named subpattern/'subroutine' regex features in PHP's PCRE and I'm hoping someone can explain the following strange output:

$re = "/
(?(DEFINE)
    (?<a> a )
)

^(?&a)$

/x";

var_dump(preg_match($re, 'a', $match)); // (int) 1 as expected
var_dump($match); // Array( [0] => 'a' ) <-- Why?

I can't understand why the named group "a" is not in the result (with the contents "a"). Changing preg_match to preg_match_all puts "a" and "1" in the match data but both contain only an empty string.

I really like the idea of writing regular expressions this way, as you can make them incredibly powerful whilst keeping them very maintainable (see this answer for a good example). However, if the subpatterns are not available in the match data, it's not much use really.

Am I missing something here or should I just mourn what could have been and move on?


1 answer

  • dongqin8652 2011-02-09 06:11

    It makes perfect sense that these subpatterns do not capture a group: their main purpose is to be used more than once, so you can't really capture them all. In addition, if the default were to capture every subpattern call, you would have no way to skip capturing where you don't want it, which would not be the best default behavior. The opposite is trivial: you can capture by adding another group around the (?&a) call.
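As a minimal sketch of that workaround, the pattern from the question can wrap the call in its own named group. The group name `result` is only an illustrative choice (it must differ from `a`, since PCRE rejects duplicate names by default):

```php
<?php
// Same DEFINE block as in the question; only the call site changes.
$re = '/
(?(DEFINE)
    (?<a> a )
)

^(?<result>(?&a))$   # extra capturing group around the subroutine call

/x';

$matched = preg_match($re, 'a', $match);

var_dump($matched);          // 1 when the subject matches
var_dump($match['result']);  // the text consumed by the (?&a) call
```

The defining group `a` itself still holds nothing useful after the match, so read the wrapper group instead.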
    I couldn't find a reference to this behavior on PCRE.org. The closest is the following passage, which is relevant because you don't match (?<a>...) directly (though you might expect an empty group):

    Any capturing parentheses that are set during the subroutine call revert to their previous values afterwards.

    It is clearer in the Perl manual (the relevant part is the final note):

    An example of how this might be used is as follows:

    /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT))
    (?(DEFINE)
    (?<NAME_PAT>....)
    (?<ADDRESS_PAT>....)
    )/x
    

    Note that capture buffers matched inside of recursion are not accessible after the recursion returns, so the extra layer of capturing buffers is necessary.
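The same idea carries over to PHP. The sketch below is a hypothetical example, not taken from either manual: a subpattern named `WORD` is defined once and called twice, and each call gets its own outer group (`key` and `value`, names chosen for illustration) so that both pieces of text end up in the match data.

```php
<?php
// WORD is defined once inside DEFINE and never matched directly.
$re = '/
(?(DEFINE)
    (?<WORD> [a-z]+ )
)

^ (?<key>(?&WORD)) = (?<value>(?&WORD)) $   # one outer group per call

/x';

$matched = preg_match($re, 'colour=red', $m);

// Each wrapper group holds the text matched by its own call to WORD.
echo $m['key'], ' ', $m['value'], "\n";
```

This also shows why capturing by default would be ambiguous: a single group `WORD` could not hold both `colour` and `red` at once.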

    This answer was accepted by the asker as the best answer.
