douwu5428 2015-07-28 13:17
Accepted

RegEx named capture groups in PHP

I have the following regex to capture a list of numbers (it will be more complex than this eventually):

$list = '10,9,8,7,6,5,4,3,2,1';

$regex = 
<<<REGEX
    /(?x)
    (?(DEFINE)
        (?<number> (\d+) )
        (?<list> (?&number)(,(?&number))* )
    )
    ^(?&list)/
REGEX;

$matches = array();
if (preg_match($regex,$list,$matches)==1) {
    print_r($matches);
}

Which outputs:

Array ( [0] => 10,9,8,7,6,5,4,3,2,1 ) 

How do I capture the individual numbers in the list in the $matches array? I don't seem to be able to do it, despite putting a capturing group around the digits (\d+).

EDIT

Just to make it clearer, I want to eventually use recursion, so explode is not ideal:

$match = 
<<<REGEX
    /(?x)
    (?(DEFINE)
        (?<number> (\d+) )
        (?<member> (?&number)|(?&list) )
        (?<list> \( ((?&number)|(?&member))(,(?&member))* \) ) 
    )
    ^(?&list)/
REGEX;

3 answers

  • doufu1939 2015-07-28 14:02

    The purpose of a (?(DEFINE)...) section is only to define named sub-patterns that you can use later, either in the DEFINE section itself or in the main pattern. Because these sub-patterns are never matched as part of the main pattern, they don't capture anything, and a reference such as (?&number) is only a kind of alias for the sub-pattern \d+; it doesn't capture anything either.

    Example with the string: 1abcde2

    If I use this pattern: /^(?<num>\d).....(?&num)$/, only 1 is captured in the group num; (?&num) doesn't capture anything, it is only an alias for \d.
    /^(?<num>\d).....\d$/ produces exactly the same result.
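
    To check this claim yourself, a minimal sketch like the following compares the two patterns on the same string. The variable names $withCall and $withLiteral are only illustrative.

```php
<?php
$subject = '1abcde2';

// Named group followed by a subroutine call to that group.
preg_match('/^(?<num>\d).....(?&num)$/', $subject, $withCall);

// Same pattern with the subroutine call replaced by a plain \d.
preg_match('/^(?<num>\d).....\d$/', $subject, $withLiteral);

// The group "num" holds the first digit only; the digit matched
// through (?&num) is not stored anywhere.
print_r($withCall);

// Both patterns fill the matches array identically.
var_dump($withCall === $withLiteral); // bool(true)
```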

    Another point to clarify: with PCRE (the PHP regex engine), a capture group (named or not) can only store one value, even if you repeat it. After a repeated group has matched several times, only the last value is kept.
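
    A small sketch of this behaviour, using a simplified flat-list pattern that is not the one from the question:

```php
<?php
$list = '10,9,8';

// Group 1 matches the first number once; group 2 sits inside a
// repeated non-capturing group and matches "9", then "8".
preg_match('/^(\d+)(?:,(\d+))*$/', $list, $m);

// Group 2 only keeps the value from its last iteration, so "9" is lost.
print_r($m); // [0] => 10,9,8  [1] => 10  [2] => 8
```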

    The main problem of your approach is that you are trying to do two things at the same time:

    1. you want to check the format of the string.
    2. you want to extract an unknown number of items.

    Doing this is only possible in particular situations, but impossible in general.

    For example, with a flat list like: $list = '10,9,8,7,6,5,4,3,2,1'; where there are no nested elements, you can use a function like preg_match_all to reuse the same pattern several times in this way:

    if (preg_match_all('~\G(\d+)(,|$)~', $list, $matches) && !end($matches[2])) {
        // \G ensures that results are contiguous
        // you have all the items in $matches[1] 
        // if the last item of $matches[2] is empty, this means
        // that the end of the string is reached and the string
        // format is correct
        echo '<°)))))))>';
    }
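
    To see how the check reacts to malformed input, the same pattern can be wrapped in a small helper. This is only a sketch: the function name parseFlatList is hypothetical, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the items of a flat list,
// or null when the string is not a well-formed list.
function parseFlatList(string $list): ?array
{
    $count = preg_match_all('~\G(\d+)(,|$)~', $list, $matches);

    // The last separator must be the empty end-of-string match;
    // a trailing "," means the scan stopped before the end.
    if ($count && end($matches[2]) === '') {
        return $matches[1];
    }
    return null;
}

var_dump(parseFlatList('10,9,8'));  // the three items, as strings
var_dump(parseFlatList('10,9,,8')); // NULL: \G stops at the empty item
var_dump(parseFlatList('10,9,'));   // NULL: trailing comma
```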
    

    Now if you have a nested list like $list = '10,9,(8,(7,6),5),4,(3,2),1'; and you want for example to check the format and to produce a tree structure like:

    [ 10, 9, [ 8, [ 7, 6 ], 5 ], 4 , [ 3, 2 ], 1 ]
    

    You can't do it in a single pass. You need one pattern to check the format of the whole string and another pattern to extract the elements (plus a recursive function to apply it).
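
    One possible way to sketch this two-pass idea is shown below. It is an illustration under assumptions, not a definitive implementation: the constants LIST_FORMAT and NEXT_ITEM and the functions buildTree and parseNestedList are hypothetical names, the top-level list is assumed to be unparenthesised as in the example string, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Pass 1: a recursive pattern that only validates the format.
const LIST_FORMAT = '~
    (?(DEFINE)
        (?<item> \d+ | \( (?&list) \) )
        (?<list> (?&item) (?: , (?&item) )* )
    )
    \A (?&list) \z
~x';

// Pass 2: matches one item of the current nesting level at a time.
// "inner" holds the content of a parenthesised sub-list, including
// any deeper balanced parentheses.
const NEXT_ITEM = '~
    \G
    (?: (?<number> \d+ )
      | \( (?<inner> (?: [^()]++ | \( (?&inner) \) )* ) \)
    )
    (?: , | \z )
~x';

// Assumes $list has already been validated with LIST_FORMAT.
function buildTree(string $list): array
{
    preg_match_all(NEXT_ITEM, $list, $items, PREG_SET_ORDER);

    $tree = [];
    foreach ($items as $item) {
        $inner = $item['inner'] ?? '';
        // A sub-list is parsed recursively; a number becomes an int.
        $tree[] = $inner !== '' ? buildTree($inner) : (int) $item['number'];
    }
    return $tree;
}

function parseNestedList(string $list): ?array
{
    return preg_match(LIST_FORMAT, $list) === 1 ? buildTree($list) : null;
}

echo json_encode(parseNestedList('10,9,(8,(7,6),5),4,(3,2),1')), "\n";
var_dump(parseNestedList('10,(9,8')); // NULL: unbalanced parenthesis
```

    Validating first keeps the extraction pattern simple: buildTree can trust that parentheses are balanced and that no sub-list is empty.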

    <<<FORGET_THIS_IMMEDIATELY

    As an aside you can do it with eval and strtr, but it's a very dirty and dangerous way:

    eval('$result=[' . strtr($list, '()', '[]') . '];');
    

    FORGET_THIS_IMMEDIATELY;

    This answer was selected by the asker as the best answer.