drcmg28484 2013-01-16 23:14

Why doesn't this PHP regular expression match all of the inputs?

Why does the following regex, $regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';, not match all of the input below?

I thought that (V|E)?\d{1,2}? ? would make the letter, the first one or two digits, and the first space optional.

INPUT

<?php

$sms = array(
    'test test test 11 111 111 test test test',
    'test test test 1 111 111 test test test',
    'test test test 111 111 test test test', // does not match
    'test test test test test test 11111111',
    'test test test 1111111 test test test',
    'test test test 111111 test test test', // does not match
    'test test test E11 111 111 test test test',
    'test test test V1 111 111 test test test',
    'test test test V111 111 test test test', // does not match
    'test test test V11111111 test test test',
    'test test test V1111111 test test test',
    'test test test E111111 test test test', // does not match
    'test test test V 11 111 111 test test test',
    'test test test V 1 111 111 test test test',
    'test test test E 111 111 test test test', // does not match
    'test test test V 11111111 test test test',
    'test test test V 1111111 test test test',
    'test test test V 111111 test test test', //does not match
    'test test test V11 111 111 test test test',
    'test test test V1 111 111 test test test',
    'test test test E111 111 test test test', //does not match
    'test test test V11111111 test test test',
    'V1111111 test test test  test test test',
    'test test test V111111 test test test', // does not match
);

$regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';
$noMatches = 0;
$index = 0;
foreach($sms as $v) {
    $match = preg_match($regex, $v, $matches);



    if($match) {
        //print_r($matches);
        //echo "$v match!\n";
        //$matches++;
    }
    else {
        echo "$index - $v does NOT match!\n";
        $noMatches++;
    }
    $index++;
}
$total = count($sms);
echo "\n\nTotal: $total\nNo Matches: $noMatches\n";

OUTPUT

$ php test-regex.php 
2 - test test test 111 111 test test test does NOT match!
5 - test test test 111111 test test test does NOT match!
8 - test test test V111 111 test test test does NOT match!
11 - test test test E111111 test test test does NOT match!
14 - test test test E 111 111 test test test does NOT match!
17 - test test test V 111111 test test test does NOT match!
20 - test test test E111 111 test test test does NOT match!
23 - test test test V111111 test test test does NOT match!


Total: 24
No Matches: 8

EDIT:

Using mario's suggestion, the regex is now $regex = '/\b(V|E)?\d{0,2} ?\d{3} ?\d{3}\b/i';. Why, in some cases, does this regex not capture the letter V or E?

$output = array(
    'test test test E11 111 111 test test test' => 'E11 111 111',
    'test test test V1 111 111 test test test' => 'V1 111 111',
    'test test test V111 111 test test test' => 'V111 111',
    'test test test V11111111 test test test' => 'V11111111',
    'test test test V1111111 test test test' => 'V1111111',
    'test test test E111111 test test test' => 'E111111',
    'test test test V 11 111 111 test test test' => '11 111 111', // Missing Letter
    'test test test V 1 111 111 test test test' => '1 111 111', // Missing Letter
    'test test test E 111 111 test test test' => 'E 111 111',
    'test test test V 11111111 test test test' => '11111111', // Missing Letter
    'test test test V 1111111 test test test' => '1111111', // Missing Letter
    'test test test V 111111 test test test' => 'V 111111',
    'test test test V11 111 111 test test test' => 'V11 111 111',
    'test test test V1 111 111 test test test' => 'V1 111 111',
    'test test test E111 111 test test test' => 'E111 111',
    'test test test V11111111 test test test' => 'V11111111',
    'V1111111 test test test  test test test' => 'V1111111',
    'test test test V111111 test test test' => 'V111111',
    'V 1111111 test test test' => '1111111', // Missing Letter
    'test test test V 1111111 test test test' => '1111111', // Missing Letter
);

  • duan246558 2013-01-16 23:19

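    ? acts as the "optional" quantifier (zero or one) only when it follows a group, a literal character, or a character class, as in the (V|E)? and the space followed by ? in your pattern.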

    If ? follows another quantifier (*, +, or {n,m}), it only makes that quantifier lazy (non-greedy): the regex tries to match as few repetitions as possible while still letting the overall match succeed.
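As a minimal sketch of that difference (the input string and variable names here are illustrative assumptions, not taken from the question), the following compares a greedy and a lazy {1,2} on the same digits, and then checks whether either form accepts zero digits:

```php
<?php
// Illustrative input; not one of the strings from the question.
$input = '12345';

preg_match('/\d{1,2}/', $input, $greedy);  // greedy: takes as many digits as allowed
preg_match('/\d{1,2}?/', $input, $lazy);   // lazy: takes as few as allowed, but at least one

echo "greedy: {$greedy[0]}\n";  // greedy: 12
echo "lazy:   {$lazy[0]}\n";    // lazy:   1

// A lazy {1,2} still requires one digit; only {0,2} accepts none at all.
$lazyOnEmpty     = preg_match('/^\d{1,2}?$/', '');
$optionalOnEmpty = preg_match('/^\d{0,2}$/', '');

echo "lazy {1,2}? on empty string: $lazyOnEmpty\n";      // 0
echo "{0,2} on empty string:       $optionalOnEmpty\n";  // 1
```

The lazy form changes how many digits are preferred, not whether a digit is required.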

    So \d{1,2}? does not mean optional. It means match one or two, but prefer to match just one. You meant to write \d{0,2} instead.
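The sketch below checks this against a few inputs copied from the question. $original and $corrected are the two patterns discussed here. $withSpace is a hypothetical variant, an assumption that is not part of this answer, which adds an optional space after the letter group. It is included because of the follow-up in the EDIT: with \d{0,2}, an attempt starting at the letter in "V 11 111 111" uses the only optional space right after V and then needs three digits where only "11" follows, so the engine falls back to a match that starts at the digits.

```php
<?php
$original  = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';  // pattern from the question
$corrected = '/\b(V|E)?\d{0,2} ?\d{3} ?\d{3}\b/i';   // \d{0,2} as suggested above
// Hypothetical variant (assumption): also allow one space between letter and digits.
$withSpace = '/\b(V|E)? ?\d{0,2} ?\d{3} ?\d{3}\b/i';

// Six-digit inputs that the question reports as "does NOT match".
$sixDigit = array(
    'test test test 111 111 test test test',
    'test test test E111111 test test test',
    'test test test V 111111 test test test',
);

foreach ($sixDigit as $text) {
    $before = preg_match($original, $text);   // 0 for each of these inputs
    $after  = preg_match($corrected, $text);  // 1 for each of these inputs
    echo "$text => original: $before, corrected: $after\n";
}

// Follow-up case from the EDIT: letter, space, then two leading digits.
$text = 'test test test V 11 111 111 test test test';
preg_match($corrected, $text, $m1);
preg_match($withSpace, $text, $m2);

echo "corrected: '{$m1[0]}'\n";  // corrected: '11 111 111'
echo "withSpace: '{$m2[0]}'\n";  // withSpace: 'V 11 111 111'
```

Whether the extra optional space is appropriate depends on which spellings should be accepted. It is only one possible way to keep the letter in the match.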

    This answer was accepted by the asker as the best answer.