Why does the following regex: $regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';
not match all of the input below?
I thought that (V|E)?\d{1,2}? ?
would make the letter, the first one or two digits, and the first space optional.
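To make the assumption above concrete, here is a minimal sketch of how the trailing ? after \d{1,2} appears to behave under PCRE, which preg_match uses. After a quantifier, ? makes the quantifier lazy rather than optional, so at least one digit is still required. The names $lazy and $optional, and the grouped variant itself, are illustrative assumptions and not part of the original script.

```php
<?php
// After a quantifier, "?" means "lazy", not "optional":
// \d{1,2}? still requires at least one digit before the two 3-digit groups.
$lazy = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';

// Hypothetical variant: wrap the quantified digits in a non-capturing group
// and make the whole group optional, so zero leading digits are allowed.
$optional = '/\b(V|E)?(?:\d{1,2})? ?\d{3} ?\d{3}\b/i';

$input = 'test test test V111 111 test test test';

$lazyResult     = preg_match($lazy, $input);               // 0: no match
$optionalResult = preg_match($optional, $input, $matches); // 1: match

var_dump($lazyResult, $optionalResult, $matches[0]);       // $matches[0] is 'V111 111'
```

Writing \d{0,2} instead of (?:\d{1,2})? should express the same idea more compactly.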
INPUT
<?php
$sms = array(
'test test test 11 111 111 test test test',
'test test test 1 111 111 test test test',
'test test test 111 111 test test test', // does not match
'test test test test test test 11111111',
'test test test 1111111 test test test',
'test test test 111111 test test test', // does not match
'test test test E11 111 111 test test test',
'test test test V1 111 111 test test test',
'test test test V111 111 test test test', // does not match
'test test test V11111111 test test test',
'test test test V1111111 test test test',
'test test test E111111 test test test', // does not match
'test test test V 11 111 111 test test test',
'test test test V 1 111 111 test test test',
'test test test E 111 111 test test test', // does not match
'test test test V 11111111 test test test',
'test test test V 1111111 test test test',
'test test test V 111111 test test test', //does not match
'test test test V11 111 111 test test test',
'test test test V1 111 111 test test test',
'test test test E111 111 test test test', //does not match
'test test test V11111111 test test test',
'V1111111 test test test test test test',
'test test test V111111 test test test', // does not match
);
$regex = '/\b(V|E)?\d{1,2}? ?\d{3} ?\d{3}\b/i';
$noMatches = 0;
$index = 0;
foreach ($sms as $v) {
    $match = preg_match($regex, $v, $matches);
    if ($match) {
        //print_r($matches);
        //echo "$v match!\n";
        //$matches++;
    }
    else {
        echo "$index - $v does NOT match!\n";
        $noMatches++;
    }
    $index++;
}
$total = count($sms);
echo "\nTotal: $total\nNo Matches: $noMatches\n";
OUTPUT
$ php test-regex.php
2 - test test test 111 111 test test test does NOT match!
5 - test test test 111111 test test test does NOT match!
8 - test test test V111 111 test test test does NOT match!
11 - test test test E111111 test test test does NOT match!
14 - test test test E 111 111 test test test does NOT match!
17 - test test test V 111111 test test test does NOT match!
20 - test test test E111 111 test test test does NOT match!
23 - test test test V111111 test test test does NOT match!
Total: 24
No Matches: 8
EDIT:
Using mario's suggestion, the regex is now $regex = '/\b(V|E)?\d{0,2} ?\d{3} ?\d{3}\b/i';
Why, in some cases, does this regex not capture the letter V or E?
$output = array(
'test test test E11 111 111 test test test' => 'E11 111 111',
'test test test V1 111 111 test test test' => 'V1 111 111',
'test test test V111 111 test test test' => 'V111 111',
'test test test V11111111 test test test' => 'V11111111',
'test test test V1111111 test test test' => 'V1111111',
'test test test E111111 test test test' => 'E111111',
'test test test V 11 111 111 test test test' => '11 111 111', // Missing Letter
'test test test V 1 111 111 test test test' => '1 111 111', // Missing Letter
'test test test E 111 111 test test test' => 'E 111 111',
'test test test V 11111111 test test test' => '11111111', // Missing Letter
'test test test V 1111111 test test test' => '1111111', // Missing Letter
'test test test V 111111 test test test' => 'V 111111',
'test test test V11 111 111 test test test' => 'V11 111 111',
'test test test V1 111 111 test test test' => 'V1 111 111',
'test test test E111 111 test test test' => 'E111 111',
'test test test V11111111 test test test' => 'V11111111',
'V1111111 test test test test test test' => 'V1111111',
'test test test V111111 test test test' => 'V111111',
'V 1111111 test test test' => '1111111', // Missing Letter
'test test test V 1111111 test test test' => '1111111', // Missing Letter
);
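The missing-letter rows all seem to share one shape: the letter is followed by a space and then by one or two leading digits (or a seven- or eight-digit run). The pattern has no slot for a space between the letter and \d{0,2}, so the attempt starting at the letter fails and the engine matches from the first digit instead. A minimal sketch of that behavior follows, together with a hypothetical variant that lets one optional space follow the letter. The names $current, $variant, $before and $after are illustrative assumptions.

```php
<?php
// Pattern from the EDIT: no space is allowed between the letter and the leading digits.
$current = '/\b(V|E)?\d{0,2} ?\d{3} ?\d{3}\b/i';

// Hypothetical variant: the letter and one optional space after it
// form a single optional group.
$variant = '/\b(?:(V|E) ?)?\d{0,2} ?\d{3} ?\d{3}\b/i';

$input = 'test test test V 11 111 111 test test test';

preg_match($current, $input, $before); // letter is dropped
preg_match($variant, $input, $after);  // letter is kept

var_dump($before[0]); // '11 111 111'
var_dump($after[0]);  // 'V 11 111 111'
```

This is only a sketch against one input line. It has not been run against the whole $sms array, so other rows may still need adjustments.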