I have the following code to extract a parameter name and value from a string (yes, the regex has to be this long; it serves other purposes):
$sample = 'href="http://google.com/"';
$reg = "#([a-zA-Z\-\/]+)\s*(?:=\s*(?:\"([^\">]*)\"?|'([^'>]*)'?|([^'\"\s]*)))?#S";
preg_match_all($reg, $sample, $m);
$result = print_r($m, true);
echo $result;
which returns this:
Array ( [0] => Array ( [0] => href="http://google.com/ )
[1] => Array ( [0] => href )
[2] => Array ( [0] => http://google.com/ )
[3] => Array ( [0] => )
[4] => Array ( [0] => ) )
And it works fine. The problem is that I can also have strings where the quotes around the parameter value are escaped, something like this:
$sample = 'href=\"http://google.com/\"';
So I modified the regex, adding "\\?" to allow one optional backslash before the opening double quote, and it now looks like this:
$sample = 'href="http://google.com/"';
$reg = "#([a-zA-Z\-\/]+)\s*(?:=\s*(?:\\?\"([^\">]*)\"?|'([^'>]*)'?|([^'\"\s]*)))?#S";
preg_match_all($reg, $sample, $m);
$out = print_r($m, true);
echo $out;
I tried this new regex in a few online testers, and all of them returned the correct result. However, preg_match_all returns this:
Array ( [0] =>
Array ( [0] => href=
[1] => http
[2] => //google
[3] => com/ )
[1] => Array (
[0] => href
[1] => http
[2] => //google
[3] => com/ )
[2] => Array (
[0] =>
[1] =>
[2] =>
[3] => )
[3] => Array (
[0] =>
[1] =>
[2] =>
[3] => )
[4] => Array (
[0] =>
[1] =>
[2] =>
[3] => ) )
So why doesn't this second regex work as expected in PHP, when it works in online testing tools?
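One check that may help narrow this down is to print the pattern string itself, because PHP processes backslash escapes in a double-quoted string before the regex engine ever sees it, whereas an online tester usually receives the pattern exactly as typed. The sketch below compares the pattern as posted with a variant that doubles the backslashes; the variable names and the four-backslash variant are assumptions for illustration, not part of the original code.

```php
<?php
// Diagnostic sketch; $regAsPosted / $regDoubled are illustrative names.
$sample = 'href="http://google.com/"';

// As posted: inside a double-quoted PHP string, "\\?" collapses to \? ,
// which PCRE reads as a literal question mark, not an optional backslash.
$regAsPosted = "#([a-zA-Z\-\/]+)\s*(?:=\s*(?:\\?\"([^\">]*)\"?|'([^'>]*)'?|([^'\"\s]*)))?#S";

// Assumed variant: four backslashes collapse to \\? ,
// which PCRE reads as "an optional literal backslash".
$regDoubled = "#([a-zA-Z\-\/]+)\s*(?:=\s*(?:\\\\?\"([^\">]*)\"?|'([^'>]*)'?|([^'\"\s]*)))?#S";

// Show what the regex engine actually receives in each case.
echo $regAsPosted, PHP_EOL;
echo $regDoubled, PHP_EOL;

// preg_match_all returns the number of full-pattern matches.
$countAsPosted = preg_match_all($regAsPosted, $sample, $mAsPosted);
$countDoubled  = preg_match_all($regDoubled, $sample, $mDoubled);

echo $countAsPosted, PHP_EOL; // 4 fragments: href=, http, //google, com/
echo $countDoubled, PHP_EOL;  // 1 match covering the whole attribute
print_r($mDoubled[2]);        // the double-quoted value capture
```

If the first echoed pattern shows \?" where \\?" was intended, that would explain the fragmented matches above. Note that with an escaped sample the value capture would still include the backslash before the closing quote, since only the opening quote is handled here.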