I'm using PHP 5.6 and I can't figure out why my regular expression doesn't match the second table row correctly.
$str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';
preg_match_all('/<tr>.*?class="D.*?<\/tr>/', $str, $matches);
print_r($matches);
preg_match_all('/<tr>.*?class="DH.*?<\/tr>/', $str, $matches);
print_r($matches);
preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $str, $matches);
print_r($matches);
This code outputs:
Array
(
[0] => Array
(
[0] => <tr><td class="DH">Sale Date</td></tr>
[1] => <tr><td class="DD">10-MAR-15</td></tr>
[2] => <tr><td class="DD">18-APR-17</td></tr>
)
)
Array
(
[0] => Array
(
[0] => <tr><td class="DH">Sale Date</td></tr>
)
)
Array
(
[0] => Array
(
[0] => <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr>
[1] => <tr><td class="DD">18-APR-17</td></tr>
)
)
The regex essentially means: match all shortest sequences between <tr> and </tr> that contain class="D.
Notice how the first regex correctly matches all 3 rows individually.
The second one does the same but requires the row to contain class="DH, which also works correctly.
The third regex is supposed to match the other rows, which contain class="DD. For some reason, only the first result (corresponding to the second table row) also includes the previous row.
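To make the problem easier to see, here is a minimal sketch that prints where each match of the third regex starts and how long it is. It uses the same $str and pattern as above; the PREG_OFFSET_CAPTURE flag and the loop are only added for illustration.

```php
<?php
$str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

// PREG_OFFSET_CAPTURE turns every match into a [text, byte offset] pair.
preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $str, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as $m) {
    list($text, $offset) = $m;
    // Report where the match begins in $str and how many bytes it covers.
    printf("offset %d, length %d\n", $offset, strlen($text));
}
```

The first match starts at offset 0, i.e. at the <tr> of the DH row, not at the <tr> of the first DD row.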
Even if I add a space between </tr> and <tr>, as in </tr> <tr>, I get the same result. However, if I insert a line break, things work.
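Here is a minimal sketch of the line-break variant, assuming a "\n" between consecutive rows. The $rows and $withBreaks names are only for this illustration; the pattern is the same third regex, with no modifiers.

```php
<?php
// The same three rows as above, kept separate so they can be joined with newlines.
$rows = [
    '<tr><td class="DH">Sale Date</td></tr>',
    '<tr><td class="DD">10-MAR-15</td></tr>',
    '<tr><td class="DD">18-APR-17</td></tr>',
];
$withBreaks = implode("\n", $rows);

// Without the /s modifier, "." does not match "\n",
// so no single match can extend across a line break.
preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $withBreaks, $matches);
print_r($matches[0]);
```

With this input, the two DD rows come back as separate matches and the DH row is not included in either.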
Can anyone explain what's going on and how to fix my code?