doubi7346 2018-07-30 22:29
Answer accepted

PHP preg_match_all puzzle

I'm using PHP version 5.6 and I can't figure out why the regular expression won't match the second row correctly.

 $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

 preg_match_all('/<tr>.*?class="D.*?<\/tr>/', $str, $matches);
 print_r($matches);

 preg_match_all('/<tr>.*?class="DH.*?<\/tr>/', $str, $matches);
 print_r($matches);

 preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $str, $matches);
 print_r($matches);

This code outputs:

Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr>
            [1] => <tr><td class="DD">10-MAR-15</td></tr>
            [2] => <tr><td class="DD">18-APR-17</td></tr>
        )

)
Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr>
        )

)
Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr>
            [1] => <tr><td class="DD">18-APR-17</td></tr>
        )

)

The regex essentially means match all shortest sequences between <tr> and </tr> that contain class="D.

Notice how the first regex matches all 3 rows individually correctly.

The second one does the same but wants the row to contain class="DH which it does correctly.

The third regex is supposed to match the other rows which contain class="DD. For some reason only the first result (corresponding to the second table row) wants to include the previous row.

Even if I add a space between </tr> and <tr>, as in </tr> <tr>, I get the same result. However, if I insert a line break, things work.

Can anyone explain what's going on and how to fix my code?


1 answer

  • dscbxou1900343 2018-07-30 22:36
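    /<tr>.*?class="DD.*?<\/tr>/

    says "find <tr>, then match as few characters as possible until you reach class="DD". The regex engine tries starting positions from left to right, so it begins at the first <tr> and sees:

    <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">

    It matches the first <tr>, then .*? expands over <td class="DH">Sale Date</td></tr><tr><td (plus the space that follows) because class="DD does not appear any earlier, and then class="DD matches the next part of the pattern. A lazy quantifier keeps the match as short as possible from where it started; it does not move the starting point forward to find a shorter match.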
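
    To see where the match actually begins, here is a minimal sketch that runs only the front part of the pattern with PREG_OFFSET_CAPTURE. The variable name $m is just for illustration; $str is the string from the question.

    ```php
    <?php
    $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

    // PREG_OFFSET_CAPTURE returns each match as [matched text, byte offset].
    preg_match('/<tr>.*?class="DD/', $str, $m, PREG_OFFSET_CAPTURE);

    echo $m[0][1], "\n"; // 0: the match starts at the very first <tr>
    echo $m[0][0], "\n"; // <tr><td class="DH">Sale Date</td></tr><tr><td class="DD
    ```

    The offset is 0, so the header row is swallowed because the engine never gives up the first <tr> it found.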

    When you add a line break, it works because . does not match a newline by default (that requires the s modifier), so .*? cannot run from one row into the next.
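
    As for fixing the code, one possible approach (a sketch, not the only way) is to stop the first .*? from crossing a </tr> by checking with a negative lookahead before each character it consumes. The $pattern below is my own suggestion and is not taken from the question:

    ```php
    <?php
    $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

    // (?:(?!<\/tr>).)*? consumes one character at a time, but only while the
    // text at that position is not the start of "</tr>". The match therefore
    // cannot leave the row it started in before reaching class="DD.
    $pattern = '/<tr>(?:(?!<\/tr>).)*?class="DD.*?<\/tr>/';

    preg_match_all($pattern, $str, $matches);
    print_r($matches[0]);
    // [0] => <tr><td class="DD">10-MAR-15</td></tr>
    // [1] => <tr><td class="DD">18-APR-17</td></tr>
    ```

    Starting at the header row now fails outright, so the engine moves on and each DD row is matched by itself. For anything beyond a small, predictable snippet like this one, an HTML parser such as DOMDocument is probably a safer choice than a regex.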

    This answer was selected as the best answer by the asker.
