doubi7346 2018-07-30 22:29
Views: 12
Answer accepted

PHP preg_match_all puzzle

I'm using PHP version 5.6 and I can't figure out why the regular expression won't match the second row correctly.

 $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

 preg_match_all('/<tr>.*?class="D.*?<\/tr>/', $str, $matches);
 print_r($matches);

 preg_match_all('/<tr>.*?class="DH.*?<\/tr>/', $str, $matches);
 print_r($matches);

 preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $str, $matches);
 print_r($matches);

This code outputs:

Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr>
            [1] => <tr><td class="DD">10-MAR-15</td></tr>
            [2] => <tr><td class="DD">18-APR-17</td></tr>
        )

)
Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr>
        )

)
Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr>
            [1] => <tr><td class="DD">18-APR-17</td></tr>
        )

)

The regex essentially means match all shortest sequences between <tr> and </tr> that contain class="D.

Notice how the first regex matches all 3 rows individually correctly.

The second one does the same but wants the row to contain class="DH which it does correctly.

The third regex is supposed to match the other rows which contain class="DD. For some reason only the first result (corresponding to the second table row) wants to include the previous row.

Even if I add a space between </tr> and <tr>, as in </tr> <tr>, I get the same result. However, if I insert a line break, things work.

Can anyone explain what's going on and how to fix my code?

1 answer

  • dscbxou1900343 2018-07-30 22:36
    /<tr>.*?class="DD.*?/
    

    says: find <tr>, then match as few characters as possible until class="DD is found. The engine starts at the leftmost <tr> in the string, so it sees:

    <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">
    

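    and matches the first <tr>, then .*? expands over <td class="DH">Sale Date</td></tr><tr><td (lazy only means it stops as early as it can from that starting point, not that the overall match is the shortest possible), then it sees class="DD, which matches the next part.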
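
One way to confirm where each match begins is to ask for offsets. This is only a small sketch that reuses the string and the third pattern from the question; PREG_OFFSET_CAPTURE is a standard flag of preg_match_all, and the loop is just for display.

```php
<?php
// Input string copied from the question.
$str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

// With PREG_OFFSET_CAPTURE each match becomes array(text, byte offset).
preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $str, $matches, PREG_OFFSET_CAPTURE);

// list() inside foreach works on PHP 5.5 and later, so it is fine on 5.6.
foreach ($matches[0] as list($text, $offset)) {
    echo $offset, ': ', $text, PHP_EOL;
}
```

The first match is reported at offset 0, the very first <tr>, which shows that the engine never skipped ahead to the row that actually contains class="DD.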

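    When you add a line break, . can no longer cross it (by default . does not match a newline unless the s modifier is set), so the attempt that starts at the first <tr> fails and the engine retries from the next <tr>, which makes it work.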
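
As for fixing the code without relying on line breaks, one possible approach (a sketch, not the only option) is to stop the first wildcard from walking past a closing </tr>. The negative lookahead below is my own addition, not something from the question, and it assumes the rows are not nested.

```php
<?php
// Input string copied from the question.
$str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

// (?:(?!<\/tr>).)*? consumes one character at a time but refuses to step
// onto a "</tr>", so a match that starts at one <tr> cannot run into the next row.
$pattern = '/<tr>(?:(?!<\/tr>).)*?class="DD.*?<\/tr>/';

preg_match_all($pattern, $str, $matches);
print_r($matches[0]);
```

With this pattern the attempt starting at the first row fails when it reaches that row's </tr>, and the two DD rows are matched individually. For anything beyond a quick extraction, an HTML parser such as DOMDocument is generally more robust than a regular expression.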

    Selected by the asker as the best answer.
