doubi7346 2018-07-30 22:29
Answer accepted

PHP preg_match_all puzzle

I'm using PHP version 5.6 and I can't figure out why the regular expression won't match the second row correctly.

 $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

 preg_match_all('/<tr>.*?class="D.*?<\/tr>/', $str, $matches);
 print_r($matches);

 preg_match_all('/<tr>.*?class="DH.*?<\/tr>/', $str, $matches);
 print_r($matches);

 preg_match_all('/<tr>.*?class="DD.*?<\/tr>/', $str, $matches);
 print_r($matches);

This code outputs:

Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr>
            [1] => <tr><td class="DD">10-MAR-15</td></tr>
            [2] => <tr><td class="DD">18-APR-17</td></tr>
        )

)
Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr>
        )

)
Array
(
    [0] => Array
        (
            [0] => <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr>
            [1] => <tr><td class="DD">18-APR-17</td></tr>
        )

)

The regex essentially means match all shortest sequences between <tr> and </tr> that contain class="D.

Notice how the first regex matches all 3 rows individually correctly.

The second one does the same but wants the row to contain class="DH which it does correctly.

The third regex is supposed to match the other rows which contain class="DD. For some reason only the first result (corresponding to the second table row) wants to include the previous row.

Even if I add a space between </tr> and <tr>, as in </tr> <tr>, I'm getting the same result. However, if I insert a line break, things work.

Can anyone explain what's going on and how to fix my code?

1 answer

  • dscbxou1900343 2018-07-30 22:36
    /<tr>.*?class="DD.*?/
    

    says: find <tr>, then match as little as possible until you find class="DD. The engine tries starting positions from left to right, so it begins at the first <tr> and sees:

    <tr><td class="DH">Sale Date</td></tr><tr><td class="DD">
    

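    It matches the first <tr>, then the .*? matches <td class="DH">Sale Date</td></tr><tr><td (plus the trailing space), and then class="DD matches the next part. The lazy quantifier only makes the match as short as possible from a given starting point; it does not make the engine prefer a later <tr> when a match starting at the first one exists.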
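
    To see this directly, here is a small sketch that wraps the first lazy part in a capture group and prints what it consumed. The capture group and the variable name $m are additions for illustration only; the rest of the pattern is the one from the question.

    ```php
    <?php
    $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

    // Group 1 holds whatever the first lazy .*? had to consume
    // before class="DD could match.
    preg_match('/<tr>(.*?)class="DD.*?<\/tr>/', $str, $m);

    // Brackets make the trailing space visible.
    echo '[', $m[1], ']', PHP_EOL;
    // [<td class="DH">Sale Date</td></tr><tr><td ]
    ```

    The captured text runs across the end of the first row and into the second, which is why the first match contains both rows.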

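    When you add a line break between the rows, it works because . does not match a newline unless the s modifier is set. The attempt starting at the first <tr> can no longer reach class="DD, so it fails, and the engine moves on to the next <tr>.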
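
    One possible fix, as a sketch rather than the only way to do it, is to stop the first part of the pattern from running past a </tr>. The version below uses a negative lookahead for that; the ~ delimiter is just a choice to avoid escaping the slashes. For anything beyond simple, predictable markup, an HTML parser such as DOMDocument is likely a safer choice than a regex.

    ```php
    <?php
    $str = '<tr><td class="DH">Sale Date</td></tr><tr><td class="DD">10-MAR-15</td></tr><tr><td class="DD">18-APR-17</td></tr>';

    // (?:(?!</tr>).)*? advances one character at a time, but only while
    // the next characters are not "</tr>", so the part before class="DD
    // cannot leave the row it started in.
    preg_match_all('~<tr>(?:(?!</tr>).)*?class="DD.*?</tr>~', $str, $matches);

    print_r($matches[0]);
    // Array
    // (
    //     [0] => <tr><td class="DD">10-MAR-15</td></tr>
    //     [1] => <tr><td class="DD">18-APR-17</td></tr>
    // )
    ```

    With this pattern the attempt starting at the first <tr> fails, because class="DD does not occur before that row's </tr>, and the two DD rows are matched individually.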

    This answer was selected by the asker as the best answer.
