dongyan6503 2016-11-26 10:21

Writing multiple regex patterns to parse HTML [duplicate]

I'm fetching an HTML page with file_get_contents() and get a table like the one below, with more than 150 rows:

<tr class="tabrow ">
    <td class="tabcol  tdmin_2l">FIRST+DATA</td>
    <td class="tabcol">
        <a class="modal-button" title="SECOND+DATA"  href="THIRD+DATA" rel="{handler: 'iframe', size: {x: 800, y: 640}, overlayOpacity: 0.9, classWindow: 'phocamaps-plugin-window', classOverlay: 'phocamaps-plugin-overlay'}">
            asdxxx
        </a>
    </td>
    <td class="tabcol"></td>
    <td class="tabcol">FOURTH+DATA</td>
</tr>

I want to extract FIRST DATA, SECOND DATA, THIRD DATA, and FOURTH DATA with a single preg_match_all() call. I tried combining multiple patterns, but it didn't work. Here's what I tried:

preg_match_all('/(<td class="tabcol  tdmin_2l">|title=")(.*?)(<\/td>|")/s', $raw, $matches, PREG_SET_ORDER);

What are the correct patterns?


2 answers

  • duanpanbo9476 2016-11-26 10:28

    Try this:

    $str = <<<HTML
    <tr class="tabrow ">
    <td class="tabcol  tdmin_2l">FIRST+DATA</td>
    <td class="tabcol"><a class="modal-button" title="SECOND+DATA"  href="THIRD+DATA" rel="{handler: 'iframe', size: {x: 800, y: 640}, overlayOpacity: 0.9, classWindow: 'phocamaps-plugin-window', classOverlay: 'phocamaps-plugin-overlay'}">asdxxx</a></td>
    <td class="tabcol"></td>
    <td class="tabcol">FOURTH+DATA</td>
    </tr>
    HTML;
    
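    // The "s" flag lets "." match newlines, so cells that span several lines still match.
    preg_match_all('/<td[^>]*>(.*?)<\/td>/is', $str, $td_matches);
    // The second cell holds the <a> tag; pull its title and href attributes from it.
    preg_match('/ title="([^"]*)"/i', $td_matches[1][1], $title);
    preg_match('/ href="([^"]*)"/i', $td_matches[1][1], $href);
    
    echo $td_matches[1][0] . "\n"; // FIRST+DATA
    echo $title[1] . "\n";         // SECOND+DATA
    echo $href[1] . "\n";          // THIRD+DATA
    echo $td_matches[1][3];        // FOURTH+DATA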
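
For the full table of 150+ rows, a single preg_match_all() call with PREG_SET_ORDER can capture all four values for each row. The following is only a minimal sketch under a few assumptions: every row has exactly four cells, the text cells contain no nested tags, and title comes before href inside the <a> tag. The $raw sample (including its second row), the named groups, and the $rows array are made up for illustration.

```php
<?php
// Hypothetical sample; in practice $raw would come from file_get_contents().
$raw = <<<'HTML'
<tr class="tabrow ">
    <td class="tabcol  tdmin_2l">FIRST+DATA</td>
    <td class="tabcol">
        <a class="modal-button" title="SECOND+DATA"  href="THIRD+DATA" rel="{handler: 'iframe'}">
            asdxxx
        </a>
    </td>
    <td class="tabcol"></td>
    <td class="tabcol">FOURTH+DATA</td>
</tr>
<tr class="tabrow ">
    <td class="tabcol  tdmin_2l">A</td>
    <td class="tabcol"><a class="modal-button" title="B" href="C">x</a></td>
    <td class="tabcol"></td>
    <td class="tabcol">D</td>
</tr>
HTML;

// "x" allows whitespace and comments inside the pattern, "i" ignores case.
// [^<]* keeps each capture inside its own cell, so a match cannot spill into the next row.
$pattern = '~
    <tr\b[^>]*>\s*
    <td\b[^>]*>(?<first>[^<]*)</td>\s*                  # cell 1: plain text
    <td\b[^>]*>\s*
        <a\b[^>]*?\btitle="(?<title>[^"]*)"             # assumes title precedes href
           [^>]*?\bhref="(?<href>[^"]*)"[^>]*>
        [^<]*</a>\s*
    </td>\s*
    <td\b[^>]*>[^<]*</td>\s*                            # cell 3: ignored
    <td\b[^>]*>(?<fourth>[^<]*)</td>\s*                 # cell 4: plain text
    </tr>
~ix';

// PREG_SET_ORDER groups the captures by row instead of by capture group.
preg_match_all($pattern, $raw, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    $rows[] = [
        'first'  => trim($m['first']),
        'title'  => $m['title'],
        'href'   => $m['href'],
        'fourth' => trim($m['fourth']),
    ];
}

print_r($rows);
```

Rows that do not fit these assumptions are silently skipped rather than partially matched. If the markup varies more than this, an HTML parser such as DOMDocument with DOMXPath is usually more robust than a regex.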
    
    The asker selected this as the best answer.