douya7282 2013-09-04 12:11

Using preg_match to find all consecutive occurrences of a pattern after a specific string

I have a huge HTML document containing several tables, each with a unique table ID. Something like:

<table class="my_table" id="table_id1">
  <tr class="odd"><td>Line 1</td></tr>
  <tr class="even"><td>Line 2</td></tr>
  <tr class="odd"><td>Line 3</td></tr>
  <tr class="even"><td>Line 4</td></tr>
</table>
<table class="my_table" id="table_id2">
  <tr class="odd"><td>Line 1</td></tr>
  <tr class="even"><td>Line 2</td></tr>
  <tr class="odd"><td>Line 3</td></tr>
</table>

Is it possible to use preg_match to get the HTML of all rows of a specific table?

I tried the following code:

preg_match('/<table[^>]*id="table_id2">(<tr[^>]*><td>[^>]*<\/td><\/tr>)+/', $html, $matches); 
//$html variable contains the html.

but it returns output like this:

Array
(
    [0] => Array
        (
            [0] => <table class="my_table" id="table_id2"><tr class="odd"><td>Line 1</td></tr><tr class="even"><td>Line 2</td></tr><tr class="odd"><td>Line 3</td></tr>
        )

    [1] => Array
        (
            [0] => <tr class="odd"><td>Line 3</td></tr>
        )

)

But I need the output like this:

Array
(
    [0] => Array
        (
            [0] => <table class="my_table" id="table_id2"><tr class="odd"><td>Line 1</td></tr><tr class="even"><td>Line 2</td></tr><tr class="odd"><td>Line 3</td></tr>
        )

    [1] => Array
        (
            [0] => <tr class="odd"><td>Line 1</td></tr>
            [1] => <tr class="even"><td>Line 2</td></tr>
            [2] => <tr class="odd"><td>Line 3</td></tr>
        )

)

Is it possible? Please help.
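
For reference, a repeated capturing group such as `(...)+` keeps only its last repetition, which is why `[1]` above holds just the final row. If regex has to be used, one possible workaround is a two-step match: first isolate the table, then collect its rows with preg_match_all. This is only a sketch under assumptions: the variable names `$table`, `$m`, and `$rows` are illustrative, the shortened sample stands in for the real document, and it assumes simple markup with no nested tables.

```php
<?php
// Shortened sample based on the HTML in the question.
$html = '<table class="my_table" id="table_id1">
  <tr class="odd"><td>Line 1</td></tr>
</table>
<table class="my_table" id="table_id2">
  <tr class="odd"><td>Line 1</td></tr>
  <tr class="even"><td>Line 2</td></tr>
  <tr class="odd"><td>Line 3</td></tr>
</table>';

$rows = array();

// Step 1: capture everything between the opening tag of table_id2
// and the next closing </table> (the s modifier lets . match newlines).
if (preg_match('/<table[^>]*id="table_id2"[^>]*>(.*?)<\/table>/s', $html, $table)) {
    // Step 2: collect every <tr>...</tr> inside that fragment.
    preg_match_all('/<tr[^>]*>.*?<\/tr>/s', $table[1], $m);
    $rows = $m[0];
}

print_r($rows);
```

Here `$rows` ends up with one string per row of the second table. The approach is fragile if the markup contains nested tables or a `>` inside an attribute value.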


  • doutu1939 2013-09-04 12:24

    You should not use regex to parse HTML. PHP has a better tool for that: DOMDocument. With it you can do many things that are impossible or nearly impossible with regex. For your sample it would look like this:

    $sHtml = '<table class="my_table" id="table_id1">
      <tr class="odd"><td>Line 1</td></tr>
      <tr class="even"><td>Line 2</td></tr>
      <tr class="odd"><td>Line 3</td></tr>
      <tr class="even"><td>Line 4</td></tr>
    </table>
    <table class="my_table" id="table_id2">
      <tr class="odd"><td>Line 1</td></tr>
      <tr class="even"><td>Line 2</td></tr>
      <tr class="odd"><td>Line 3</td></tr>
    </table>';
    
    $rDoc   = new DOMDocument();
    $rDoc->loadHTML($sHtml);
    $sId    = 'table_id2';
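    // Look up the table by its id attribute; getElementById() returns null if there is no match:
    $rTable = $rDoc->getElementById($sId);
    if ($rTable !== null)
    {
       foreach($rTable->childNodes as $rItem)
       {
          // childNodes can also contain the whitespace text nodes between rows; skip them:
          if ($rItem->nodeType !== XML_ELEMENT_NODE) continue;
          // do something with the row, e.g. get its HTML:
          // var_dump($rDoc->saveHTML($rItem));
       }
    }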
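
    To get an array of row HTML like the one asked for in the question, the rows can be selected with DOMXPath and serialized one by one with saveHTML(). This is a rough sketch rather than the only way to do it: `$rXPath`, `$aRows`, and the XPath expression are illustrative choices, and passing a node to saveHTML() assumes PHP 5.3.6 or later.

    ```php
    <?php
    $sHtml = '<table class="my_table" id="table_id1">
      <tr class="odd"><td>Line 1</td></tr>
      <tr class="even"><td>Line 2</td></tr>
    </table>
    <table class="my_table" id="table_id2">
      <tr class="odd"><td>Line 1</td></tr>
      <tr class="even"><td>Line 2</td></tr>
      <tr class="odd"><td>Line 3</td></tr>
    </table>';

    $rDoc = new DOMDocument();
    $rDoc->loadHTML($sHtml);

    // Select every <tr> below the table with the wanted id.
    $rXPath = new DOMXPath($rDoc);
    $aRows  = array();
    foreach ($rXPath->query('//table[@id="table_id2"]//tr') as $rRow) {
        // saveHTML($node) serializes just that node; trim() drops any
        // newline the serializer may append.
        $aRows[] = trim($rDoc->saveHTML($rRow));
    }

    print_r($aRows);
    ```

    The `//tr` step matches rows at any depth, so it also works if the rows are wrapped in a `<tbody>`, but it would pick up rows of nested tables as well.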
    
    (Selected by the asker as the best answer.)