dongtanjian9310 2014-03-14 17:35
Status: accepted answer

Get all matches in a preg_match call

I have the following problem. I have this structure:

$table = '
<table>
    <tbody>
        <tr valign="top">
            <td>foo</td>
            <td>bar</td>
        </tr>
    </tbody>
</table>
<table>
    <tbody>
        <tr valign="top">
            <td>bee</td>
            <td>dog</td>
        </tr>
    </tbody>
</table>';

I'm trying to retrieve an array with all the <tr> elements, but with no success. The closest pattern I could come up with returns a jumbled result.

$pattern = "/<tr valign[^>]*>(.*)<\/tr>/s";
preg_match_all($pattern, $table, $matches, PREG_PATTERN_ORDER);

When I call var_dump($matches), I want to get an array like this:

array(
    [0] => "<td>foo</td><td>bar</td>",
    [1] => "<td>bee</td><td>dog</td>"
);

...or something close to that.

But I receive:

string(301) "
    foo
    bar
    "
<table>
        <tbody>
            <tr valign="top">
                <td>bee</td>
                <td>dog</td>
            </tr>
    </tbody></table>

Anyone know what I'm doing wrong?

Thanks in advance.

1 answer

  • dongqi4085 2014-03-14 17:37

    You must make your quantifier lazy: change .* to .*?

    A greedy quantifier such as .* takes as many characters as possible; a lazy quantifier such as .*? takes as few as possible.

    With a lazy quantifier, the regex engine takes characters one at a time and, after each one, tests whether the rest of the pattern can match.

    With a greedy quantifier (the default), the regex engine first takes all possible characters (up to the end of the string in your case, because of the s modifier) and then backtracks one character at a time until the rest of the pattern matches. That is why your pattern runs from the first <tr> to the last </tr>.
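
To make the difference concrete, here is a minimal sketch that runs both versions of the pattern against the table from the question. The variable names $greedy, $lazy and $rows are only illustrative, and collapsing the whitespace between tags is an extra step I am assuming you want, based on the expected output in the question.

```php
<?php
// Sample input copied from the question.
$table = '
<table>
    <tbody>
        <tr valign="top">
            <td>foo</td>
            <td>bar</td>
        </tr>
    </tbody>
</table>
<table>
    <tbody>
        <tr valign="top">
            <td>bee</td>
            <td>dog</td>
        </tr>
    </tbody>
</table>';

// Greedy: (.*) runs to the end of the string and backtracks to the LAST </tr>,
// so both rows end up inside a single match.
preg_match_all('/<tr valign[^>]*>(.*)<\/tr>/s', $table, $greedy);

// Lazy: (.*?) stops at the FIRST </tr> after each opening tag,
// so each row becomes its own match.
preg_match_all('/<tr valign[^>]*>(.*?)<\/tr>/s', $table, $lazy);

// $lazy[1] holds the first capture group of every match.
// Assumption: strip the indentation and newlines between the cells.
$rows = array();
foreach ($lazy[1] as $inner) {
    $rows[] = preg_replace('/>\s+</', '><', trim($inner));
}

print_r($rows);
```

With the lazy pattern, each row's cells land in their own element of $rows; with the greedy pattern, $greedy[0] contains a single match that spans both tables.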

    Notes:

    Passing PREG_PATTERN_ORDER is unnecessary, since it is the default flag for preg_match_all.

    DOMDocument is probably a better-suited tool for dealing with HTML. Example:

    $dom = new DOMDocument();
    // Suppress the warnings loadHTML() may emit for imperfect markup.
    @$dom->loadHTML($table);

    $trs = $dom->getElementsByTagName('tr');

    $results = array();

    foreach ($trs as $tr) {
        // Only keep rows that carry a valign attribute, as the regex did.
        if ($tr->hasAttribute('valign')) {
            $tmp = '';
            // Serialize each child node; whitespace-only text nodes trim to ''.
            foreach ($tr->childNodes as $child) {
                $tmp .= trim($dom->saveHTML($child));
            }
            if ($tmp !== '') {
                $results[] = $tmp;
            }
        }
    }

    echo htmlspecialchars(print_r($results, true));
    
    This answer was accepted by the asker as the best answer.