dsklzerpx64815631 2015-10-13 16:44 Acceptance rate: 100%
Views: 208
Accepted

How to output plain text with PHP DOMDocument?

I'm using this code (thank you Lawrence) to parse an HTML table:

<?php
$html = file_get_contents('http://www.example.com');
$dom = new DOMDocument();
@$dom->loadHTML($html);

//TUE 1 1 4.37 6.39 1.08 5.35 9.18 6.00 1.30 6.30 7.42 9.40                 
echo '
<table>
    <tr>';
foreach($dom->getElementsByTagName('table') as $table) {
    echo innerHTML($table->getElementsByTagName('tr')->item(9));
}
echo '
    </tr>
</table>';

function innerHTML($current){
    $ret = "";
    $nodes = @$current->childNodes;
    if(!empty($nodes)){
        foreach($nodes as $v){
            $tmp = new DOMDocument();
            $tmp->appendChild($tmp->importNode($v, true));
            $ret .= $tmp->saveHTML();
        }
        return $ret;
    }
    return;
}
?>

The problem is that it outputs original HTML code, so how can I output plain text?

I have tried these changes, but none of them worked:

return $ret->textContent;
return $ret->nodeValue;
return $ret->plaintext;

echo innerHTML($table->getElementsByTagName('tr')->item(9)->textContent);
echo innerHTML($table->getElementsByTagName('tr')->item(9)->nodeValue);
echo innerHTML($table->getElementsByTagName('tr')->item(9)->plaintext);

2 answers

  • duanqiang5722 2015-10-13 17:12

    The solution is actually very simple: use the strip_tags function.

    echo strip_tags(innerHTML($table->getElementsByTagName('tr')->item(9)));
    

    It takes the HTML string returned by innerHTML() and removes all of the tags, which leaves only the plain text.
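
For comparison, here is a minimal sketch of strip_tags next to reading the text directly from the DOM node. The attempts in the question likely failed because innerHTML() returns a string, so `$ret->textContent` has no node to read from, and passing `->textContent` into innerHTML() hands it a string with no childNodes. The sample markup and the variable names below are assumptions for illustration, not the asker's real page.

```php
<?php
// Hypothetical sample markup standing in for the remote page.
$html = '<table><tr><td>TUE</td><td><b>4.37</b></td><td>6.39</td></tr></table>';

$dom = new DOMDocument();
// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// The question uses item(9); the sample has a single row, so item(0) here.
$row = $dom->getElementsByTagName('tr')->item(0);

// Approach 1 (the accepted answer): serialize the row, then strip the tags.
$viaStripTags = trim(strip_tags($dom->saveHTML($row)));

// Approach 2: read the text straight from the node, no serialization needed.
$viaTextContent = $row->textContent;

// Approach 3: read each cell separately so the values stay delimited.
$cells = [];
foreach ($row->getElementsByTagName('td') as $cell) {
    $cells[] = trim($cell->textContent);
}
$line = implode(' ', $cells);

echo $line, PHP_EOL; // prints: TUE 4.37 6.39
```

Approaches 1 and 2 both run the cell values together when the markup has no whitespace between cells, which is why the third variant joins the cells explicitly. Note also that strip_tags leaves character entities such as `&amp;` in place, while textContent returns decoded text.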

    This answer was selected as the best answer by the asker.