dream752614590 2012-05-22 15:27
Views: 31

Removing redundant tags [duplicate]

Possible Duplicate:
Cleaning HTML by removing extra/redundant formatting tags

I have been trying to remove the redundant tags that HTML composers generate. The code below apparently does not remove all of the empty ones. I have been looking at it for some time and cannot figure out why. There might be something I am missing.

Below is the code. Thanks a lot, people.

// Check for redundant tags
function removeRedundantTags($pathname) {
    $dom = new DOMDocument();
    $dom->loadHTMLFile($pathname);
    $allTags = $dom->getElementsByTagName('*');
    for ($i = 0; $i < $allTags->length; $i++) {
        $currentTag = $allTags->item($i);
        echo "Accessed Tags: ".$currentTag->nodeName.'<br>';
        // Skip anything that still has children
        if ($currentTag->hasChildNodes()) continue;
        // Skip tags that are supposed to be empty
        if ($currentTag->nodeName == 'br' || $currentTag->nodeName == 'img' || $currentTag->nodeName == 'meta') continue;
        if ($currentTag->nodeValue == NULL) {
            $parentNode = $currentTag->parentNode;
            $oldChild = $parentNode->removeChild($currentTag);
            echo "Removed Tags----: ".$oldChild->nodeName.'<br>';
        }
    }
    echo "Redundant Removed<br>";
    $dom->saveHTMLFile($pathname);
}

Edit (output added): Let's say I am trying to clean up span tags (sorry, I am not able to post the HTML code). It removes only half of them. If two empty span tags are present, it removes only one, and the same applies to all of the empty tags.

I am using the DOM because it is very fast, and I will be running this code on hundreds of HTML files. Some of the answers use regular expressions, which are not helpful here.

1 answer

  • dongliu6848 2012-05-22 15:30
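    function clean($txt)
    {
        // Collapse any run of two or more <br> tags into exactly two
        $txt = preg_replace("{(<br[\\s]*(>|\/>)\s*){2,}}i", "<br /><br />", $txt);
        // Normalize every remaining <br> variant to "<br />"
        $txt = preg_replace("{(<br[\\s]*(>|\/>)\s*)}i", "<br />", $txt);
        return $txt;
    }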
    

    Answer by H9kDroid in How to remove redundant <br /> tags from HTML code using PHP?
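
The function above only touches <br> tags. For the empty-tag case in the question, the "removes only half" symptom is consistent with the list returned by getElementsByTagName being live: when item $i is removed, the following items shift down by one, and the loop's $i++ then steps over the next element. A minimal sketch of one way around that, staying with the DOM, is to walk the list backwards. The name removeEmptyTags, the string-based input, and the list of tags to keep are assumptions made for illustration, not part of the original post. The nodeValue check is dropped here because an element without child nodes has no text anyway.

```php
<?php
// Hypothetical helper (not from the original post): strips empty elements
// from an HTML string and returns the cleaned markup.
function removeEmptyTags(string $html): string
{
    $dom = new DOMDocument();
    // @ silences libxml warnings about imperfect real-world markup
    @$dom->loadHTML($html);

    // Tags that are legitimately empty; this list is an assumption, extend as needed
    $keep = ['br', 'img', 'meta', 'hr', 'input', 'link'];

    $allTags = $dom->getElementsByTagName('*');

    // Iterate from the end: removing item $i does not shift the items before it.
    // Children come after their parents in document order, so a parent that
    // becomes empty once its children are removed is still visited afterwards.
    for ($i = $allTags->length - 1; $i >= 0; $i--) {
        $tag = $allTags->item($i);
        if (in_array($tag->nodeName, $keep, true)) {
            continue;
        }
        if ($tag->hasChildNodes()) {
            continue;
        }
        $tag->parentNode->removeChild($tag);
    }

    return $dom->saveHTML();
}

// Two adjacent empty spans plus a nested empty pair
$out = removeEmptyTags('<p>text<span></span><span></span><b><i></i></b></p>');
echo $out;
```

With this input both empty span tags and the nested b/i pair are removed, while the paragraph and its text are kept. Elements that contain only whitespace still have a text child, so they are left alone by this sketch. For files, the same loop could be used with loadHTMLFile and saveHTMLFile as in the question.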
