dongyou6909 2019-05-01 20:07
138 views

How to ignore a span tag in DOM HTML

Hi, I am trying to scrape the text "Brand New Apple iPhone 8 64GB or 256GB - Sealed - GSM Unlocked" with the code below, but it also scrapes the text of the span inside the h1. How do I ignore the span text?

<h1 class="it-ttl" itemprop="name" id="itemTitle"><span class="g-hdn">Details about  &nbsp;</span>Brand New Apple iPhone 8 64GB or 256GB - Sealed - GSM Unlocked</h1>

This is the code:

$productname = $html->find("h1[class='it-ttl']",0)->plaintext;

echo $productname;

1 answer

  • dpca4790 2019-05-01 20:22

    strip_tags_content is a function posted on the PHP Strip Tags page, and its author describes it in the words quoted below. You can find more examples at the link.

    Output is: Brand New Apple iPhone 8 64GB or 256GB - Sealed - GSM Unlocked

    "Hi. I made a function that removes the HTML tags along with their contents "

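        // Removes tags together with their contents.
        // $tags:   tags to match, e.g. '<span><b>'.
        // $invert: FALSE removes every tag except those listed in $tags;
        //          TRUE removes only the tags listed in $tags.
        function strip_tags_content($text, $tags = '', $invert = FALSE) {

            // Extract the bare tag names from $tags, e.g. '<span>' gives 'span'.
            preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
            $tags = array_unique($tags[1]);

            if(is_array($tags) AND count($tags) > 0) {
                if($invert == FALSE) {
                    return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
                }
                else {
                    return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
                }
            }
            elseif($invert == FALSE) {
                // No tags listed: remove every tag along with its contents.
                return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
            }
            return $text;
        }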
    
    
        $string = '<h1 class="it-ttl" itemprop="name" id="itemTitle"><span class="g-hdn">Details about  &nbsp;</span>Brand New Apple iPhone 8 64GB or 256GB - Sealed - GSM Unlocked</h1>';
        $string = strip_tags_content($string,'<span>',true);
        $string = strip_tags($string);
    
        echo $string;
    

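    For your problem, after defining this function, just call the lines below. Read innertext rather than plaintext: plaintext has already dropped the tags, so strip_tags_content would have no span left to match.

    $productname = $html->find("h1[class='it-ttl']",0)->innertext;
    $productname = strip_tags_content($productname ,'<span>',true);
    $productname = strip_tags($productname);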
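
As an alternative to the regex approach, here is a minimal sketch that keeps only the text nodes sitting directly under the h1, so anything inside the nested span is skipped. It assumes PHP's bundled DOM extension (DOMDocument, DOMXPath) is enabled. The helper name direct_text and the XPath expression are illustrative and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text that is a direct child of the first
// node matched by $xpath, ignoring text inside nested elements such as <span>.
function direct_text(string $html, string $xpath): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $finder = new DOMXPath($doc);
    $node = $finder->query($xpath)->item(0);
    if ($node === null) {
        return ''; // nothing matched
    }

    $text = '';
    foreach ($node->childNodes as $child) {
        // Keep plain text nodes only; element children (the span) are skipped.
        if ($child->nodeType === XML_TEXT_NODE) {
            $text .= $child->nodeValue;
        }
    }
    return trim($text);
}

$string = '<h1 class="it-ttl" itemprop="name" id="itemTitle"><span class="g-hdn">Details about  &nbsp;</span>Brand New Apple iPhone 8 64GB or 256GB - Sealed - GSM Unlocked</h1>';
echo direct_text($string, '//h1[@id="itemTitle"]');
```

When scraping with Simple HTML DOM as in the question, the outertext of the matched h1 could be passed in as $html. Walking the parsed tree avoids the regex edge cases (nested or unclosed tags), at the cost of parsing the fragment a second time.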
    