How to exclude duplicated DOMDocument elements

I am trying to pull titles from a page. Everything seems to work so far, but I'm getting doubled results. For example, I'm extracting h3 titles: each title appears once on the rendered page but twice in the source.

Here is an example:

<span data-img-type='cvr' data-img-att-alt='Cover of Greek Mythology' data-img-size-xs='image.jpg'></span>
<h3> Cover of Greek Mythology </h3>

This returns:

Cover of Greek Mythology
Cover of Greek Mythology

I'm targeting only h3 elements, but they still appear doubled. How can I remove the repeated elements?

Here is what I have so far:

$html = file_get_contents('https://example.com/'); 

$scriptDocument = new DOMDocument();

libxml_use_internal_errors(TRUE); 

if(!empty($html)){ 

    $scriptDocument->loadHTML($html);
    libxml_clear_errors(); 
    $scriptDOMXPath = new DOMXPath($scriptDocument);
    // Select every h3 element that has a class attribute
    $scriptRow = $scriptDOMXPath->query('//h3[@class]');
    // Only loop if the query matched at least one node
    if($scriptRow->length > 0){
        foreach($scriptRow as $row){
            echo $row->nodeValue . "<br/>";
        }
    }
}
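
One possible way to drop the repeats, sketched under assumptions: the query `//h3[@class]` matches only h3 elements, so the doubled output most likely means the source contains the same h3 text twice (the span's `data-img-att-alt` attribute is not matched by it). If that is the case, the texts can be collected, normalized, and passed through `array_unique()`. The helper name `uniqueNodeTexts` and the sample markup are made up for illustration and are not taken from the real page.

```php
<?php
// Hypothetical helper: returns the distinct, whitespace-normalized text
// of every node matched by $query, in first-seen order.
function uniqueNodeTexts(string $html, string $query): array
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $titles = [];
    foreach ($xpath->query($query) as $node) {
        // Collapse whitespace so " Cover of X " and "Cover of X" compare equal.
        $text = trim(preg_replace('/\s+/', ' ', $node->textContent));
        if ($text !== '') {
            $titles[] = $text;
        }
    }
    // array_unique() keeps the first occurrence of each value.
    return array_values(array_unique($titles));
}

// Sample markup standing in for the real page (an assumption).
$html = '<div><h3 class="t"> Cover of Greek Mythology </h3>'
      . '<h3 class="t">Cover of Greek Mythology</h3>'
      . '<h3 class="t">Another Title</h3></div>';

foreach (uniqueNodeTexts($html, '//h3[@class]') as $title) {
    echo htmlspecialchars($title) . "<br/>";
}
```

This removes repeats by text rather than by node, so two genuinely different h3 elements with identical text would also collapse into one line. If the duplicates always sit inside a particular container, narrowing the XPath expression to exclude that container may be the cleaner fix.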
