Parsing several HTML tables with DOM [closed]

While preparing to do the following, I found a lot of information that was unclear, so I thought I'd ask here to see if someone could clear a few things up for me.

What exactly is the @ symbol doing in the following?

 $domOb = new DOMDocument();
 $html  = @$domOb->loadHTMLFile('http:...');

This did remove an error and the data was actually parsed, but is it good practice? I have also used this without the @ symbol and got the expected results.

Given that I have several tables, what is the best/simplest way to get all the <td> elements from, let's say, table 3? I was going to list all the <td> elements and then simply start and end at the indexes that correspond to the needed data.

If I want to parse HTML via PHP, I like the idea of using DOM, so what should I use to get the file: loadHTMLFile() or loadHTML()? Can I still use XPath? Does it matter if the HTML is very busy or badly marked up?

What's good practice for looping through the data?

    $items = $domOb->getElementsByTagName('td');

    $k    = 0;
    $num  = $items->length;
    while ($k < $num)
    {
        echo $item_web = $items->item($k)->nodeValue, '<br>';
        $k++;
    }

I found this, which is good: How do you parse and process HTML/XML in PHP? But it's 2 years old, so I thought I'd pose a few questions.

Here is just a small clip of the 3rd table. At first glance I noticed a space in the 3rd tag; does this affect the results?

 <td>Parcel ID: <a href=... style=text-decoration:underline;><b>666666</b></a></td>
 <td>Name: Mr. help</td></tr><tr>
 <td >Parcel Address: 666 help RD&nbsp;</td>
 <td>Name2: Ms. help F</td></tr><tr><td>City: Helpover 66666</td>
 <td>Address: 6666 6TH AVE NE UNIT 333</td>

doutangkao2789 2013-06-25 03:00
What exactly is the @ symbol doing in the following?

It suppresses error messages, but that is not the right way to handle them with DOMDocument and related extensions. The correct way is to call libxml_use_internal_errors(true); before loading the malformed HTML.
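
As a rough sketch of that approach (the markup string below is a made-up stand-in for the remote page, and loadHTML() is used in place of loadHTMLFile() only so the example runs without network access), the parse errors can be collected and inspected instead of being silenced with @:

```php
<?php
// Hypothetical malformed markup standing in for the remote page.
$html = '<table><tr><td>Name: Mr. help</td><td >City: Helpover</tr></table><bogus>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$domOb  = new DOMDocument();
$loaded = $domOb->loadHTML($html);

// Fetch whatever the parser complained about, then reset the error buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError with message, line, column, level, ...
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

echo $loaded ? "parsed\n" : "failed\n";
```

Which messages are reported, if any, depends on the installed libxml2 version, so the loop may print nothing for markup like this.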

Can I still use XPath?

Yes:

    $xpath = new DOMXPath($domOb);
    $tds   = $xpath->query('//td');
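
For the "all <td> from table 3" part of the question, one possible sketch is to let XPath select the table by position rather than counting cells by hand. The page content below is invented for illustration, and the expression assumes the tables are not nested, since (//table)[3] counts every table in document order:

```php
<?php
// Hypothetical page with three tables; only the third one is of interest.
$html = <<<HTML
<html><body>
<table><tr><td>a1</td></tr></table>
<table><tr><td>b1</td></tr></table>
<table>
  <tr><td>Name: Mr. help</td><td >Parcel Address: 666 help RD&nbsp;</td></tr>
  <tr><td>City: Helpover 66666</td></tr>
</table>
</body></html>
HTML;

libxml_use_internal_errors(true);
$domOb = new DOMDocument();
$domOb->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($domOb);

// (//table)[3] picks the third table in document order;
// //td then matches every cell inside it.
$tds = $xpath->query('(//table)[3]//td');

$cells = [];
foreach ($tds as $td) {
    $cells[] = $td->textContent;
}

print_r($cells);
```

A DOMNodeList can be iterated with foreach directly, which avoids the manual counter and item($k) calls.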

I noticed a space in the 3rd tag; does this affect the results?

Entities are converted when you access the textContent property of your TD nodes, so &nbsp; comes back as a non-breaking space character.
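
A small sketch of what that means for the clip in the question, assuming the cell is parsed from a string as below: the extra space in <td > is just whitespace inside the tag and the element is parsed as a normal td, while &nbsp; arrives as U+00A0, which plain trim() does not strip.

```php
<?php
// One cell taken from the question's clip, wrapped in a minimal table.
$html = '<table><tr><td >Parcel Address: 666 help RD&nbsp;</td></tr></table>';

libxml_use_internal_errors(true);
$domOb = new DOMDocument();
$domOb->loadHTML($html);
libxml_clear_errors();

$td   = $domOb->getElementsByTagName('td')->item(0);
$text = $td->textContent;

// &nbsp; is now U+00A0 (bytes C2 A0 in UTF-8). Replace it with a regular
// space first, because trim() alone would leave it in place.
$clean = trim(str_replace("\u{00A0}", ' ', $text));

echo $clean, "\n";
```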
This answer was accepted by the asker.
