douyan2821 2013-06-25 02:14

Parsing several HTML tables with DOM [closed]

While preparing to do the following, I found a lot of information that was not clear, so I thought I'd ask this to see if someone could clear some things up for me.

What exactly is the @ symbol doing in the following?

    $domOb = new DOMDocument();
    $html  = @$domOb->loadHTMLFile('http:...');

This did remove an error and actually parse the data, but is it good practice? I have also used this without the @ symbol and got the expected results.

Given that I have several tables, what is the best/simplest way to get all the <td> elements from, let's say, table 3? I was going to list all the <td> elements and then simply start and end at the indexes that correspond to the needed data.

When parsing HTML via PHP I like the idea of using DOM, so what should I use to get the file: loadHTMLFile() or loadHTML()? Can I still use XPath? If the HTML is very busy or badly marked up, does this matter?

What's good practice for looping through the data?

    $items = $domOb->getElementsByTagName('td');

    // The property name after -> was missing in the post; textContent is assumed here.
    foreach ($items as $item)
    {
        echo $item->textContent, '<br>';
    }

I found "How do you parse and process HTML/XML in PHP?", which is good, but it's 2 years old, so I thought I'd pose a few questions.

Here is a small clip of the 3rd table. At first glance I noticed a space in the 3rd tag; does this affect the results?

    <td>Parcel ID: <a href=... style=text-decoration:underline;><b>666666</b></a></td>
    <td>Name: Mr. help</td></tr><tr>
    <td >Parcel Address: 666 help RD&nbsp;</td>
    <td>Name2: Ms. help F</td></tr><tr><td>City: Helpover 66666</td>
    <td>Address: 6666 6TH AVE NE UNIT 333</td>

2 answers

  • doutangkao2789 2013-06-25 03:00

    What exactly is the @ symbol doing in the following?

    The @ is PHP's error control operator: it suppresses the warnings the expression would otherwise emit. That is not the right way to handle parse errors with DOMDocument and related extensions, though. The correct way is to call libxml_use_internal_errors(true); before loading the malformed HTML.
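
    As a minimal sketch of that approach: the markup below is a made-up stand-in for the real page, and loadHTML() is used on an inline string only so the example is self-contained (the same libxml setting also applies to loadHTMLFile()). The parse errors are collected with libxml_get_errors() rather than silenced, so they can still be inspected or logged.

    ```php
    <?php
    // Hypothetical, deliberately sloppy markup standing in for the real page.
    $html = '<table><tr><td >Parcel Address: 666 help RD&nbsp;</td>'
          . '<td>Name: Mr. help</tr></table><bogus>oops</bogus>';

    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $domOb  = new DOMDocument();
    $loaded = $domOb->loadHTML($html);   // returns a bool, not the document

    // Fetch whatever the parser complained about, then empty the buffer.
    $errors = libxml_get_errors();
    libxml_clear_errors();

    // Restore the previous error-handling mode.
    libxml_use_internal_errors($previous);

    foreach ($errors as $error) {
        echo trim($error->message), ' (line ', $error->line, ")\n";
    }

    $cellCount = $domOb->getElementsByTagName('td')->length;
    echo $cellCount, " cells parsed\n";
    ```

    Which messages show up depends on the libxml2 version behind your PHP build, so treat the error list as diagnostic output rather than something to match against.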

    Can I still use XPath?

    Yes:

    $xpath = new DOMXPath($domOb);
    $tds = $xpath->query('//td');
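
    For the "all <td> from table 3" part of the question, one option is to let XPath pick the table by position instead of slicing a flat list of cells. The sketch below assumes "table 3" means the third <table> in document order; the three tables are invented sample data, not the asker's page.

    ```php
    <?php
    // Hypothetical page with three small tables; only the third one matters.
    $html = '<table><tr><td>a1</td></tr></table>'
          . '<table><tr><td>b1</td></tr></table>'
          . '<table><tr><td>Parcel ID: 666666</td><td>Name: Mr. help</td></tr>'
          . '<tr><td>City: Helpover 66666</td></tr></table>';

    libxml_use_internal_errors(true);
    $domOb = new DOMDocument();
    $domOb->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($domOb);

    // (//table)[3] is the third table in document order; //table[3] would instead
    // match any table that is the third <table> child of its own parent.
    $cells = [];
    foreach ($xpath->query('(//table)[3]//td') as $td) {
        $cells[] = trim($td->textContent);
    }

    print_r($cells);
    ```

    If the page nests tables inside one another, the position in document order may not match what you see in the browser, so selecting by an id or class on the table is usually more robust when one is available.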
    

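    I noticed a space in the 3rd tag; does this affect the results?

    The space inside <td > is harmless: the parser ignores whitespace before the closing >. Entities such as &nbsp; are converted to the characters they represent when you access the textContent property of your TD nodes.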
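
    A small sketch of what that conversion looks like in practice, using two cells copied from the question's clip. The variable names $text, $nbsp and $clean are illustrative. The point to notice is that &nbsp; comes back as a no-break space (U+00A0), which a plain trim() does not remove.

    ```php
    <?php
    // Two cells from the question: the first has a space inside the tag
    // and ends with an &nbsp; entity.
    $html = '<table><tr><td >Parcel Address: 666 help RD&nbsp;</td>'
          . '<td>Name2: Ms. help F</td></tr></table>';

    libxml_use_internal_errors(true);
    $domOb = new DOMDocument();
    $domOb->loadHTML($html);
    libxml_clear_errors();

    $tds  = $domOb->getElementsByTagName('td');
    $text = $tds->item(0)->textContent;

    // DOM strings are UTF-8, so the decoded &nbsp; is the byte pair C2 A0.
    $nbsp  = "\xC2\xA0";
    $clean = trim(str_replace($nbsp, ' ', $text));

    var_dump($clean);
    ```

    Replacing the no-break space with a regular space before trimming is just one way to normalise the value; whether you want to keep it depends on what you do with the text afterwards.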

    This answer was selected as the best answer by the asker.