douyan2821 2013-06-25 02:14

Parsing several HTML tables with DOM [closed]

While preparing to do the following, I found a lot of information that was not clear, so I thought I'd ask this to see if someone could clear some things up for me.

What exactly is the @ symbol doing in the following?

 $domOb = new DOMDocument();
 $html  = @$domOb->loadHTMLFile('http:...'); 

This removed an error and the data was actually parsed, but is it good practice? I have also used this without the @ symbol and got the expected results.

Given that I have several tables, what is the best/simplest way to get all the <td> elements from, let's say, table 3? I was going to list all the <td> elements and then simply start and end at the indexes that correspond to the needed data.

If looking to parse HTML via PHP, I like the idea of using DOM, so what should I use to get the file: loadHTMLFile() or loadHTML()? Can I still use XPath? Does it matter if the HTML is very busy or badly marked up?

What's good practice for looking through the data?

    $items = $domOb->getElementsByTagName('td');

    $k    = 0;
    $num  = $items->length;
    while ($k < $num)
    {
        // The property name is missing in the original post; nodeValue is assumed here.
        echo $item_web = $items->item($k)->nodeValue, '<br>';
        $k++;
    }

I found "How do you parse and process HTML/XML in PHP?", which is good, but it's 2 years old, so I thought I'd pose a few questions.

Here is just a small clip of the 3rd table. At first glance I noticed a space in the 3rd tag. Does this affect the results?

 <td>Parcel ID: <a href=... style=text-decoration:underline;><b>666666</b></a></td>
 <td>Name: Mr. help</td></tr><tr>
 <td >Parcel Address: 666 help RD&nbsp;</td>
 <td>Name2: Ms. help F</td></tr><tr><td>City: Helpover 66666</td>
 <td>Address: 6666 6TH AVE NE UNIT 333</td>

2 answers

  • doutangkao2789 2013-06-25 03:00

    what exactly is the @ symbol doing to the following

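    The @ operator suppresses the error messages produced by the expression it precedes (here, the warnings raised for malformed markup), but this is not the right way to handle them with DOMDocument and related extensions. The correct way is to call libxml_use_internal_errors(true); before loading the malformed HTML.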
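
As a minimal sketch of that approach, the following loads a deliberately sloppy fragment with internal error handling switched on and then lists whatever the parser reported. The markup string is a made-up sample loosely modelled on the question's snippet, and loadHTML() is used instead of loadHTMLFile() only so the example needs no network access.

```php
<?php
// Hypothetical sample markup: an unclosed <td> and an unknown tag.
$html = '<table><tr><td >Parcel Address: 666 help RD&nbsp;</td>'
      . '<td>Name: Mr. help</tr></table><bogus>x</bogus>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$domOb  = new DOMDocument();
$loaded = $domOb->loadHTML($html);

// Read whatever the parser complained about, then empty the buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the previous error-handling mode.
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError object (level, code, line, message).
    echo trim($error->message), ' (line ', $error->line, ")\n";
}

echo $domOb->getElementsByTagName('td')->length, " cells parsed\n";
```

Unlike @, this keeps the errors available for logging or inspection rather than discarding them silently.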

    Can I still use XPath?

    Yes:

    $xpath = new DomXPath($domOb);
    $tds = $xpath->query('//td');
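
The same query mechanism can also address the "all the <td> from table 3" part of the question. The sketch below assumes the tables are not nested, so that the third table in document order is the one wanted; the three-table markup is a hypothetical stand-in for the real page.

```php
<?php
// Hypothetical stand-in for the page in the question: three small tables.
$html = '<html><body>'
      . '<table><tr><td>a1</td></tr></table>'
      . '<table><tr><td>b1</td></tr></table>'
      . '<table><tr><td>Name: Mr. help</td><td>City: Helpover 66666</td></tr></table>'
      . '</body></html>';

libxml_use_internal_errors(true);
$domOb = new DOMDocument();
$domOb->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($domOb);

// (//table)[3] picks the third table in document order;
// //td then selects every cell below it.
$tds = $xpath->query('(//table)[3]//td');

$cells = [];
foreach ($tds as $td) {
    // A DOMNodeList can be iterated directly, no index counter needed.
    $cells[] = trim($td->textContent);
}

print_r($cells);
```

Note the parentheses: //table[3] would instead match any table that is the third table child of its own parent, which is not the same thing when tables sit in different containers.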
    

    I noticed a space at the 3rd tag does this affect the results?

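    The space in <td > is ignored by the parser, so it does not affect the results. Entities such as &nbsp; are converted to the corresponding characters when you access the textContent property of your TD nodes.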
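
To illustrate the entity conversion, here is a small sketch using one cell from the question's snippet. It assumes you want the trailing non-breaking space removed; the regular expression is just one way to do that, since plain trim() does not strip U+00A0 by default.

```php
<?php
$html = '<table><tr><td >Parcel Address: 666 help RD&nbsp;</td></tr></table>';

libxml_use_internal_errors(true);
$domOb = new DOMDocument();
$domOb->loadHTML($html);
libxml_clear_errors();

$td  = $domOb->getElementsByTagName('td')->item(0);

// textContent is returned as UTF-8; &nbsp; has become U+00A0 (bytes C2 A0).
$raw = $td->textContent;

// Strip ordinary whitespace and non-breaking spaces from both ends.
$clean = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $raw);

echo bin2hex(substr($raw, -2)), "\n"; // hex of the last two bytes of the raw text
echo $clean, "\n";
```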

    This answer was selected by the asker as the best answer.