dssjxvbv918586 2009-09-17 11:43

PHP: extracting data from HTML using DOMDocument

I have a table with the following structure. I cannot seem to get the data I want.

<table class="gsborder" cellspacing="0" cellpadding="2" rules="cols" border="1" id="d00">
    <tr class="gridItem">
        <td>Code</td><td>0adf</td>
    </tr><tr class="AltItem">
        <td>CompanyName</td><td>Some Company</td>
    </tr><tr class="Item">
        <td>Owner</td><td>Jim Jim</td>
    </tr><tr class="AltItem">
        <td>DivisionName</td><td>&nbsp;</td>
    </tr><tr class="Item">
        <td>AddressLine1</td><td>9314 W. SPRING ST.</td>
    </tr>
</table>

This table is, of course, nested within another table on the page. How can I use DOMDocument, for example, to refer to "Code" and "0adf" as a key-value pair? They don't actually need to be a key-value pair, but I should be able to access each of them separately.

EDIT:

Using the PHP Simple HTML DOM Parser, I was able to extract the data I needed with this:

  $foo = $html->getElementById("d00")->childNodes(1)->childNodes(1);

The problem with this, though, is that I get the two <td></td> tags along with my data. Is there a way to grab only the raw data, without the tags?

Also, is this the right way to get my data out of this table?


1 answer

  • dounaidu0204 2009-09-17 12:05

    If you're not dead set on using DOMDocument, try using the PHP Simple HTML DOM Parser. This has the benefit of allowing you to parse HTML which is not valid XML as well as providing a nicer interface to the parsed document.

    You could write something like:

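    // Assumes the Simple HTML DOM library has already been included.
    $html = str_get_html(...);
    foreach ($html->find('tr') as $tr)
    {
      // ->plaintext returns the cell's text without the surrounding <td> tags
      print 'First td: ' . $tr->find('td', 0)->plaintext . "\n";
      print 'Second td: ' . $tr->find('td', 1)->plaintext . "\n";
    }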
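
If you would rather stay with the built-in DOM extension, the same idea can be sketched with DOMDocument and DOMXPath. This is a minimal sketch, not a definitive implementation: it assumes the table can be located by its id "d00" as in the question, the variable name $data is made up for illustration, and the inline markup stands in for the real page. Each row's first cell becomes the key and its second cell the value; textContent gives the cell text without the <td> tags.

```php
<?php
// Stand-in for the real page: the table from the question, trimmed to the relevant attributes.
$html = <<<'HTML'
<table class="gsborder" id="d00">
    <tr class="gridItem"><td>Code</td><td>0adf</td></tr>
    <tr class="AltItem"><td>CompanyName</td><td>Some Company</td></tr>
    <tr class="Item"><td>Owner</td><td>Jim Jim</td></tr>
    <tr class="AltItem"><td>DivisionName</td><td>&nbsp;</td></tr>
    <tr class="Item"><td>AddressLine1</td><td>9314 W. SPRING ST.</td></tr>
</table>
HTML;

// Collect parser warnings internally instead of printing them;
// real-world HTML is rarely clean.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// $data is a hypothetical name: label (first cell) => value (second cell).
$data = [];
foreach ($xpath->query('//table[@id="d00"]//tr') as $tr) {
    $cells = $xpath->query('./td', $tr);
    if ($cells->length < 2) {
        continue; // skip rows that do not have a label and a value
    }
    $key = trim($cells->item(0)->textContent);
    // &nbsp; is decoded to U+00A0, which trim() does not strip by default,
    // so turn it into a regular space first.
    $value = trim(str_replace("\u{00A0}", ' ', $cells->item(1)->textContent));
    $data[$key] = $value;
}

echo $data['Code'], PHP_EOL;        // prints "0adf"
echo $data['CompanyName'], PHP_EOL; // prints "Some Company"
```

Because the query is anchored on the table's id, rows from the outer table that wraps it are not picked up. The "\u{00A0}" escape needs PHP 7 or later.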
    
    This answer was accepted by the asker as the best answer.