I am struggling to scrape a table, either via XPath or any of the 'getElement' methods. I have searched around and tried several approaches to the problem below, but have come up short and would really appreciate any help.
First, the HTML I am trying to scrape is the second table in the document and looks like this:
<table class="table2" border="1" cellspacing="0" cellpadding="3">
<tbody>
<tr><th colspan="8" align="left">Status Information</th></tr>
<tr><th align="left">Status</th><th align="left">Type</th><th align="left">Address</th><th align="left">LP</th><th align="left">Agent Info</th><th align="left">Agent Email</th><th align="left">Phone</th><th align="center">Email Tmplt</th></tr>
<tr></tr>
<tr>
<td align="left">Active</td>
<td align="left">Resale</td>
<td align="center">*Property Address*</td>
<td align="right">*Price*</td>
<td align="center">*Agent Info*</td>
<td align="center">*Agent Email*</td>
<td align="center">*Agent Phone*</td>
<td align="center"> </td>
</tr>
<tr>
<td align="left">Active</td>
<td align="left">Resale</td>
<td align="center">*Property Address*</td>
<td align="right">*Price*</td>
<td align="center">*Agent Info*</td>
<td align="center">*Agent Email*</td>
<td align="center">*Agent Phone*</td>
<td align="center"> </td>
</tr>
...etc
Additional trs follow, each containing 8 tds with the same kind of information as shown above.
What I need to do is iterate through the trs and the tds inside them to pick up each piece of information (inside the td) for each entry (inside the tr).
Here is the code I have been struggling with:
<?php
$payload = array(
    'http'=>array(
        'method'=>"POST",
        'content'=>'key=value'
    )
);
stream_context_set_default($payload);
$dom = new DOMDocument();
libxml_use_internal_errors(TRUE);
$dom->loadHTMLFile('website-scraping-from.com');
libxml_clear_errors();
foreach ($dom->getElementsByTagName('tr') as $row){
    foreach($dom->$row->getElementsByTagName('td') as $node){
        echo $node->textContent . "<br/>";
    }
}
?>
This code is not returning nearly what I need, and I am having a lot of trouble figuring out how to fix it. Perhaps XPath is a better route for finding the table and the information I need, but I have come up empty with that method as well. Any information is much appreciated.
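For reference, here is a minimal sketch of what the XPath route might look like. It runs against an inline sample string rather than the real page, so the sample markup, the placeholder cell values, and the helper name `extractRows` are all assumptions made for illustration. One likely culprit in the loop above is `$dom->$row->getElementsByTagName('td')`: `$row` is already a DOMElement, so the lookup would normally be made on `$row` directly (or through an XPath query relative to it, as here).

```php
<?php
// Sample markup standing in for the fetched page. Assumption: the real page
// has the same shape, with the target being the second <table> in the document.
$html = <<<'MARKUP'
<html><body>
<table><tr><td>first table</td></tr></table>
<table class="table2" border="1" cellspacing="0" cellpadding="3">
<tbody>
<tr><th colspan="8" align="left">Status Information</th></tr>
<tr><th>Status</th><th>Type</th><th>Address</th><th>LP</th><th>Agent Info</th><th>Agent Email</th><th>Phone</th><th>Email Tmplt</th></tr>
<tr></tr>
<tr><td>Active</td><td>Resale</td><td>1 Main St</td><td>100</td><td>A. Agent</td><td>a@example.com</td><td>555-0100</td><td> </td></tr>
<tr><td>Sold</td><td>Resale</td><td>2 Main St</td><td>200</td><td>B. Agent</td><td>b@example.com</td><td>555-0101</td><td> </td></tr>
</tbody>
</table>
</body></html>
MARKUP;

// Hypothetical helper: returns one array of trimmed cell texts per data row.
function extractRows(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // silence warnings from sloppy markup
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $rows = [];
    // (//table)[2] picks the second table in document order;
    // tr[td] keeps only rows that have <td> children, which skips
    // the header rows and the empty <tr></tr>.
    foreach ($xpath->query('(//table)[2]//tr[td]') as $tr) {
        $cells = [];
        // Query relative to the current row, not the whole document.
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

$rows = extractRows($html);
foreach ($rows as $cells) {
    echo implode(' | ', $cells), "\n";
}
```

If the class name is stable, `//table[@class="table2"]//tr[td]` might be a more robust selector than relying on the table's position in the document.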
If it matters, my end goal is to take the table data and insert each row into a database when its first td has a value of "Active".
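To make that end goal concrete, here is a rough sketch of the filtering and insert step. Everything about the database is an assumption for illustration: an in-memory SQLite database through PDO (which needs the pdo_sqlite extension), a hypothetical `listings` table, and made-up column names and row values. The rows are hard-coded in the shape of the table above so that the snippet stands on its own.

```php
<?php
// Hypothetical rows, one array per <tr>, eight cells each as in the table above.
$rows = [
    ['Active', 'Resale', '1 Main St', '100', 'A. Agent', 'a@example.com', '555-0100', ''],
    ['Sold',   'Resale', '2 Main St', '200', 'B. Agent', 'b@example.com', '555-0101', ''],
];

// Assumption: in-memory SQLite via PDO; replace the DSN with the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE listings (
    status TEXT, type TEXT, address TEXT, price TEXT,
    agent_info TEXT, agent_email TEXT, agent_phone TEXT
)');

// A prepared statement keeps the scraped text out of the SQL string itself.
$insert = $pdo->prepare(
    'INSERT INTO listings
        (status, type, address, price, agent_info, agent_email, agent_phone)
     VALUES (?, ?, ?, ?, ?, ?, ?)'
);

$inserted = 0;
foreach ($rows as $cells) {
    // Skip malformed rows and anything whose first cell is not "Active".
    if (count($cells) < 7 || $cells[0] !== 'Active') {
        continue;
    }
    // The eighth cell (Email Tmplt) is blank in the sample, so only seven are stored.
    $insert->execute(array_slice($cells, 0, 7));
    $inserted++;
}

echo "Inserted $inserted row(s)\n";
```

Filtering in PHP before the insert keeps the database free of non-active rows; whether the eighth column needs to be stored at all depends on what the real page puts there.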