I have this code to extract statements from multiple pages of a forum site. It works great, except that it prints each statement three times instead of once. I have checked every line and still don't understand why.
My code is:
<?php
set_time_limit(3600);
require_once('dom/simple_html_dom.php'); // load the parser once, outside the loop

for ($i = 0; $i < 100; $i++)
{
    $e = 839303 - $i; // page IDs count down from 839303
    $html = file_get_html('http://www.usmleforum.com/files/forum/2017/1/'.$e.'.php');
    if (!$html) { continue; } // skip pages that could not be fetched or parsed
    foreach ($html->find("tr") as $row)
    {
        $element = $row->find('td.Text2', 0);
        if ($element == null) { continue; }
        // Keep only the text nodes of the cell
        $textNode = array_filter($element->nodes, function ($n)
        {
            return $n->nodetype == 3; // text node type, like in jQuery
        });
        if (!empty($textNode))
        {
            $text = current($textNode);
            echo $text."<br>";
        }
    }
}
?>
On the other hand, if the page being parsed contains the same statement more than once, hidden somewhere in the markup, can we ask the parser to echo it only once?
Any help is appreciated.
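To illustrate the "echo only once" idea, here is a minimal sketch. It assumes the statements are first collected as plain strings instead of being echoed immediately. The helper name uniqueStatements and the sample data are hypothetical and not part of simple_html_dom:

```php
<?php
// Hypothetical helper (not part of simple_html_dom): keeps the first
// occurrence of each statement and drops later repeats.
function uniqueStatements(array $statements): array
{
    $seen = [];
    $unique = [];
    foreach ($statements as $statement) {
        // Normalise whitespace so "foo " and "foo" count as the same text.
        $key = trim(preg_replace('/\s+/', ' ', $statement));
        if ($key === '' || isset($seen[$key])) {
            continue; // skip empty strings and anything already seen
        }
        $seen[$key] = true;
        $unique[] = $key;
    }
    return $unique;
}

// Sample input standing in for the text collected from the td.Text2 cells.
$collected = ['First post', 'First post', 'First post', 'Second post'];
foreach (uniqueStatements($collected) as $statement) {
    echo $statement . "<br>";
}
```

This only hides the repeats. It does not explain why the same cell is matched three times in the first place.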
I am also trying to parse the user details, but it is not working; it seems to skip them:
// User
$element = $html->find('td.FootNotes2', 0);
if ($element == null) { continue; }
$textNode = array_filter($element->nodes, function ($n) {
    return $n->nodetype == 3;
});
if (!empty($textNode)) {
    $text = current($textNode);
    echo $text."<br><hr><hr>";
}