Please see my script below :
<?php
function getContent ()
{
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, 'http://localhost/test.php/test2.php');
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
$output=curl_exec($ch);
curl_close($ch);
return $output;
}
function getHrefFromLinks ($cString){
libxml_use_internal_errors(true);
$dom = new DomDocument();
$dom->loadHTML($cString);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//a/@href');
foreach($nodes as $href) {
echo $href->nodeValue; echo "<br />"; // echo current attribute value
$href->nodeValue = 'new value'; // set new attribute value
$href->parentNode->removeAttribute('href'); // remove attribute
}
foreach (libxml_get_errors() as $error) {
}
libxml_clear_errors();
}
echo getHrefFromLinks (getContent());
?>
The output of http://localhost/test.php/test2.php is :
<a href='/oncelink/index.html'><span class="lsbold">Luck</span> Lucky</a><a href='/oncelink-2/lucky'locki'><span class="lsbold">Luck</span>'s Locki</a>
When echo getHrefFromLinks (getContent()); runs, the output is :
/oncelink/index.html<br />/oncelink-2/lucky<br />
This is wrong, as the output should be :
/oncelink/index.html<br />/oncelink-2/lucky'locki<br />
I understand that the href value generated from the link is somehow incorrect as it includes an additional apostrophe but I won't be able to change that as it is pre-generated.
The other question is, how can I get the value of the span tag :
<span class="lsbold">
Thanks in advance!