2017-08-23 10:20
Please see my script below:


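    function getContent() {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, 'http://localhost/test.php/test2.php');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
        $output = curl_exec($ch);
        return $output;
    }

    function getHrefFromLinks($cString) {
        libxml_use_internal_errors(true);  // collect parser warnings instead of printing them

        $dom = new DOMDocument();
        $dom->loadHTML($cString);

        $xpath = new DOMXPath($dom);
        $nodes = $xpath->query('//a/@href');
        foreach ($nodes as $href) {
            echo $href->nodeValue; echo "<br />";        // echo current attribute value
            $href->nodeValue = 'new value';              // set new attribute value
            $href->parentNode->removeAttribute('href');  // remove attribute
        }

        foreach (libxml_get_errors() as $error) {
            // handle or log $error->message here
        }
        libxml_clear_errors();
    }

    echo getHrefFromLinks(getContent());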


The output of http://localhost/test.php/test2.php is:

<a href='/oncelink/index.html'><span class="lsbold">Luck</span> Lucky</a><a href='/oncelink-2/lucky'locki'><span class="lsbold">Luck</span>'s Locki</a>

When echo getHrefFromLinks (getContent()); runs, the output is:

/oncelink/index.html<br />/oncelink-2/lucky<br />

This is wrong, as the output should be:

/oncelink/index.html<br />/oncelink-2/lucky'locki<br />

I understand that the href value in the second link is malformed because it contains an extra apostrophe, but I can't change that as the markup is pre-generated.

My other question is how to get the text inside this span tag:

<span class="lsbold">

Thanks in advance!

  • doulv1760 2017-08-23 11:18

    SOLVED :)

    Well. If it's stupid but it works, then it ain't stupid :D

    I just added the following code at the end:

    $fix = str_replace("href='", 'href="', getContent());
    $fix = str_replace("'>", '">', $fix);
    echo getHrefFromLinks ($fix);
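
The two str_replace calls above rewrite every '> in the page, not only the ones that close an href. A narrower variant could look like the rough sketch below. The helper names requoteHrefs and collectHrefs are made up for illustration, the sample markup is a shortened version of the one in the question, and the regex assumes href is the last attribute in its tag (as it is in that sample).

```php
<?php
// Hypothetical helper, not from the original post: turn href='...' into href="..."
// so an apostrophe inside the value no longer ends the attribute early.
// Assumption: href is the last attribute in its tag, as in the sample markup.
function requoteHrefs(string $html): string
{
    return preg_replace_callback(
        "/href='([^<>]*?)'(?=\\s*>)/",
        function (array $m): string {
            // Escape stray double quotes so the new delimiters stay unambiguous.
            return 'href="' . str_replace('"', '&quot;', $m[1]) . '"';
        },
        $html
    ) ?? $html; // fall back to the untouched input if the regex engine fails
}

// Collect href values with DOMDocument once the quoting has been repaired.
function collectHrefs(string $html): array
{
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    foreach ($xpath->query('//a/@href') as $href) {
        $hrefs[] = $href->nodeValue;
    }
    return $hrefs;
}

$sample = "<a href='/oncelink/index.html'>Lucky</a><a href='/oncelink-2/lucky'locki'>Locki</a>";
echo implode("<br />", collectHrefs(requoteHrefs($sample))), "<br />";
```

On this sample the script should print /oncelink/index.html<br />/oncelink-2/lucky'locki<br />. If a tag carries other attributes after href, the pattern would need adjusting.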

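As for the second question in the original post (the text inside the lsbold span), one possible approach is to query the span relative to each link with XPath. This is only a sketch: the function name getLinksWithSpanText is hypothetical, and the input is assumed to already have well-formed, double-quoted attributes.

```php
<?php
// Sketch: pair each link's href with the text of its <span class="lsbold"> child.
// The function name is hypothetical and not part of the original script.
function getLinksWithSpanText(string $html): array
{
    libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $result = [];
    foreach ($xpath->query('//a[@href]') as $a) {
        // Evaluated relative to the current <a>; string() yields '' when no span matches.
        $spanText = $xpath->evaluate('string(span[@class="lsbold"])', $a);
        $result[] = ['href' => $a->getAttribute('href'), 'span' => $spanText];
    }
    return $result;
}

$html = '<a href="/oncelink/index.html"><span class="lsbold">Luck</span> Lucky</a>';
print_r(getLinksWithSpanText($html));
```

For the single link above, the returned entry has href set to /oncelink/index.html and span set to Luck.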