Possible Duplicate:
How to parse and process HTML with PHP?
I'm trying to use PHP and regex to grab hyperlinks from an external page. The links I want to scrape are structured as follows:
<li class="magic"><a href="http://blah.com">TargetText1</a></li>
<li class="magic"><a href="http://blah.com">TargetText2</a></li>
Please bear in mind that I'm trying to get the anchor text, NOT the URL. The code below works, but it scrapes every link on the page. I only want the links wrapped in an li with the class shown above.
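$url = "http://www.example.com";
// Fetch the page; @ suppresses the warning so die() reports the failure instead.
$input = @file_get_contents($url) or die("Could not access file: $url");
// Groups: 1 = optional quote, 2 = href value, 3 = anchor text.
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
// s: dot also matches newlines, i: case-insensitive, U: inverts quantifier greediness.
if (preg_match_all("/$regexp/siU", $input, $matches)) {
    print_r($matches);
}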
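To make the goal concrete, here is a minimal sketch of the filtering I'm after. It uses PHP's built-in DOMDocument and DOMXPath rather than regex, so it is a different technique from the code above, not a fix to it. The function name magicAnchorTexts and the inline sample markup are made up for illustration; only the li class "magic" and the TargetText values come from the markup shown earlier.

```php
<?php
// Hypothetical helper (not part of the original code): returns the anchor
// text of every <a> that is a direct child of an <li> with class "magic".
function magicAnchorTexts(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect libxml warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "magic" as a whole class token, so class="magic other" also counts.
    $query = '//li[contains(concat(" ", normalize-space(@class), " "), " magic ")]/a';

    $texts = [];
    foreach ($xpath->query($query) as $anchor) {
        $texts[] = trim($anchor->textContent);
    }
    return $texts;
}

// Inline sample modelled on the markup in the question (assumed, for illustration).
$sample = '<ul>'
    . '<li class="magic"><a href="http://blah.com">TargetText1</a></li>'
    . '<li class="other"><a href="http://blah.com">Ignored</a></li>'
    . '<li class="magic"><a href="http://blah.com">TargetText2</a></li>'
    . '</ul>';

print_r(magicAnchorTexts($sample));
```

For a live page, the string returned by file_get_contents($url) would be passed in place of $sample. The link inside the li with class "other" is skipped, which is the behaviour the regex version currently lacks.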