I'm trying to pull some data from my website. It seems simple, but I can't find any good examples or docs, so I'm having a tough time. I want to build an API so my friends can use my blog. Let's assume I have a website at http://www.sample.com, and the HTML source for that website is:
<div class="container">
<a href="/mywebsiteblogpost/">
<h2 class="title">im the best</h2>
</a>
<span class="author">Josue Espinosa</span>
<div class="thumb"> <img src="http://www.sample.com/imgsrc" alt="">
<span class="category">sports</span>
</div>
<p>preview text</p>
<a class="more" href="/mywebsiteblogpost/">full text...</a>
</div>
I want to get all of .container's children: the first a child's href value, the text value of the title and author classes, the img src for the child inside .thumb, and the text value for category.
I started with the a href, but I didn't even get that far. I thought $title would contain the href value of the first anchor tag inside the container, but it doesn't work.
$text = file_get_contents('http://www.sample.com');
$doc = new DOMDocument('1.0');
$doc->loadHTML($text);
foreach($doc->getElementsByTagName('div') AS $div) {
    $class = $div->getAttribute('class');
    if(strpos($class, 'container') !== FALSE) {
        // title doesn't retrieve the href value of title :(
        $title = 'TITLE'.$div->getElementsByTagName('a')->getAttribute('href').'<br>';
        // this echoes all the text in all of the children of $div
        echo $div->textContent.'<br>';
    }
}
Can anyone explain why, please?
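
For reference, one likely cause is that getElementsByTagName() returns a DOMNodeList rather than a single element, so getAttribute() cannot be called on it directly; item(0) selects the first match. Below is a minimal sketch of the extraction described above, not a definitive implementation. It assumes the sample markup is held in a string instead of being fetched from the URL, and it swaps the strpos() class check for DOMXPath queries. The hasClass() helper and the $posts array are hypothetical names introduced only for illustration.

```php
<?php
// Sample markup copied from the question; a live page would be fetched instead.
$html = <<<'HTML'
<div class="container">
<a href="/mywebsiteblogpost/">
<h2 class="title">im the best</h2>
</a>
<span class="author">Josue Espinosa</span>
<div class="thumb"> <img src="http://www.sample.com/imgsrc" alt="">
<span class="category">sports</span>
</div>
<p>preview text</p>
<a class="more" href="/mywebsiteblogpost/">full text...</a>
</div>
HTML;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: builds an XPath test that matches a whole class name,
// so "container" does not also match something like "container-wide".
function hasClass(string $name): string {
    return "contains(concat(' ', normalize-space(@class), ' '), ' $name ')";
}

$posts = [];
foreach ($xpath->query('//div[' . hasClass('container') . ']') as $div) {
    // getElementsByTagName() returns a DOMNodeList; item(0) is its first element.
    $link = $div->getElementsByTagName('a')->item(0);

    $posts[] = [
        'href'     => $link ? $link->getAttribute('href') : null,
        // string(...) yields the text of the first matching node, or '' if none.
        'title'    => trim($xpath->evaluate('string(.//*[' . hasClass('title') . '])', $div)),
        'author'   => trim($xpath->evaluate('string(.//*[' . hasClass('author') . '])', $div)),
        'image'    => $xpath->evaluate('string(.//div[' . hasClass('thumb') . ']//img/@src)', $div),
        'category' => trim($xpath->evaluate('string(.//*[' . hasClass('category') . '])', $div)),
    ];
}

print_r($posts);
```

Passing $div as the second argument to evaluate() scopes each query to that container, so a page with several posts would yield one entry per .container. From there, the array could be passed to json_encode() if the goal is an API response.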