I have a string containing also HTML in a $html
variable:
'Here is some <a href="#">text</a> which I do not need to extract but then there are
<figure class="class-one">
<img src="/example.jpg" alt="example alt" class="some-image-class">
<figcaption>example caption</figcaption>
</figure>
And another one (and many more)
<figure class="class-one some-other-class">
<img src="/example2.jpg" alt="example2 alt">
</figure>'
I want to extract all <figure>
elements and everything they contain including their attributes and other html-elements and put this in an array in PHP so I would get something like:
$figures = [
0 => [
"class" => "class-one",
"img" => [
"src" => "/example.jpg",
"alt" => "example alt",
"class" => "some-image-class"
],
"figcaption" => "example caption"
],
1 => [
"class" => "class-one some-other-class",
"img" => [
"src" => "/example2.jpg",
"alt" => "example2 alt",
"class" => null
],
"figcaption" => null
]];
So far I have tried:
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
$figures = array();
foreach ($figures as $figure) {
$figures['class'] = $figure->getAttribute('class');
// here I tried to create the whole array but I can't seem to get the values from the HTML
// also I'm not sure how to get all html-elements within <figure>
}
Here is a Demo.