Hi, I am using the following to extract the li content. Everything works fine, but I would also like to include the <b> and <a> tags as well as the text. What do I need to write instead of "$element2->plaintext"?
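// Load the PHP Simple HTML DOM Parser, which provides file_get_html() (adjust the path as needed)
include 'simple_html_dom.php';
// Create DOM from URL or file
$html = file_get_html('2002-10-01.html');
// Find all table rows with the #FFCCCC headline background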
foreach ($html->find('table tr[bgcolor=#FFCCCC]') as $element) {
    foreach ($element->find('li') as $element2) {
        // Just the plain text
        echo $element2->plaintext . '<br>';
        // Text plus the HTML elements inside the li (the property is innertext, not html)
        echo $element2->innertext . '<br>';
    }
}
Here is the HTML I am extracting:
<tr bgcolor="#FFCCCC"> <!-- HEADLINE TEXT -->
<td class="blue_body">
<ul>
<li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">As
<b>Bertelsmann</b> continues to haggle with Clive
Calder over how much it must pay to buy his Zomba
independent record company, it plans to consolidate
<b>Zomba</b> under its RCA label. <a href="http://www.nypost.com/business/58425.htm">NYPost</a>
</font>
<li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">News
Corporation and Telecom Italia are expected to announce
a deal today to acquire the Italian satellite television
operation of <b>Vivendi Universal</b> for $464m in
cash. <a href="http://www.nytimes.com/2002/10/01/business/media/01RUPE.html">NYTimes</a>
</font>
<li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">Two
years after <b>America Online</b> agreed to acquire
Time Warner, Ted Turner has soured on both the merger
and Stephen Case, its principal architect. <a href="http://nytimes.com/2002/10/01/technology/01AOL.html">NYTimes</a>
</font>
</ul>
</td>
</tr>