dszpyf4859 2016-03-29 14:29

Extracting tags as well as text with PHP Simple HTML DOM Parser

Hi, I am using the following code to extract the li content. Everything works fine, but I would also like to include the HTML tags inside each li (such as <b> and <a> in the sample below) along with the text. What do I need to write instead of "$element2->plaintext"?

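    // Load the library (assumed to be in the same directory; adjust the path as needed)
    include 'simple_html_dom.php';

    // Create DOM from URL or file
    $html = file_get_html('2002-10-01.html');

    // Find all highlighted table rows, then every li inside each row
    foreach ($html->find('table tr[bgcolor=#FFCCCC]') as $element) {
        foreach ($element->find('li') as $element2) {
            // Just the plain text (this works)
            echo $element2->plaintext . '<br>';

            // My attempt at plain text plus HTML elements (this does not output the markup)
            echo $element2->html . '<br>';
        }
    }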

Here is the HTML I am extracting from:

<tr bgcolor="#FFCCCC"> <!-- HEADLINE TEXT --> 
                      <td class="blue_body"> 
                        <ul>
                          <li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">As 
                            <b>Bertelsmann</b> continues to haggle with Clive 
                            Calder over how much it must pay to buy his Zomba 
                            independent record company, it plans to consolidate 
                            <b>Zomba</b> under its RCA label. <a href="http://www.nypost.com/business/58425.htm">NYPost</a> 
                            </font>
                          <li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">News 
                            Corporation and Telecom Italia are expected to announce 
                            a deal today to acquire the Italian satellite television 
                            operation of <b>Vivendi Universal</b> for $464m in 
                            cash. <a href="http://www.nytimes.com/2002/10/01/business/media/01RUPE.html">NYTimes</a> 
                            </font>
                          <li><font size="2" face="Arial, Helvetica, sans-serif" color="#000000">Two 
                            years after <b>America Online</b> agreed to acquire 
                            Time Warner, Ted Turner has soured on both the merger 
                            and Stephen Case, its principal architect. <a href="http://nytimes.com/2002/10/01/technology/01AOL.html">NYTimes</a> 
                            </font> 
                        </ul>
                      </td>
                    </tr>

1 answer

  • dousha7645 2016-03-29 14:38

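    Found the answer: use outertext instead of plaintext. It returns the element's own tag together with everything inside it.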

    echo $element2->outertext . '<br>';  
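
To make the difference between the three text properties concrete, here is a minimal sketch. It assumes the library file is named simple_html_dom.php and sits next to the script; the `$fragment` string is a shortened, hypothetical stand-in for one list item from the page in the question, not the real file.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical fragment modelled on one <li> from the question's page.
$fragment = '<ul><li>As <b>Bertelsmann</b> plans. '
          . '<a href="http://www.nypost.com/business/58425.htm">NYPost</a></li></ul>';

$dom = str_get_html($fragment);
$li  = $dom->find('li', 0);   // first <li> in the fragment

$plain = $li->plaintext;      // text only, all tags stripped
$inner = $li->innertext;      // markup inside the <li>, without the <li> tag itself
$outer = $li->outertext;      // the <li> tag plus everything inside it

echo $plain, PHP_EOL;
echo $inner, PHP_EOL;
echo $outer, PHP_EOL;
```

If the surrounding li tag is not wanted, innertext is probably the better fit. When the result is echoed into a web page, the browser renders the tags; wrapping the value in htmlspecialchars() shows the markup itself instead.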
    