duanlie3187 2017-11-08 08:13
Views: 31
Accepted

Storing DOM elements for use in a website's news section

I have been able to use file_get_contents to go through a website's news section and grab the title text from each article. How would I then store that information and use it in a section on my own website?

My PHP:

<?php
$html = file_get_contents("https://www.coindesk.com/category/news/");

$dom = new DomDocument();
$internalErrors = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($internalErrors);
$finder = new DomXPath($dom);
$classname="fade";
$nodes = $finder->query("//*[contains(@class, '$classname')]");
foreach ($nodes as $node) {
    echo $node->nodeValue."<br>"; 
} 
?>

Where I want to store it:

<div id="box5" class="toggle" style="display: none;">
        <div id="services" class="services">
                <div class="container" >
                    <div class="service-head text-center">
                        <h2>NEWS</h2>
                        <span> </span>

                    </div>
                <button class="accordion">STORE THE POST TITLE HERE</button>
                <div class="panel1">
                  <p>STORE THE POST SUMMARY HERE WITH LINKS TO ARTICLE</p>
                </div>

                <button class="accordion">Section 2</button>
                <div class="panel1">
                  <p></p>
                </div>

                <button class="accordion">Section 3</button>
                <div class="panel1">
                  <p></p>
                </div>
          </div>
        </div>
      </div>

1 answer

  • douxin2003 2017-11-08 08:41

    Fairly straightforward to do: once the XPath expressions have matched the content, store the node contents in an array or object. That array can then be used later on the same page, saved to a database, or added to a session for use on another page.

    /* source url */
    $url='https://www.coindesk.com/category/news/';
    
    /* store results in this array */
    $output=array();
    
    /* XPath expressions */
    $exp=new stdClass;
    $exp->articles='//div[@id="content"]/div[ contains(@class,"article") ]/div[@class="post-info"]';
    $exp->title='h3/a';
    $exp->description='p[@class="desc"]';
    
    /* Load the source url directly into DOMDocument */
    $dom=new DOMDocument;
    $dom->validateOnParse=false;
    $dom->standalone=true;
    $dom->preserveWhiteSpace=true;
    $dom->strictErrorChecking=false;
    $dom->substituteEntities=false;
    $dom->recover=true;
    $dom->formatOutput=true;
    libxml_use_internal_errors( true ); /* collect parser warnings instead of emitting them */
    $dom->loadHTMLFile( $url );
    libxml_clear_errors();
    
    /* Query the DOM and process nodes found */
    $xp=new DOMXPath( $dom );
    $col=$xp->query( $exp->articles );
    
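    if( $col && $col->length > 0 ){
        foreach( $col as $node ){
            /* item(0) is null when an article lacks the element, so check before reading nodeValue */
            $title=$xp->query($exp->title,$node)->item(0);
            $description=$xp->query($exp->description,$node)->item(0);
            $output[]=(object)array(
                'title'         =>  $title ? trim( $title->nodeValue ) : '',
                'description'   =>  $description ? trim( $description->nodeValue ) : ''
            );
        }
    }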
    $dom = $xp = $col = $node = null;
    
    
    /* 
        The contents of the scrape are stored in the $output array
        and can be used wherever on the page you wish, or stored
        as a session variable and used elsewhere.
    */
    if( !empty( $output ) ){
        /*
            removed `display:none` from div below.....
        */
        echo "
        <div id='box5' class='toggle'>
            <div id='services' class='services'>
                <div class='container' >
                    <div class='service-head text-center'>
                        <h2>NEWS</h2>
                        <span> </span>
                    </div>";
    
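        /* iterate through output array where each member is an object; escape the scraped text before echoing it */
        foreach( $output as $i => $obj ){
            $title=htmlspecialchars( $obj->title, ENT_QUOTES, 'UTF-8' );
            $description=htmlspecialchars( $obj->description, ENT_QUOTES, 'UTF-8' );
            echo "
                    <button class='accordion'>{$title}</button>
                    <div class='panel1'>
                        <p>{$description}</p>
                    </div>";
        }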
    
        echo "
                </div>
            </div>
        </div>";
    }
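
The question also asks for links to each article, which the code above does not capture. A minimal sketch of one way to add that, assuming the same markup as the XPath expressions above (a `post-info` block containing `h3/a` and `p[@class="desc"]`): the function names `extractArticles` and `renderAccordion`, the sample HTML and the "Read more" link text are all made up for illustration, and the real page structure may differ.

```php
<?php
/* Hypothetical helper: collect title, link and description from listing HTML. */
function extractArticles(string $html): array
{
    if ($html === '') {
        return array(); // loadHTML rejects an empty string
    }

    $previous = libxml_use_internal_errors(true); // keep parser warnings out of the page
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    $articles = array();

    // Assumed markup: <div class="post-info"><h3><a href="...">...</a></h3><p class="desc">...</p></div>
    foreach ($xp->query('//div[@class="post-info"]') as $node) {
        $anchor = $xp->query('h3/a', $node)->item(0);
        $desc   = $xp->query('p[@class="desc"]', $node)->item(0);
        if ($anchor === null) {
            continue; // skip blocks without a title link
        }
        $articles[] = array(
            'title'       => trim($anchor->nodeValue),
            'link'        => $anchor->getAttribute('href'),
            'description' => $desc !== null ? trim($desc->nodeValue) : '',
        );
    }
    return $articles;
}

/* Hypothetical helper: build the accordion markup, escaping all scraped text. */
function renderAccordion(array $articles): string
{
    $out = '';
    foreach ($articles as $a) {
        $title = htmlspecialchars($a['title'], ENT_QUOTES, 'UTF-8');
        $desc  = htmlspecialchars($a['description'], ENT_QUOTES, 'UTF-8');
        $link  = htmlspecialchars($a['link'], ENT_QUOTES, 'UTF-8');
        $out  .= "<button class='accordion'>{$title}</button>\n"
               . "<div class='panel1'><p>{$desc} <a href='{$link}'>Read more</a></p></div>\n";
    }
    return $out;
}

/* Made-up sample input so the sketch runs without a network request. */
$sample = '<div class="post-info"><h3><a href="https://example.com/a">First &amp; foremost</a></h3>'
        . '<p class="desc">Short summary.</p></div>';

echo renderAccordion(extractArticles($sample));
```

To use it with the live page, the HTML returned by `file_get_contents` in the question could be passed to `extractArticles` in place of `$sample`, and the returned string echoed inside the `container` div.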
    
    This answer was selected as the best answer by the asker.