duanlie3187
2017-11-08 08:13 · 28 views

Storing DOM elements to use as a news section on a website

I have been able to use file_get_contents to go through a website's news section and grab the title text from each article. How would I then store that information and use it in a section on my own website?

My PHP:

<?php
$html = file_get_contents("https://www.coindesk.com/category/news/");

$dom = new DomDocument();
$internalErrors = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($internalErrors);
$finder = new DomXPath($dom);
$classname="fade";
$nodes = $finder->query("//*[contains(@class, '$classname')]");
foreach ($nodes as $node) {
    echo $node->nodeValue."<br>"; 
} 
?>

Where I want to store it:

<div id="box5" class="toggle" style="display: none;">
        <div id="services" class="services">
                <div class="container" >
                    <div class="service-head text-center">
                        <h2>NEWS</h2>
                        <span> </span>

                    </div>
                <button class="accordion">STORE THE POST TITLE HERE</button>
                <div class="panel1">
                  <p>STORE THE POST SUMMARY HERE WITH LINKS TO ARTICLE</p>
                </div>

                <button class="accordion">Section 2</button>
                <div class="panel1">
                  <p></p>
                </div>

                <button class="accordion">Section 3</button>
                <div class="panel1">
                  <p></p>
                </div>
          </div>
        </div>
      </div>

1 answer

  • Accepted answer
    douxin2003 2017-11-08 08:41

    This is fairly straightforward. Once the XPath expressions have matched the content, store the node contents in an array or object. That data can then be used later on the same page, saved to a database, or added to a session for use on another page.
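To illustrate the "store it for later" part on its own, here is a minimal sketch that keeps scraped items in a JSON file so the remote site is not fetched on every page load. The file cache is a substitution for the database or session mentioned above, chosen only because it needs no setup. `cachedFetch()`, its parameters, the cache file name, and the stand-in scrape closure are all hypothetical names for illustration.

```php
<?php
/* cachedFetch() and its parameters are hypothetical, for illustration only */
function cachedFetch( string $cacheFile, int $maxAge, callable $fetch ): array {
    /* reuse the cached copy while it is younger than $maxAge seconds */
    if( is_file( $cacheFile ) && ( time() - filemtime( $cacheFile ) ) < $maxAge ){
        $cached = json_decode( file_get_contents( $cacheFile ), true );
        if( is_array( $cached ) ) return $cached;
    }
    /* otherwise run the (expensive) scrape and store the result */
    $items = $fetch();
    file_put_contents( $cacheFile, json_encode( $items ), LOCK_EX );
    return $items;
}

/* stand-in for the real scrape: counts how often it is called */
$calls = 0;
$scrape = function() use ( &$calls ){
    $calls++;
    return array(
        array( 'title' => 'Example title', 'description' => 'Example summary' )
    );
};

$cacheFile = sys_get_temp_dir() . '/news-cache-' . uniqid() . '.json';
$first  = cachedFetch( $cacheFile, 600, $scrape );  /* cache miss: runs the scrape */
$second = cachedFetch( $cacheFile, 600, $scrape );  /* cache hit: read from disk */
unlink( $cacheFile );                               /* tidy up the demo file */
```

In a real page the closure would wrap the scraping code, and the cache file would live at a fixed path instead of a temporary one. For a per-visitor copy, calling `session_start()` and assigning the array to `$_SESSION` would serve the same purpose.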

    /* source url */
    $url='https://www.coindesk.com/category/news/';
    
    /* store results in this array */
    $output=array();
    
    /* XPath expressions */
    $exp=new stdClass;
    $exp->articles='//div[@id="content"]/div[ contains(@class,"article") ]/div[@class="post-info"]';
    $exp->title='h3/a';
    $exp->description='p[@class="desc"]';
    
    /* Load the source url directly into DOMDocument */
    $dom=new DOMDocument;
    $dom->validateOnParse=false;
    $dom->standalone=true;
    $dom->preserveWhiteSpace=true;
    $dom->strictErrorChecking=false;
    $dom->substituteEntities=false;
    $dom->recover=true;
    $dom->formatOutput=true;
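    libxml_use_internal_errors( true ); /* collect parser warnings from malformed HTML instead of printing them */
    $dom->loadHTMLFile( $url );
    libxml_clear_errors();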
    
    /* Query the DOM and process nodes found */
    $xp=new DOMXPath( $dom );
    $col=$xp->query( $exp->articles );
    
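    if( $col !== false && $col->length > 0 ){
        foreach( $col as $node ){
            /* either sub-query may match nothing, so check before reading nodeValue */
            $title = $xp->query( $exp->title, $node )->item(0);
            $desc  = $xp->query( $exp->description, $node )->item(0);
            $output[]=(object)array(
                'title'         =>  $title ? trim( $title->nodeValue ) : '',
                'description'   =>  $desc ? trim( $desc->nodeValue ) : ''
            );
        }
    }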
    $dom = $xp = $col = $node = null;
    
    
    /* 
        The contents of the scrape are stored in the $output array
        and can be used wherever you wish on the page, or stored
        as a session variable and used elsewhere
    */
    if( !empty( $output ) ){
        /*
            removed `display:none` from div below.....
        */
        echo "
        <div id='box5' class='toggle'>
            <div id='services' class='services'>
                <div class='container' >
                    <div class='service-head text-center'>
                        <h2>NEWS</h2>
                        <span> </span>
                    </div>";
    
        /* iterate through output array where each member is an object */
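        foreach( $output as $obj ){
            /* escape scraped text before echoing it into the page */
            $title = htmlspecialchars( $obj->title );
            $desc  = htmlspecialchars( $obj->description );
            echo "
                    <button class='accordion'>{$title}</button>
                    <div class='panel1'>
                        <p>{$desc}</p>
                    </div>";
        }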
    
        echo "
                </div>
            </div>
        </div>";
    }
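The question also asks for links to each article, which the code above does not collect. As a rough sketch, the `href` of the title anchor can be read with `getAttribute()` and rendered next to the summary. The sample markup and the example.com URLs below are invented to mirror the structure the XPath expressions assume, and the live site's markup may differ. `extractArticles()` and `renderAccordion()` are hypothetical helper names.

```php
<?php
/* Hypothetical sample markup mirroring the structure the XPath expressions assume */
$sampleHtml = '
<div id="content">
  <div class="article">
    <div class="post-info">
      <h3><a href="https://example.com/first-post">First post</a></h3>
      <p class="desc">Summary of the first post.</p>
    </div>
  </div>
  <div class="article">
    <div class="post-info">
      <h3><a href="https://example.com/second-post">Second &amp; last</a></h3>
    </div>
  </div>
</div>';

/* illustrative helper: returns title, link and description for each article */
function extractArticles( string $html ): array {
    $dom = new DOMDocument;
    $previous = libxml_use_internal_errors( true );
    $dom->loadHTML( $html );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    $xp = new DOMXPath( $dom );
    $query = '//div[@id="content"]/div[ contains(@class,"article") ]/div[@class="post-info"]';
    $articles = array();
    foreach( $xp->query( $query ) as $node ){
        $anchor = $xp->query( 'h3/a', $node )->item(0);
        $desc   = $xp->query( 'p[@class="desc"]', $node )->item(0);
        $articles[] = array(
            'title'       => $anchor ? trim( $anchor->nodeValue ) : '',
            'link'        => $anchor ? $anchor->getAttribute( 'href' ) : '',
            'description' => $desc ? trim( $desc->nodeValue ) : ''
        );
    }
    return $articles;
}

/* illustrative helper: builds the accordion markup with escaped values */
function renderAccordion( array $articles ): string {
    $html = '';
    foreach( $articles as $a ){
        $title = htmlspecialchars( $a['title'], ENT_QUOTES );
        $desc  = htmlspecialchars( $a['description'], ENT_QUOTES );
        $link  = htmlspecialchars( $a['link'], ENT_QUOTES );
        $html .= "<button class='accordion'>{$title}</button>\n"
               . "<div class='panel1'><p>{$desc} <a href='{$link}'>Read more</a></p></div>\n";
    }
    return $html;
}

$articles = extractArticles( $sampleHtml );
echo renderAccordion( $articles );
```

If the source site uses relative links, the scheme and host would need to be prepended before output. This sketch assumes absolute URLs.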
    