duanjurong1347 2015-09-23 22:18

PHP DOM parser breaks the page and fails to load the page content

I have written a PHP parser that should extract the price from a span tag. When I echo $html to see how the page loads, I get a broken page: only the header and footer load, not the content. The content appears to be loaded externally by JavaScript. How can I load the HTML page with DOM so that the JavaScript-loaded content is included as well? I need the whole content to load so that I can get the divs and spans. This is my code:

<?php

require_once('simple_html_dom.php');

$url = 'http://oldnavy.gap.com/browse/product.do?cid=99570&vid=1&pid=714649002';

// file_get_html() comes from simple_html_dom and returns a parsed document object on success.
$html = file_get_html($url);

// Dump the fetched markup to see what was actually loaded.
echo $html;

if (is_object($html)) {
    // Look for the price in <span id="priceText">.
    foreach ($html->find('span#priceText') as $data) {
        $raw_price = $data->innertext;
        echo $raw_price;
    }
}
?>

1 answer

  • douxiong4250 2015-09-24 14:00

    Alternative approach

    The link you are actually looking for (in its minimal form) is this: http://oldnavy.gap.com/browse/productData.do?pid=714649

    Now load that URL with cURL, set a value for the unknownShopperId cookie, split the response into an array with explode(), and read the price you need:

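    <?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_VERBOSE, true);        // log the transfer to STDERR for debugging
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_URL, "http://oldnavy.gap.com/browse/productData.do?pid=714649");
    // Send the unknownShopperId cookie with a value.
    curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: unknownShopperId=E853DA3B2607DDAA5F2FE13CE8D32ACF"));

    $result = curl_exec($ch);
    if ($result === false) {
        die('cURL error: ' . curl_error($ch));
    }
    curl_close($ch);

    // Split the response on commas; the prices sit at fixed positions.
    $explode = explode(',', $result);

    echo 'Original price: ' . $explode[92] . '<br/>' .
         'New price: ' . $explode[93] . '<br/>' .
         'Both prices: ' . $explode[13];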

    The result will be: '$14.94'
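The fixed offsets 92, 93 and 13 depend on the exact layout of the response, so they may shift if the site changes what it returns. As a rough sketch only, one could instead scan the raw response for dollar amounts with a regular expression. The function name `extractPrices` and the sample string below are hypothetical; the real response format is not shown in this thread.

```php
<?php
// Hypothetical helper: collect each distinct dollar amount found in a raw
// response body, in order of first appearance.
function extractPrices(string $body): array
{
    // Matches amounts such as "$14.94" (a dollar sign, digits, two decimals).
    preg_match_all('/\$\d+\.\d{2}/', $body, $matches);

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($matches[0]));
}

// Made-up sample input, only to illustrate the call.
$sample = 'foo,"$19.94","$14.94",bar,"$19.94"';
print_r(extractPrices($sample));
```

This does not tell you which amount is the original price and which is the sale price; that still has to be worked out from the actual response.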

    From now on, if you need the price of another item, you must know that item's pid.
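Since the pid is what changes from item to item, a small helper can derive the data URL from a product page URL. This is a sketch under a loud assumption: the only evidence in this thread is that the product page uses pid=714649002 while the data URL uses pid=714649, so the code guesses that the first six digits are the ones that matter. The function name `productDataUrl` is hypothetical.

```php
<?php
// Hypothetical helper: build the productData.do URL from a product page URL.
// Returns null when no usable pid is present.
function productDataUrl(string $productUrl): ?string
{
    $query = parse_url($productUrl, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, or the URL could not be parsed
    }

    parse_str($query, $params);
    $pid = $params['pid'] ?? null;
    if (!is_string($pid) || !preg_match('/^\d{6,}$/', $pid)) {
        return null; // pid missing or not a run of at least six digits
    }

    // ASSUMPTION: only the first six digits are needed, inferred from the
    // single pair of URLs in this thread (714649002 -> 714649).
    return 'http://oldnavy.gap.com/browse/productData.do?pid=' . substr($pid, 0, 6);
}

echo productDataUrl('http://oldnavy.gap.com/browse/product.do?cid=99570&vid=1&pid=714649002');
```

If the six-digit guess turns out to be wrong for other items, pass the full pid through unchanged and check what the endpoint returns.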
