douhe1002 2014-07-16 19:08
Accepted

Loading a website from a different domain with PHP

In my application I am loading product info from a supplier:

$start_url = "http://www.example.com/product/product_code";

These URLs are usually redirected by the supplier's website, and I have written a function that successfully finds the destination URL, like so:

$end_url = destination( $start_url );
echo "<a href=\"$start_url\">start url</a>"; // link gets redirected to the correct page
echo "<a href=\"$end_url\">end url</a>"; // links straight to correct page, no redirection

However, if I want to get the HTML from the page...

echo file_get_contents( $start_url );  // 404
echo file_get_contents( $end_url );  // 404

...I just get the supplier's 404 page (not a generic one but their custom one).

I have allow_url_fopen enabled; file_get_contents( "http://www.example.com/" ) works fine.

I can load the expected content from either URL in an iframe client-side, but the browser's same-origin policy prevents me from extracting the data I need.

The only cause I can think of is a URL rewriter on the supplier's site. Could that interfere with the request?

The PHP is running on my local machine, so it should appear no different from me looking at the website via a browser as far as I'm aware.


1 answer

  • dongmale0656 2014-07-17 11:24

    Thanks to @Loz Cherone ツ's comments, using cURL and changing the user agent worked.

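    $user_agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13";

    $url = $_REQUEST["url"];  // e.g. www.example.com/product/ABC123

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, false);           // leave response headers out of the output
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);    // follow any redirection
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);       // set the Referer header on redirects
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);  // present a browser-like user agent
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);    // return the body instead of printing it

    $html = curl_exec($ch);
    if ($html === false) {
        echo "cURL error: " . curl_error($ch);  // request failed before any response arrived
    } else {
        echo $html;
    }

    curl_close($ch);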
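
The same user-agent change can probably be applied to file_get_contents, which the question used originally. This is a minimal sketch and not part of the answer above: it assumes the supplier's site rejected the request only because PHP sends no browser-like user agent by default. The helper names build_browser_context_options and fetch_page are hypothetical.

```php
<?php
// Hypothetical helper: HTTP stream context options that mirror the cURL settings above.
function build_browser_context_options(string $user_agent): array
{
    return [
        'http' => [
            'method'          => 'GET',
            'user_agent'      => $user_agent, // sent as the User-Agent header
            'follow_location' => 1,           // follow redirects, like CURLOPT_FOLLOWLOCATION
            'max_redirects'   => 10,          // give up after this many redirects
            'timeout'         => 15,          // read timeout in seconds
        ],
    ];
}

// Hypothetical helper: returns the page body, or null if the request failed.
function fetch_page(string $url, string $user_agent): ?string
{
    $context = stream_context_create(build_browser_context_options($user_agent));
    // Suppress the warning file_get_contents raises on failure; the null return reports it instead.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}
```

Calling fetch_page($url, $user_agent) with the $user_agent string from the cURL version should then behave much like that version, although cURL gives finer control over errors and redirects.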
    

    I then put the response into the srcdoc attribute of an iframe client-side so I can access the DOM.
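
Markup placed in srcdoc has to be escaped as an attribute value, otherwise the first double quote in the fetched page ends the attribute early. The answer does this step client-side; the sketch below is a server-side PHP variant, shown only as an assumption about how it could be done. The helper name srcdoc_iframe is hypothetical.

```php
<?php
// Hypothetical helper: wraps fetched HTML in an iframe through the srcdoc attribute.
function srcdoc_iframe(string $html): string
{
    // Escape &, <, > and both quote styles so the markup survives as an attribute value.
    // The browser decodes these entities again before parsing the iframe document.
    // ENT_SUBSTITUTE replaces invalid UTF-8 sequences instead of returning an empty string.
    $escaped = htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
    return '<iframe srcdoc="' . $escaped . '"></iframe>';
}
```

With the cURL response held in $html, echo srcdoc_iframe($html); would emit the iframe. Because a srcdoc document shares the embedding page's origin, its DOM is then reachable from the parent page's scripts.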
