duangang4940 2011-11-30 19:14

jQuery / PHP - Grab all links from an external page

I am trying to make a program that grabs all the links from an external website and displays them using jQuery and PHP. Here are my steps:

  1. Get the HTML of a page using PHP (load.php)
  2. Put that HTML into a div
  3. Get all <a> elements in that div

Here is my code (index.html):

<html>
<head>
    <title>Test</title>
    <script type="text/javascript" src="jquery.js">//jquery</script>
    <script type="text/javascript">
        $(function() { //on load
            var url = "http://google.com";
            $.post('load.php', { url: url},
                function(html) {
                    $('#page').html(html); //loads html from the page into a div

                    var links = $('#page > a');
                    alert('links.length: ' + links.length); //PROBLEM: returns 0 
                    for(var i=0; i < links.length; i++)
                    {
                        alert(links[i]);
                    }
            });
        });
    </script>
</head>
<body>
<div id="page" style=""></div>
</body>
</html>

And the php code (load.php):

<?php
$url = $_POST['url'];
$html = file_get_contents($url);
echo $html;
?>

The page is being loaded into the div correctly, so I know it is grabbing the HTML, but links.length is returning 0. So something is wrong with this line:

var links = $('#page > a');

However, when I load my test page, which contains only this HTML:

<a href="http://google.com">link1</a>
<a href="http://yahoo.com">link2</a>

links.length returns 2. Why does it work with my test page and not with Google?

3 answers

  • douji5523 2011-11-30 19:18

    Probably because your test page contains a document fragment (only the two links), while a page like Google's contains a whole document (starting with a doctype declaration, <html>, and so on).

    Inserting such HTML into a div element probably breaks your DOM.
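A related point worth checking, independent of the fragment versus full document difference: `#page > a` is a child selector, so it only matches anchors that are direct children of `#page`. On the two-link test page the anchors are direct children, whereas on a real page they are most likely nested inside other elements, where only a descendant selector such as `$('#page a')` would find them. The sketch below compares the two counts using the plain DOM API; `compareSelectors` is a hypothetical helper name, and it assumes a browser that supports `:scope` in `querySelectorAll`.

```javascript
// Hypothetical helper: compares the question's child selector with a
// descendant selector on the same container element.
function compareSelectors(container) {
    return {
        // Same idea as $('#page > a'): anchors that are direct children only.
        directChildren: container.querySelectorAll(':scope > a').length,
        // Same idea as $('#page a'): anchors at any depth inside the container.
        descendants: container.querySelectorAll('a').length
    };
}

// Browser-only usage; skipped when no DOM is available (for example under Node).
if (typeof document !== 'undefined') {
    var page = document.getElementById('page');
    if (page) {
        console.log(compareSelectors(page));
    }
}
```

If `descendants` is greater than zero while `directChildren` is zero, the links were loaded but the selector was too strict.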

    I'd advise you to either:

    1. Parse the HTML server-side and pass only the results to your JS app,
      OR
    2. Load the page (from your server) in an iframe and access its document to get to its link collection (documentOfIframe.links).
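As a rough variation on option 2, the HTML string returned by load.php could be parsed into a separate document instead of an iframe, and that document's `links` collection read directly. This is only a sketch: it assumes a browser whose `DOMParser` supports the `'text/html'` type, and `listHrefs` is a hypothetical helper name, not part of jQuery.

```javascript
// Hypothetical helper: returns the raw href attribute of every link in a document.
function listHrefs(doc) {
    // doc.links holds all <a> and <area> elements that have an href attribute.
    return Array.from(doc.links, function (link) {
        return link.getAttribute('href');
    });
}

// Browser-only usage; skipped when DOMParser or jQuery is missing (for example under Node).
if (typeof DOMParser !== 'undefined' && typeof $ !== 'undefined') {
    $.post('load.php', { url: 'http://google.com' }, function (html) {
        // Parse into a detached document so the current page's DOM is left untouched.
        var doc = new DOMParser().parseFromString(html, 'text/html');
        console.log(listHrefs(doc));
    });
}
```

Because `getAttribute('href')` returns the values exactly as written in the markup, relative links would still need to be resolved against the URL of the original page.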
    This answer was selected by the asker as the best answer.