csdn产品小助手 2016-04-05 19:52 Acceptance rate: 0%
Views: 34

Scraping HTML with JS

I'm trying to get the HTML of www.soccerway.com, in particular the element that has the label-wrapper class. I also tried select.nav-select, but I can't get any content. What I did is:

1) Created a PHP file called grabber.php, which contains this code:

<?php echo file_get_contents($_GET['url']); ?>

2) Created an index.html file with this content:

<!DOCTYPE html>
<html>
<head>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"></script>
    <meta charset=utf-8 />
    <title>test</title>
</head>
<body>

<div id="response"></div>

</body>

<script>
    $(function(){
        var contentURI= 'http://soccerway.com';    
        $('#response').load('grabber.php?url='+ encodeURIComponent(contentURI) + ' #label-wrapper');
    });
    var LI = document.querySelectorAll(".list li");
    var result = {};

    for(var i=0; i<LI.length; i++){
        var el = LI[i];
        var elData = el.dataset.value;
        if(elData) result[el.innerHTML] = elData; // Only if element has data-value attr
    }

    console.log( result );
</script>

</html>

No content is grabbed into the div. I tested my JS code that collects all the links and it works, but only when I insert the HTML page manually.

1 answer

  • weixin_33674437 2016-04-05 20:02

    I see a couple of issues here.

    var contentURI= 'http:/soccerway.com #label-wrapper';
    

    You're missing the second slash in http://, and you're passing a URL with a space and an ID to file_get_contents. You'll want this instead:

    var contentURI = 'http://soccerway.com/';
    

    and then you'll need to parse out the item you're interested in from the resulting HTML.

    The #label-wrapper needs to be in the jQuery load() call, not the file_get_contents, and the contentURI variable needs to be properly escaped with encodeURIComponent:

    $('#response').load('grabber.php?url='+ encodeURIComponent(contentURI) + ' #label-wrapper');
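
As a rough sketch of how that string is put together: jQuery's load() treats everything before the first space as the URL to request and everything after it as a selector applied to the returned HTML. The helper name buildLoadTarget is hypothetical, not part of jQuery. The question describes label-wrapper as a class, so the sketch assumes the selector .label-wrapper; if it is actually an id on the page, #label-wrapper would be the one to use.

```javascript
// Hypothetical helper: builds the string passed to jQuery's .load().
// Text before the first space is the URL to request; text after it is a
// selector that jQuery applies to the returned HTML on the client.
function buildLoadTarget(proxyPath, pageUrl, selector) {
  // Encode the remote URL so its ':' and '/' travel as a single query value.
  var request = proxyPath + '?url=' + encodeURIComponent(pageUrl);
  // The selector is appended after a space; it is not sent to the server.
  return selector ? request + ' ' + selector : request;
}

// Assumption: label-wrapper is a class, as described in the question.
var target = buildLoadTarget('grabber.php', 'http://soccerway.com/', '.label-wrapper');
console.log(target);

// In the page itself (needs jQuery and a DOM), any code that reads the
// loaded markup belongs in the completion callback, since .load() is
// asynchronous:
// $('#response').load(target, function () {
//   var items = document.querySelectorAll('.list li');
// });
```

The commented-out usage also hints at why the loop in the question may find nothing: it runs right away, before the request has had a chance to finish.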
    

    Your code also contains a serious vulnerability: grabber.php passes $_GET['url'] straight to file_get_contents, so anyone can request it with a url value that is a file path on your server. This could expose your database password or other sensitive data on the server.
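
One common way to narrow this down is an allow-list check on the url value before anything is fetched. The sketch below only illustrates the logic, written in JavaScript to match the rest of the page; the real check would have to run on the server inside grabber.php (in PHP, parse_url could play the role of the URL parser). The names ALLOWED_HOSTS and isAllowedUrl, and the list of hosts, are assumptions made for illustration.

```javascript
// Assumption: these are the only hosts the proxy should ever fetch.
var ALLOWED_HOSTS = ['soccerway.com', 'www.soccerway.com'];

// Hypothetical check: accept only absolute http(s) URLs on an allowed host.
function isAllowedUrl(raw) {
  var parsed;
  try {
    // Throws for values that are not absolute URLs, such as '/etc/passwd'.
    parsed = new URL(raw);
  } catch (e) {
    return false;
  }
  // Reject file:, ftp:, data: and any other non-web scheme.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return false;
  }
  // Compare the parsed hostname, not the raw string, against the list.
  return ALLOWED_HOSTS.indexOf(parsed.hostname) !== -1;
}

console.log(isAllowedUrl('http://soccerway.com/'));
console.log(isAllowedUrl('/etc/passwd'));
```

Comparing the parsed hostname rather than searching the raw string is deliberate: a value such as http://soccerway.com@example.com/ contains the allowed name but points at a different host.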
