doupai8533 2015-11-05 15:54

Web scraper/crawler for Ajax sites

I need to crawl a site and collect all of its links. The problem is that the site uses Ajax, so Go's standard http.Get(..) returns something like:

 <body>
    //javascript here       
     <div class="content"></div>
    //javascript here
 </body>

The div is empty. Is there a way to handle this in Go?


2 answers

  • drxkx6149 2015-11-06 08:27

    http.Get(url) only returns the raw response for that URL; it does not run any JavaScript. The response body (resp.Body) looks like:

    <body>
    //javascript here       
     <div class="content"></div>
    //javascript here
    </body>
    

    If you want the content of the div, you need to analyze the JavaScript and work out how the Ajax call fetches its data. Then you can simulate that request yourself to get what you want.
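As a rough illustration of "simulating the request", here is a minimal Go sketch. It assumes you have already found the Ajax endpoint (for example in the browser's network tab) and that it returns an HTML fragment containing the links. The path `/ajax/content`, the `X-Requested-With` header, and the helper names `fetchAjax` and `extractHrefs` are all assumptions made for illustration and do not come from the original site. The local `httptest` server only stands in for the real endpoint so the example runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
	"time"
)

// hrefRe is a deliberately crude pattern for href="..." attributes.
// A real crawler should use a proper HTML parser instead.
var hrefRe = regexp.MustCompile(`href="([^"]+)"`)

// extractHrefs returns every href value found in an HTML fragment.
func extractHrefs(fragment string) []string {
	var links []string
	for _, m := range hrefRe.FindAllStringSubmatch(fragment, -1) {
		links = append(links, m[1])
	}
	return links
}

// fetchAjax requests the endpoint that the page's JavaScript would call
// and returns the response body as a string.
func fetchAjax(client *http.Client, endpoint string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return "", err
	}
	// Some sites check this header to recognise Ajax calls; whether the
	// target site needs it is an assumption.
	req.Header.Set("X-Requested-With", "XMLHttpRequest")

	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// Stand-in for the site's Ajax endpoint (hypothetical response).
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `<a href="/page/1">one</a><a href="/page/2">two</a>`)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 10 * time.Second}
	fragment, err := fetchAjax(client, srv.URL+"/ajax/content")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	for _, link := range extractHrefs(fragment) {
		fmt.Println(link)
	}
}
```

If the endpoint returns JSON rather than HTML, the same request logic applies and only the parsing step changes (for example, decoding with `encoding/json` instead of matching `href` attributes).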
