douxian7117 2013-03-02 03:42

ScraperWiki: why does my scraper work for one URL but not another?

This is my first scraper: https://scraperwiki.com/scrapers/my_first_scraper_1/

I managed to scrape google.com, but not this page:

http://subeta.net/pet_extra.php?act=read&petid=1014561

Any idea why?

I followed the documentation here:

https://scraperwiki.com/docs/php/php_intro_tutorial/

I can see no reason why the code should not work.


1 answer

  • doulei8475 2013-03-02 06:15

    It looks like your scraper is told to find one specific element. The elements on a page depend on the site you are scraping, so a selector that matches on one site may match nothing on another, and if the element is not found you get nothing back. I would also look into writing your own scraping/spidering tool with curl. You will learn a lot from it, especially about how scraping sites actually works.
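To illustrate the point about elements, here is a minimal sketch in Python (the question uses PHP, but the idea is the same in any language). The two HTML snippets, the `div`/`content` selector, and the helper names `TagTextCollector` and `extract` are all made up for illustration; they are not taken from google.com, subeta.net, or the asker's scraper.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text inside every element with a given tag and id."""

    def __init__(self, tag, element_id):
        super().__init__()
        self.tag = tag
        self.element_id = element_id
        self.depth = 0      # > 0 while we are inside a matching element
        self.matches = []   # one text entry per matching element

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nested elements of the same tag so we close correctly.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and dict(attrs).get("id") == self.element_id:
            self.depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.matches[-1] += data


def extract(html, tag, element_id):
    """Return the text of all <tag id=element_id> elements (hypothetical helper)."""
    parser = TagTextCollector(tag, element_id)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.matches]


# Made-up pages: same selector, different markup.
PAGE_A = '<html><body><div id="content">Hello from page A</div></body></html>'
PAGE_B = '<html><body><table><tr><td class="pet">Hello from page B</td></tr></table></body></html>'

print(extract(PAGE_A, "div", "content"))  # ['Hello from page A']
print(extract(PAGE_B, "div", "content"))  # []
```

The second call fails silently: the page loads fine, but the selector matches nothing, so the result is empty. Checking for an empty result and printing the raw HTML is usually the quickest way to see which elements the second site really uses.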

    As a side note, you might want to abide by the robots.txt file of the website you are scraping, or ask permission before scraping; scraping without doing either is considered impolite.
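A robots.txt check can be sketched with Python's standard `urllib.robotparser`. The rules, the `my-scraper` user agent, the example.com URLs, and the helper name `is_allowed` below are assumptions for illustration only; a real scraper would download the target site's own robots.txt first.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules, not the real robots.txt of any site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(EXAMPLE_ROBOTS, "my-scraper", "http://example.com/public.php"))
print(is_allowed(EXAMPLE_ROBOTS, "my-scraper", "http://example.com/private/page.php"))
```

With these example rules the first URL is allowed and the second is not, so the scraper should skip the second one.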
