weixin_33724059 2012-09-20 08:38

Retrieving meta content via URL

I have a script that retrieves various kinds of information about a given URL: JsFiddle

As you can see, the meta content is derived from the 'baseUrl' (at the beginning of the script). There is also a div (#links) that lists every a href found on that page (baseUrl). My question: how do I get the meta content of those links instead of the baseUrl?

1 answer

  • weixin_33688840 2012-09-20 09:42

    Your script loads the main page and parses the data out of it. To get the meta tags of the linked URLs, you need to run the same logic again for each link URL instead of only your baseUrl. If you keep repeating this for the links found on every newly loaded page, you have essentially built a web crawler.
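
A rough sketch of that idea, not the code from the fiddle: the helper names (`getAttr`, `extractMeta`, `extractLinks`, `crawl`) are made up for illustration, and `fetchHtml` is a hypothetical function you supply that resolves to a page's HTML as a string. In a browser, cross-origin pages usually cannot be fetched directly, so `fetchHtml` would likely have to go through some proxy. The regex-based parsing is a simplification and will miss unusual markup. The `maxDepth` limit is what keeps the loop from running indefinitely.

```javascript
// Read one quoted attribute value from a single tag string, or null if absent.
function getAttr(tag, attr) {
  const m = tag.match(new RegExp("\\s" + attr + "\\s*=\\s*([\"'])(.*?)\\1", "i"));
  return m ? m[2] : null;
}

// Collect { name-or-property: content } pairs from all <meta> tags in the HTML.
function extractMeta(html) {
  const meta = {};
  for (const [tag] of html.matchAll(/<meta\b[^>]*>/gi)) {
    const key = getAttr(tag, "name") || getAttr(tag, "property");
    const content = getAttr(tag, "content");
    if (key && content !== null) meta[key] = content;
  }
  return meta;
}

// Collect absolute, de-duplicated http(s) links from all <a href> tags.
function extractLinks(html, baseUrl) {
  const links = new Set();
  for (const [tag] of html.matchAll(/<a\b[^>]*>/gi)) {
    const href = getAttr(tag, "href");
    if (!href) continue;
    try {
      const url = new URL(href, baseUrl); // resolves relative hrefs
      if (url.protocol === "http:" || url.protocol === "https:") {
        url.hash = ""; // treat page#a and page#b as the same page
        links.add(url.href);
      }
    } catch (e) {
      // Skip hrefs that cannot be parsed as URLs.
    }
  }
  return [...links];
}

// Breadth-first crawl: depth 0 is the start page, depth 1 its links, and so on.
// fetchHtml(url) is an assumed, caller-supplied async function returning HTML.
async function crawl(startUrl, fetchHtml, maxDepth = 1) {
  const results = {};
  const seen = new Set([startUrl]);
  let frontier = [startUrl];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const url of frontier) {
      let html;
      try {
        html = await fetchHtml(url);
      } catch (e) {
        continue; // skip pages that fail to load
      }
      results[url] = extractMeta(html);
      for (const link of extractLinks(html, url)) {
        if (!seen.has(link)) {
          seen.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return results;
}
```

Calling `crawl(baseUrl, fetchHtml, 1)` would give you the meta content of the base page plus every page it links to, keyed by URL. Raising `maxDepth` follows links further out, and the `seen` set prevents visiting the same page twice.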
