dshm8998473 2012-05-06 16:09

Checking a website for a link using PHP

I am building a script for my website, but I am not sure how to get it working well with minimal code.

Basically, I want to enter a URL, for instance domain.com, and have the script scan that homepage for a link pointing to my domain and check whether rel="nofollow" is set on it. It should return true if the link is there without rel="nofollow", and false if there is no link or the link has rel="nofollow".

How would I go about this, or where should I start?

I've googled how to create a spider, but it's all far too much information and too complex for the basic script I am trying to create!


2 answers

  • douqiao7958 2012-05-06 16:14

    What you ask for isn't as simple as you might think. To do this properly, you need to use a DOM parser, such as DOMDocument.

    http://www.php.net/manual/en/class.domdocument.php

    You can use its loadHTML() method to parse the page you want to scan. From there, use methods such as getElementsByTagName() to find the link you're looking for, then check its attributes: confirm that href points to your URL, and see whether rel="nofollow" is present.
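
    As a rough illustration of that approach, here is a minimal sketch. The function name hasFollowedLink() and the domain-matching rule (the bare domain or any subdomain of it) are assumptions made for this example, not part of DOMDocument. It takes HTML that has already been downloaded and returns true only when it finds a link to the given domain whose rel attribute does not contain nofollow.

```php
<?php
/**
 * Returns true if $html contains at least one <a> element whose href points
 * at $myDomain and whose rel attribute does not include "nofollow".
 * hasFollowedLink() is a hypothetical helper name, not a library function.
 */
function hasFollowedLink(string $html, string $myDomain): bool
{
    if (trim($html) === '') {
        return false; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $myDomain = strtolower($myDomain);
    $suffix = '.' . $myDomain;

    foreach ($doc->getElementsByTagName('a') as $a) {
        $host = parse_url($a->getAttribute('href'), PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // relative link, or no usable host
        }
        $host = strtolower($host);

        // Assumption: accept the bare domain and any subdomain such as www.
        $isMine = $host === $myDomain
            || substr($host, -strlen($suffix)) === $suffix;
        if (!$isMine) {
            continue;
        }

        // rel is a space-separated token list, e.g. "noopener nofollow".
        $rel = preg_split(
            '/\s+/',
            strtolower($a->getAttribute('rel')),
            -1,
            PREG_SPLIT_NO_EMPTY
        );
        if (!in_array('nofollow', $rel, true)) {
            return true; // found a followed link to our domain
        }
    }

    return false; // no link at all, or every matching link is nofollow
}
```

    To scan a live page, fetch its HTML first (for example with file_get_contents(), if allow_url_fopen is enabled, or with cURL) and pass the result in together with your own domain.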

    I assure you that, in the end, this is much easier than a plain string search for your URL. A blind string search will give you inaccurate results and will be much more of a hassle than you realize.
