dongxia2030 2013-07-13 12:03
Views: 65

PHP regular expression to add rel="nofollow" to external links [duplicate]

I need to add rel="nofollow" to all external links (not leading to my site or its subdomains).

I currently do this in two steps. First, I add rel="nofollow" to all links (including internal ones) using the following regular expression:

<a href="http([s]?)://(.*?)"

Then, in the second step, I remove rel="nofollow" from internal links (my site and its subdomains) using the following regular expression:

<a href="http([s]?)://(www\.|forum\.|blog\.)mysite\.com(.*?)" rel="nofollow"

Is it possible to do this in a single step? If so, how?
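
For reference, the two-step approach described above might look roughly like the sketch below. The sample HTML and both replacement strings are assumptions (the question only gives the patterns), and the dot in `mysite\.com` is escaped here so that it matches a literal dot.

```php
<?php
// Hypothetical sample input; not taken from the question.
$html = '<a href="http://example.org/page">ext</a> '
      . '<a href="https://blog.mysite.com/post">int</a>';

// Step 1: add rel="nofollow" to every absolute http(s) link.
// The replacement string is an assumption.
$step1 = preg_replace(
    '~<a href="http(s?)://(.*?)"~',
    '<a href="http${1}://${2}" rel="nofollow"',
    $html
);

// Step 2: strip it again from links to the listed subdomains of the site.
$step2 = preg_replace(
    '~<a href="http(s?)://(www\.|forum\.|blog\.)mysite\.com(.*?)" rel="nofollow"~',
    '<a href="http${1}://${2}mysite.com${3}"',
    $step1
);

echo $step2, "\n";
```

Note that both patterns only match anchors whose first attribute is `href`, written with double quotes, and that the second pattern as given does not cover the bare domain `mysite.com` without a subdomain prefix.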

1 answer

  • dongyuan1870 2013-07-13 12:22
    The DOM way:

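    $doc = new DOMDocument();
    // $url: path or URL of the HTML file; @ silences warnings about malformed markup
    @$doc->loadHTMLFile($url);
    $links = $doc->getElementsByTagName('a');

    foreach ($links as $link) {
        $href = $link->getAttribute('href');
        // External = absolute http(s) URL whose host is neither mysite.com nor one of its subdomains.
        // The negative lookahead rejects internal hosts; without a check after the host part,
        // the pattern would match every http(s) link.
        if (preg_match('~^https?://(?!(?:[^/:?#]+\.)?mysite\.com(?:[/:?#]|$))~i', $href))
            $link->setAttribute('rel', 'nofollow'); // overwrites any existing rel value
    }

    // Write the modified document back to the same file
    $doc->saveHTMLFile($url);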
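
The same DOM approach can also be applied to an HTML string, with the host check done through `parse_url()` rather than a regex. The following is a minimal sketch under stated assumptions: `isExternalLink()` and `addNofollow()` are hypothetical helper names, `mysite.com` stands in for the real domain, and the sample markup is invented for illustration.

```php
<?php
// Hypothetical helper: true when $href is an absolute http(s) URL whose host
// is neither $site nor one of its subdomains.
function isExternalLink(string $href, string $site): bool
{
    $scheme = parse_url($href, PHP_URL_SCHEME);
    $host   = parse_url($href, PHP_URL_HOST);
    if (!is_string($scheme) || !is_string($host)) {
        return false; // relative links, fragments, malformed URLs
    }
    if (!in_array(strtolower($scheme), ['http', 'https'], true)) {
        return false; // mailto:, javascript:, and similar schemes
    }
    $host   = strtolower($host);
    $site   = strtolower($site);
    $suffix = '.' . $site;
    return $host !== $site && substr($host, -strlen($suffix)) !== $suffix;
}

// Hypothetical wrapper: takes an HTML string and returns the modified markup.
function addNofollow(string $html, string $site): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // collect parser warnings instead of emitting them
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('a') as $link) {
        if (isExternalLink($link->getAttribute('href'), $site)) {
            $link->setAttribute('rel', 'nofollow');
        }
    }
    return $doc->saveHTML();
}

$out = addNofollow(
    '<p><a href="http://example.org/">ext</a> '
    . '<a href="https://blog.mysite.com/x">int</a> '
    . '<a href="/about">rel</a></p>',
    'mysite.com'
);
echo $out;
```

Comparing the parsed host against the domain suffix avoids false positives such as `notmysite.com` or `mysite.com.example.org`, which a loosely written pattern can misclassify. Keep in mind that `saveHTML()` returns a full document, so the fragment comes back wrapped in a doctype and `html`/`body` elements.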
    