douqiao3453 2013-07-01 12:49
Views: 37
Accepted

Regex to turn URLs into links, but only when they are not already links [duplicate]


We use the following regular expression to convert URLs in text into links. URLs that are too long are shortened with an ellipsis in the middle:

/**
 * Replace all links with <a> tags (shortening them if needed)
 */
$match_arr[] = '/((http|ftp)+(s)?:\/\/[^<>\s,!\)]+)/ie';
$replace_arr[] = "'<a href=\"\\0\" title=\"\\0\" target=\"_blank\">' . " .
    "( mb_strlen( '$0' ) > {$maxlength} ? mb_substr( '$0', 0, " . ( $maxlength / 2 ) . " ) . '…' . " .
    "mb_substr( '$0', -" . ( $maxlength / 2 ) . " ) : '$0' ) . " .
"'</a>'";

This is working. However, I found that if there is a link in the text already, like:

$text = '... <a href="http://www.google.com">http://www.google.com</a> ...';

it will match both URLs, so it will try to create two more <a> tags, totally messing up the DOM of course.

How can I prevent the regex from matching if the link is already inside an <a> tag? It will also be in the title attribute, so basically I just want to skip every <a> tag completely.


1 answer

  • dragon8997474 2013-07-01 12:53

    The simplest way with a regex (which is arguably not the most reliable tool in this situation) is probably to assert that no </a> follows your link, i.e. that the next tag after the URL is not a closing </a>:

    #(http|ftp)+(s)?://[^<>\s,!\)]++(?![^<]*</a>)#ie
    

    I'm using a possessive quantifier (++) so that the entire URL is always matched, i.e. the engine cannot backtrack into the URL and give up trailing characters just to satisfy the lookahead.
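
As a rough sketch of how the pattern above might be used on current PHP versions: the `e` modifier was deprecated in PHP 5.5 and removed in PHP 7, so this version assumes `preg_replace_callback` instead of an evaluated replacement string. The function name `linkify`, the default `$maxlength` of 40, and the `htmlspecialchars` escaping are illustrative assumptions, not part of the original question or answer. It also assumes the mbstring extension is available, as in the question's code.

```php
<?php
/**
 * Hypothetical helper: wrap bare URLs in <a> tags, skipping URLs that
 * already sit inside an existing <a ...>...</a> element.
 */
function linkify(string $text, int $maxlength = 40): string
{
    // Same pattern as in the answer, minus the removed "e" modifier.
    // The possessive "++" stops the engine from shrinking the URL, and the
    // negative lookahead rejects a URL when the next tag after it is </a>.
    $pattern = '#(http|ftp)+(s)?://[^<>\s,!\)]++(?![^<]*</a>)#i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($maxlength): string {
            $url  = $m[0];
            $half = intdiv($maxlength, 2);

            // Shorten long URLs with an ellipsis in the middle.
            $label = mb_strlen($url) > $maxlength
                ? mb_substr($url, 0, $half) . '…' . mb_substr($url, -$half)
                : $url;

            // Escaping is an added precaution (assumption, not in the original).
            $href  = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
            $label = htmlspecialchars($label, ENT_QUOTES, 'UTF-8');

            return '<a href="' . $href . '" title="' . $href
                . '" target="_blank">' . $label . '</a>';
        },
        $text
    );
}

// Usage: the existing link is left alone, the bare URL is wrapped.
echo linkify('<a href="http://www.google.com">http://www.google.com</a> and http://example.com/page'), "\n";
```

The lookahead only checks whether the next tag after the URL is `</a>`, so it stays a heuristic. For arbitrary HTML, a DOM parser that visits only text nodes would likely be more robust.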

    This answer was selected by the asker as the best answer.