Regex to turn URLs into links, but only when they are not already links [duplicate]

This question already has an answer here:

PHP autolink if not already linked (1 answer)

We use the following regular expression to convert URLs in text to links, which are shortened with an ellipsis in the middle if they are too long:

/**
 * Replace all links with <a> tags (shortening them if needed)
 */
$match_arr[] = '/((http|ftp)+(s)?:\/\/[^<>\s,!\)]+)/ie';
$replace_arr[] = "'<a href=\"\\0\" title=\"\\0\" target=\"_blank\">' . " .
    "( mb_strlen( '$0' ) > {$maxlength} ? mb_substr( '$0', 0, " . ( $maxlength / 2 ) . " ) . '…' . " .
    "mb_substr( '$0', -" . ( $maxlength / 2 ) . " ) : '$0' ) . " .
    "'</a>'";

This is working. However, I found that if there is a link in the text already, like:

$text = '... <a href="http://www.google.com">http://www.google.com</a> ...';

it will match both URLs, so it will try to create two more <a> tags, totally messing up the DOM of course.
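The double match can be reproduced with a minimal sketch. It assumes the same pattern with the `e` modifier dropped (that modifier only affects the replacement, not what is matched) and simply lists the matches:

```php
<?php
// Same pattern as above, minus the "e" modifier, which does not change matching.
$pattern = '/((http|ftp)+(s)?:\/\/[^<>\s,!\)]+)/i';
$text    = '... <a href="http://www.google.com">http://www.google.com</a> ...';

// Collect every full match found in the already-linked text.
$count = preg_match_all($pattern, $text, $matches);

// The first match comes from the href attribute (and swallows the closing
// quote, since '"' is not in the excluded character class); the second is
// the link text itself.
print_r($matches[0]);
```

Both hits sit inside the existing `<a>` element, which is why the replacement nests new anchors into it.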

How can I prevent the regex from matching if the link is already inside an <a> tag? It will also be in the title attribute, so basically I just want to skip every <a> tag completely.


1 answer

dragon8997474 2013-07-01 12:53
The simplest way (with a regex, which arguably is not the most reliable tool in this situation) would probably be to make sure that no </a> follows after your link:

#(http|ftp)+(s)?://[^<>\s,!\)]++(?![^<]*</a>)#ie

I'm using possessive quantifiers to make sure that the entire URL will be matched (i.e., the engine cannot backtrack into the URL and give up characters in order to satisfy the lookahead).
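Note that the `e` modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP versions the same pattern has to be used with `preg_replace_callback` instead. The following is only a sketch under that assumption: the function name `autolink_unlinked` and its `$maxlength` default are hypothetical, the mbstring extension is assumed to be available, and output escaping is left out as in the original snippet.

```php
<?php
// Hypothetical helper: name, signature and default are assumptions for illustration.
function autolink_unlinked(string $text, int $maxlength = 40): string
{
    // The pattern from the answer, without the removed "e" modifier.
    // The possessive "++" keeps the whole URL; the negative lookahead rejects
    // a URL when the next "<" after it starts a closing </a> tag.
    $pattern = '#(http|ftp)+(s)?://[^<>\s,!\)]++(?![^<]*</a>)#i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($maxlength): string {
            $url  = $m[0];
            $half = intdiv($maxlength, 2);
            // Shorten long URLs with an ellipsis in the middle, as in the question.
            $label = mb_strlen($url) > $maxlength
                ? mb_substr($url, 0, $half) . '…' . mb_substr($url, -$half)
                : $url;
            return '<a href="' . $url . '" title="' . $url . '" target="_blank">'
                . $label . '</a>';
        },
        $text
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}

// Already-linked text is left alone; a bare URL is wrapped.
echo autolink_unlinked('<a href="http://www.google.com">http://www.google.com</a>'), "\n";
echo autolink_unlinked('see http://example.com now'), "\n";
```

The lookahead only inspects text up to the next `<`, so a URL wrapped in another element inside a link (for example `<a ...><b>http://...</b></a>`) would still be matched. This is one reason a DOM parser is the more reliable tool for arbitrary HTML, as the answer hints.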
This answer was selected as the best answer by the asker.
