dsxjot8620 2012-03-28 16:36
Accepted

Regular expression matching most URLs needs improvement

I need a function that checks a string for URLs.

function linkcleaner($url) {
$regex="(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))";

if(preg_match($regex, $url, $matches)) {
echo $matches[0];
}
}

The regular expression is taken from John Gruber's blog, where he addressed the problem of writing a regex that matches all URLs. Unfortunately, I can't make it work. The problem seems to come from the double quotes inside the regex, or from the other punctuation symbols at the end of the expression. Any help is appreciated. Thank you!


4 answers

  • douxiajia6309 2012-03-28 17:05

    Apart from @tandu's answer, you also need delimiters for a regex in PHP.

    The easiest would be to start and end your pattern with a #, as that character does not appear in it:

    $regex="#(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'\".,<>?«»“”‘’]))#";
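
As a minimal sketch of how the delimited pattern could be used end to end, the function below wraps it in a helper that returns every URL found in a string. The name `find_urls` is hypothetical and not from the original post. Two choices here are assumptions rather than part of the answer above: the pattern is written as a single-quoted string (so only `'` needs escaping and `"` can stay literal), and the `u` modifier is added so that the multibyte quote characters in the final character class are treated as whole characters, which requires the input to be valid UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): returns all URLs found in $text.
function find_urls(string $text): array
{
    // Single-quoted string: PHP only processes \' and \\ here, so regex escapes
    // such as \b, \w and \s reach PCRE unchanged. '#' is the delimiter and
    // 'u' (an assumption) makes PCRE read pattern and subject as UTF-8.
    $pattern = '#(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))#u';

    // preg_match_all returns false on error (for example, invalid UTF-8 input).
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }

    // $matches[0] holds the full text of each match.
    return $matches[0];
}

// Example call: prints the list of URLs found in the sentence.
print_r(find_urls('See http://example.com/path, and www.test.org.'));
```

Using `preg_match_all` instead of `preg_match` is a design choice for this sketch: it collects every URL in the string rather than only the first one.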
    
    This answer was selected by the asker as the best answer.