douao1854 2015-04-28 13:39
Accepted

Regexp in PHP to capture links with specific domains

So I'm working on a regexp to catch all links in a string, meaning words that start with a protocol like http, https etc., words that start with www., or words that end in some specific domains: ".com", ".hr" and ".net". But somehow this regexp I made always returns all the links that start with a protocol, but only the last one of those that end in a specific domain. What am I doing wrong? Many thanks!

$description='test.com test2.hr http://www.test3.hr https://test4.com test3.net';
$pattern = '/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]|(?:\b((?:[\w]+\.com$)|(?:[\w]+\.hr$)|(?:[\w]+\.net$)))/i';
preg_match_all($pattern, $description, $out);
var_dump($out[0]);

1 answer

  • douhanzhuo6905 2015-04-28 13:52

    There are a few problems with your original regex. First, the protocol should be made optional with the ? quantifier, so that a single pattern covers links with and without it. I'm not sure why you're using the second block of [A-Z0-9+&@#\/%=~_|$] or why you're using the | operator after that; if there's a specific reason, please let me know. Finally, the $ after .com, .hr and .net is an end-of-subject anchor: each of those alternatives can only match at the very end of the string, which is why you only get the last bare domain back. (In PCRE, $ also matches just before a trailing newline; \z matches only at the very end.) I don't think you want to be matching end-of-string in here anyway. I've rewritten the regex below in the way I think you want it to work:

    $description='test.com test2.hr http://www.test3.hr https://test4.com test3.net trash string don\'t match test4.net';
    $pattern = '/(?:(?:https?|ftp|file):\/\/(?:www|ftp)\.)?[-A-Z0-9+&@#\/%=~_|$?!:,.]*(\.[A-Z]+)/i';
    preg_match_all($pattern, $description, $out);
    var_dump($out[0]);
    

    returns:

    array(6) {
      [0]=>
      string(8) "test.com"
      [1]=>
      string(8) "test2.hr"
      [2]=>
      string(19) "http://www.test3.hr"
      [3]=>
      string(17) "https://test4.com"
      [4]=>
      string(9) "test3.net"
      [5]=>
      string(9) "test4.net"
    }
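
Note that the rewritten pattern accepts any dotted token that ends in letters, not only .com, .hr and .net. The following is a minimal sketch, not part of the original answer, that first reproduces the effect of the $ anchor and then tries a stricter variant. The pattern in $pattern and the variable names $anchored and $links are illustrative assumptions, and the optional path part is deliberately crude.

```php
<?php
// Sample text taken from the answer above.
$description = 'test.com test2.hr http://www.test3.hr https://test4.com test3.net trash string don\'t match test4.net';

// 1) With `$`, a bare domain can only match at the very end of the subject,
//    so only the last one is found.
preg_match_all('/\b\w+\.(?:com|hr|net)$/i', $description, $anchored);

// 2) Hypothetical stricter variant (an assumption, not the answerer's regex):
//    optional scheme, optional www./ftp. prefix, a host name that must end
//    in .com, .hr or .net, then an optional path. `~` is used as the
//    delimiter so that slashes need no escaping.
$pattern = '~\b(?:(?:https?|ftp)://)?(?:(?:www|ftp)\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|hr|net)\b(?:/\S*)?~i';
preg_match_all($pattern, $description, $links);

var_dump($anchored[0]); // only "test4.net"
var_dump($links[0]);    // the same six links as in the output above
```

With this variant a token such as readme.txt is not reported, because the host has to end in one of the three listed domains. Whether the path part should stop at trailing punctuation depends on the input, so treat it as a starting point.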
    
    This answer was selected by the asker as the best answer.