dongzou9047 2017-04-19 12:19
Accepted

Extract URLs from a string with no spaces between them

Let's say I have a string like this:

$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";

and I want to get an array like this:

array(
    [0] => "http://foo.com/bar",
    [1] => "https://bar.com",
    [2] => "//foo.com/foo/bar"
);

I'm looking for something like:

preg_split("~((https?:)?//)~", $urlsString, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);

where the definition of PREG_SPLIT_DELIM_CAPTURE is:

If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well.

That said, the above preg_split returns:

array (size=3)
  0 => string '' (length=0)
  1 => string 'foo.com/bar' (length=11)
  2 => string 'bar.com//foo.com/foo/bar' (length=24)

Any idea what I'm doing wrong, or any other suggestion?

PS: I was using this regex until I realized that it doesn't cover this case.

Edit:

As @sidyll pointed out, I'm missing the $limit in the preg_split parameters. Anyway, there is something wrong with my regex, so I will use @WiktorStribiżew's suggestion.
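
The missing `$limit` explains the three-element result: `preg_split` takes `$limit` as its third parameter and `$flags` as its fourth, so the flag mask (numerically 3) was read as a limit of three pieces. A minimal sketch of the difference follows; the variable names `$asPosted` and `$withLimit` are illustrative only and do not come from the original post:

```php
<?php
$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";
$pattern = "~((https?:)?//)~";
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE; // 1 | 2 === 3

// As posted: the flags land in the $limit slot, so this is a plain split
// capped at 3 pieces, with no flags applied.
$asPosted = preg_split($pattern, $urlsString, $flags);

// With an explicit $limit of -1 (no limit), the flags reach the $flags parameter.
$withLimit = preg_split($pattern, $urlsString, -1, $flags);

print_r($asPosted);
print_r($withLimit);
```

Even with the flags applied, the captured delimiters come back as separate elements rather than attached to the text that follows them, so the result would still need regrouping. That is one reason matching whole URLs can be simpler here than splitting.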

1 answer

  • douzhan8303 2017-04-19 12:25

    You may use preg_match_all with the following regex:

    '~(?:https?:)?//.*?(?=$|(?:https?:)?//)~'
    

    See the regex demo.

    Details:

    • (?:https?:)? - https: or http:, optional (1 or 0 times)
    • // - double /
    • .*? - any 0+ chars other than line break chars, as few as possible, up to the first position where the lookahead that follows succeeds
    • (?=$|(?:https?:)?//) - either of the two:
      • $ - end of string
      • (?:https?:)?// - https: or http:, optional (1 or 0 times), followed with a double /
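
To see why both the lazy quantifier and the lookahead matter, the sketch below compares the pattern with two simplified variants on the string from the question. The two variants are written for illustration only and are not part of the original answer:

```php
<?php
$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";

// Greedy .* with no lookahead: the first match runs to the end of the string.
preg_match_all('~(?:https?:)?//.*~', $urlsString, $greedy);

// Lazy .*? with no lookahead: nothing forces it to expand, so it matches zero chars.
preg_match_all('~(?:https?:)?//.*?~', $urlsString, $lazyOnly);

// Lazy .*? plus the lookahead: expands until the next URL start or the end of string.
preg_match_all('~(?:https?:)?//.*?(?=$|(?:https?:)?//)~', $urlsString, $full);

print_r($greedy[0]);   // one element: the whole input string
print_r($lazyOnly[0]); // "http://", "https://", "//"
print_r($full[0]);     // the three URLs
```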

    Below is a PHP demo:

    $urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";
    preg_match_all('~(?:https?:)?//.*?(?=$|(?:https?:)?//)~', $urlsString, $urls);
    print_r($urls[0]);
    // => Array ( [0] => http://foo.com/bar [1] => https://bar.com [2] => //foo.com/foo/bar )
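
For reuse, the call can be wrapped in a small function. This is only a sketch: `extractUrls` is a hypothetical name, not something from the original answer, and it assumes the input contains nothing but concatenated URLs.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): returns the URLs found
 * in a string of concatenated URLs, using the pattern shown above.
 */
function extractUrls(string $input): array
{
    $pattern = '~(?:https?:)?//.*?(?=$|(?:https?:)?//)~';
    // preg_match_all returns false on failure, so guard before reading $matches.
    if (preg_match_all($pattern, $input, $matches) === false) {
        return [];
    }
    return $matches[0]; // index 0 holds the full-pattern matches
}

$urls = extractUrls("http://foo.com/barhttps://bar.com//foo.com/foo/bar");
print_r($urls);

// Caveat: any "//" is treated as the start of a new URL, even inside a path,
// so this input yields "http://foo.com/a" and "//b".
print_r(extractUrls("http://foo.com/a//b"));
```

The caveat in the last call is worth keeping in mind: the pattern cannot tell a protocol-relative URL from a doubled slash inside a path, so inputs that may contain such paths would need a stricter rule.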
    
    The asker selected this answer as the best answer.