dongzou9047 2017-04-19 12:19
30 views
Answer accepted

Extract URLs from a string when there are no spaces between them

Let's say I have a string like this:

$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";

and I want to get an array like this:

array(
    [0] => "http://foo.com/bar",
    [1] => "https://bar.com",
    [2] => "//foo.com/foo/bar"
);

I'm looking for something like:

preg_split("~((https?:)?//)~", $urlsString, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);

where PREG_SPLIT_DELIM_CAPTURE is documented as:

If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well.

That said, the above preg_split returns:

array (size=3)
  0 => string '' (length=0)
  1 => string 'foo.com/bar' (length=11)
  2 => string 'bar.com//foo.com/foo/bar' (length=24)

Any idea what I'm doing wrong, or any other approach I could take?

PS: I was using this regex until I realized that it doesn't cover this case.

Edit:

As @sidyll pointed out, I'm missing the $limit argument in the preg_split call, so the flags were being passed as the limit. Anyway, there is something wrong with my regex as well, so I will use @WiktorStribiżew's suggestion.
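
For reference, here is a minimal sketch of what the preg_split route could look like once $limit is supplied. It is not the accepted approach, only an illustration of the point above. Two things in it are assumptions rather than part of the original question: the inner group is made non-capturing so that each delimiter is returned once, and every delimiter is assumed to be followed by non-empty text so that the pieces pair up cleanly.

```php
<?php
$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";

// Pass -1 (no limit) as the third argument so the flags land in the fourth.
// The inner group is non-capturing here (a change from the original pattern)
// so that each delimiter appears only once in the result.
$parts = preg_split(
    '~((?:https?:)?//)~',
    $urlsString,
    -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);
// $parts now alternates: delimiter, remainder, delimiter, remainder, ...

// Glue each delimiter back onto the piece that follows it.
$urls = array_map(
    function (array $pair) {
        return implode('', $pair);
    },
    array_chunk($parts, 2)
);

print_r($urls);
```

Because the delimiters come back as separate elements and have to be re-joined, matching the URLs directly with preg_match_all tends to be simpler than splitting.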


1 answer

  • douzhan8303 2017-04-19 12:25

    You may use preg_match_all with the following regex:

    '~(?:https?:)?//.*?(?=$|(?:https?:)?//)~'
    

    See the regex demo.

    Details:

    • (?:https?:)? - https: or http:, optional (1 or 0 times)
    • // - a double /
    • .*? - any 0+ characters other than line breaks, as few as possible, up to the first position where the lookahead that follows succeeds
    • (?=$|(?:https?:)?//) - a positive lookahead that requires, without consuming it, either of the two:
      • $ - end of string
      • (?:https?:)?// - https: or http:, optional (1 or 0 times), followed by a double /

    Below is a PHP demo:

    $urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";
    preg_match_all('~(?:https?:)?//.*?(?=$|(?:https?:)?//)~', $urlsString, $urls);
    // $urls[0] holds the full matches; $urls itself is an array of arrays.
    print_r($urls[0]);
    // => Array ( [0] => http://foo.com/bar [1] => https://bar.com [2] => //foo.com/foo/bar )
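
The demo above could be wrapped in a small helper, sketched below. The function name splitConcatenatedUrls is hypothetical and not part of the original answer. The sketch also illustrates a limitation that follows from the pattern: any // inside a path is treated as the start of a new URL.

```php
<?php
/**
 * Hypothetical helper: extract URLs that were concatenated without separators.
 * A URL is taken to start at an optional "http:" or "https:" followed by "//"
 * and to run until the next such start or the end of the string.
 */
function splitConcatenatedUrls(string $input): array
{
    preg_match_all('~(?:https?:)?//.*?(?=$|(?:https?:)?//)~', $input, $matches);

    // Index 0 holds the full matches; it is an empty array when nothing matched.
    return $matches[0];
}

print_r(splitConcatenatedUrls("http://foo.com/barhttps://bar.com//foo.com/foo/bar"));

// Limitation: a double slash inside a path also starts a new "URL".
print_r(splitConcatenatedUrls("http://a.com//b"));
```

If the input may contain paths with a double slash, the pattern would need an extra rule to tell them apart from protocol-relative URLs, which the string alone may not make possible.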
    