dt2002 2013-04-27 08:07
Views: 281

How do I get URLs from a text string?

I have a string that contains URLs and other text. I want to get all the URLs into the $matches array, but the following code doesn't capture all of them:

$matches = array();
$text = "soundfly.us schoollife.edu hello.net some random news.yahoo.com text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988 and others will en.wikipedia.org/wiki/Country_music URL";
preg_match_all('$\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]$i', $text, $matches);
print_r($matches);

The code above gets:

http://tinyurl.com/9uxdwc
http://google.com
http://tinyurl.com/787988

but it misses the URLs that have no scheme, such as the following 4:

schoollife.edu 
hello.net 
news.yahoo.com
en.wikipedia.org/wiki/Country_music

Can you please show me, with an example, how to modify the code above to get all the URLs?

1 answer

  • downloadbooks_2014 2014-06-10 17:00
    Is this what you need?

    $matches = array();
    $text = "soundfly.us schoollife.edu hello.net some random news.yahoo.com text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988 and others will en.wikipedia.org/wiki/Country_music URL";
    preg_match_all('$\b((https?|ftp|file)://)?[-A-Z0-9+&@#/%?=~_|!:,.;]*\.[-A-Z0-9+&@#/%=~_|]+$i', $text, $matches);
    print_r($matches);
    

    I made the protocol part optional, required a dot separating the domain from the TLD, and added a "+" to capture the full string after that dot (the TLD plus any extra information, such as a path).

    The result ($matches[0]) is:

    [0] => soundfly.us 
    [1] => schoollife.edu 
    [2] => hello.net 
    [3] => news.yahoo.com 
    [4] => http://tinyurl.com/9uxdwc 
    [5] => http://google.com 
    [6] => http://tinyurl.com/787988 
    [7] => en.wikipedia.org/wiki/Country_music
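
A minimal sketch of how this pattern could be wrapped to return just that flat list. In its default mode, `preg_match_all()` puts the full matches in `$matches[0]` and the capture groups (here the optional scheme) in `$matches[1]` and `$matches[2]`, so `print_r($matches)` prints more than the list shown. The helper name `extractUrls` is hypothetical and used only for illustration.

```php
<?php
// extractUrls() is a hypothetical helper name, not a PHP built-in.
function extractUrls($text)
{
    // Pattern from the answer; "$" is the regex delimiter, "i" makes it case-insensitive.
    $pattern = '$\b((https?|ftp|file)://)?[-A-Z0-9+&@#/%?=~_|!:,.;]*\.[-A-Z0-9+&@#/%=~_|]+$i';
    preg_match_all($pattern, $text, $matches);
    // Index 0 holds the full matches; 1 and 2 hold the scheme capture groups.
    return $matches[0];
}

$text = "soundfly.us schoollife.edu hello.net some random news.yahoo.com text http://tinyurl.com/9uxdwc some http://google.com random text http://tinyurl.com/787988 and others will en.wikipedia.org/wiki/Country_music URL";
print_r(extractUrls($text));
```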
    

    It also works with IP addresses, because the pattern only requires a dot to be present. Tested with the strings "192.168.0.1" and "192.168.0.1/test/index.php".
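
The IP address check can be reproduced with a short sketch like the one below. The third sample string is an assumption added here to show the flip side: since the pattern only requires a dot, any dotted token such as a decimal number matches as well, so the results may need filtering depending on the input.

```php
<?php
// Same pattern as in the answer, with "$" as the delimiter.
$pattern = '$\b((https?|ftp|file)://)?[-A-Z0-9+&@#/%?=~_|!:,.;]*\.[-A-Z0-9+&@#/%=~_|]+$i';

// The first two samples are the strings mentioned in the answer;
// the third is a hypothetical non-URL added for illustration.
$samples = array(
    "192.168.0.1",
    "192.168.0.1/test/index.php",
    "pi is about 3.14",
);

// Collect the full matches ($m[0]) for each sample string.
$found = array();
foreach ($samples as $sample) {
    preg_match_all($pattern, $sample, $m);
    $found[$sample] = $m[0];
}
print_r($found);
```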
