dqzve68846 2012-04-29 11:32

Matching URLs against an array of shortening services

Consider the following list of URLs:

1 http://www.cnn.com/international/stories/423423532
2 http://www.traderscreener.com/blah
3 http://is.gd/fsdaGdfd3
4 http://goo.gl/23V534
5 http://bit.ly/54HFD
6 http://stackoverflow.com/question/ask

I would like to expand shortened URLs to their original form:

$headers = get_headers($URL, 1);
if (!empty($headers['Location'])) {
  $headers['Location'] = (array) $headers['Location'];
  $URL = array_pop($headers['Location']);
}

However, I need to match all URLs against an array of shortening services:

$array = array(
  'is.gd', 'bit.ly', 'goo.gl', 'wibi.us', 'tinyurl.com' // etc
);

In this case, the match should single out URLs 3, 4, and 5. I believe the easiest way to do this would be to grab the *** in http://***/blah. Since I have little experience with regex, what regex would I need? Or is there a better way to approach this?

3 answers

  • duanhegn231318 2012-04-29 11:42

    By far the easiest way to do this is not to build a blacklist at all. Instead, request the URL and see whether it redirects: send a HEAD request and look at the status code. If it is 3xx, there is a redirect, so read the "Location" header and use that as the new URL.
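
The check described above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the function names `redirect_target` and `expand_url` are made up for this example, it assumes PHP 7.1 or later (for the context argument of `get_headers`), and it follows a single redirect hop without resolving relative `Location` values.

```php
<?php
// Hypothetical helper: given the raw header lines of ONE response,
// return the Location value if the status is 3xx, otherwise null.
function redirect_target(array $headerLines): ?string
{
    // The first line is the status line, e.g. "HTTP/1.1 301 Moved Permanently".
    if (!isset($headerLines[0])
        || !preg_match('#^HTTP/\S+\s+(\d{3})#', $headerLines[0], $m)) {
        return null;
    }
    $status = (int) $m[1];
    if ($status < 300 || $status >= 400) {
        return null; // not a redirect
    }
    foreach ($headerLines as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null; // 3xx without a Location header (e.g. 304)
}

// Hypothetical wrapper: send a HEAD request without following redirects
// and return the redirect target, or the original URL if there is none.
function expand_url(string $url): string
{
    $context = stream_context_create([
        'http' => [
            'method'          => 'HEAD',
            'follow_location' => 0,  // inspect the first response only
            'timeout'         => 5,  // seconds; an arbitrary choice
        ],
    ]);
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return $url; // request failed; leave the URL untouched
    }
    return redirect_target($headers) ?? $url;
}
```

Each URL from the list would then be passed through `expand_url()`; URLs that do not redirect come back unchanged. Keeping the header parsing in its own function means it can be checked without any network access.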

    This answer was accepted by the asker as the best answer.