du7535 2012-07-21 22:10

Getting unique link values in PHP is not working

I have a string of text from which I extract URLs with a PHP regex. There can be any number of links, so I'm using

 preg_match_all

The problem is that when the text contains one link, the regex reports three. When I apply array_unique, it filters out the middle value, but not the last one.

Here is the code:

 $bodyMessage = imap_body($hMail,$idxMsg);
 $bodyMessage = quoted_printable_decode($bodyMessage);

 preg_match_all('((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)', $bodyMessage, $matches, PREG_PATTERN_ORDER);
 $links = array_unique($matches[0]);
 print_r($links); 

The output of print_r($links) is:

 Array ( [0] => http://usnews.msnbc.msn.com/_news/2012/07/20/12861792-6-year-old-girl-confirmed-to-have-been-killed-in-colorado-theater-shootings?lite 
 [2] => http://usnews.msnbc.msn.com/_news/2012/07/20/12861792-6-year-old-girl-confirmed-to-have-been-killed-in-colorado-theater-shootings?lite

The body of the email that it parses is:

 --20cf300e4d7d02c34004c55e1489 Content-Type: text/plain; charset=ISO-8859-1 @bill http://usnews.msnbc.msn.com/_news/2012/07/20/12861792-6-year-old-girl-confirmed-to-have-been-killed-in-colorado-theater-shootings?lite --20cf300e4d7d02c34004c55e1489 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable @bill 

Any ideas? Thanks!

Edit:

I followed the suggestion to trim the values, but that returns an array containing only a single empty value:

 function trims($l){
                    trim($l);   
                }
                $links = $matches[0];
                $trimmedLinks = array_map("trims", $links);
                $trimmedLinks = array_unique($trimmedLinks);
                print_r($trimmedLinks); // = Array ( [0] => ) 
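
Note on the snippet above: the `trims` callback calls trim() but never returns the result, so array_map() collects null for every element, and array_unique() then collapses those into one empty entry. A minimal sketch of the corrected step, using made-up sample values (an assumption, not the real matches) and passing the built-in trim directly as the callback:

```php
<?php
// Hypothetical sample data standing in for $matches[0].
$links = [
    ' http://example.com/a?lite ',
    'http://example.com/a?lite',
    "http://example.com/a?lite\r\n",
];

// trim() returns the stripped string, so array_map() keeps real values.
$trimmedLinks = array_map('trim', $links);

// array_unique() preserves keys; array_values() reindexes from 0.
$trimmedLinks = array_values(array_unique($trimmedLinks));

print_r($trimmedLinks);
```

If a custom callback is preferred, it needs an explicit `return trim($l);`.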

EDIT:

I think this might have something to do with grabbing the message body from IMAP. When I copy and paste the string of text returned by IMAP and assign it to $bodyMessage directly, it works. Suggestions?

1 answer

  • dongyu4863 2012-07-21 22:18

    You should use a pattern like this:

    ((?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)
    

    with non-capturing groups. Putting ?: right after an opening parenthesis makes the group non-capturing. The resulting array will then be:

    Array ( [0] => http://usnews.msnbc.msn.com/_news/2012/07/20/12861792-6-year-old-girl-confirmed-to-have-been-killed-in-colorado-theater-shootings?lite )
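
To illustrate what non-capturing groups change, here is a minimal sketch with a simplified pattern and a made-up input string (both are assumptions, not the asker's data). With PREG_PATTERN_ORDER, `$matches[0]` holds the full matches and every capturing group adds one more sub-array. With non-capturing groups only `$matches[0]` remains.

```php
<?php
// Hypothetical input containing the same link twice.
$text = 'see http://example.com/a?x=1 and http://example.com/a?x=1 again';

// Two capturing groups: $capturing has indexes 0 (full match), 1 and 2.
preg_match_all('~(https?|ftp):(//)[\w.:/?=&%#-]*~', $text, $capturing, PREG_PATTERN_ORDER);

// Same pattern with (?:...) groups: only index 0 is filled.
preg_match_all('~(?:https?|ftp):(?://)[\w.:/?=&%#-]*~', $text, $nonCapturing, PREG_PATTERN_ORDER);

// The full matches are identical either way; deduplicate them.
$links = array_values(array_unique($nonCapturing[0]));

print_r($links);
```

Either way the full matches live in index 0, so array_unique() should be applied to that sub-array rather than to the whole `$matches` array.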
    

    Edit: The answer to this problem is to use imap_fetchbody instead of imap_body.
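
A rough sketch of that suggestion: imap_body returns the whole raw body, which for the multipart message in the question includes both the text/plain and the text/html part, so the same link can appear more than once. imap_fetchbody fetches a single section. The helper names below are hypothetical, the pattern is simplified, and treating section "1" as the text/plain part is an assumption that holds for many multipart/alternative messages but not for all of them.

```php
<?php
// Hypothetical helper: return the unique links found in one body part.
function extractLinks(string $text): array
{
    preg_match_all('~(?:https?|ftp)://[\w.:/?=&%#-]*~', $text, $matches);
    return array_values(array_unique($matches[0]));
}

// Hypothetical wrapper; $hMail and $idxMsg are the IMAP connection and
// message number used in the question.
function fetchLinksFromFirstPart($hMail, int $idxMsg): array
{
    // Assumption: section "1" is the text/plain part of the message.
    $part = imap_fetchbody($hMail, $idxMsg, '1');
    if ($part === false) {
        return [];
    }
    // Decode here (e.g. quoted_printable_decode) only if that part's
    // Content-Transfer-Encoding actually requires it.
    return extractLinks($part);
}
```

Usage would be along the lines of `$links = fetchLinksFromFirstPart($hMail, $idxMsg);`. The section's real type and encoding can be checked with imap_fetchstructure before relying on the assumption above.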

    (Selected by the asker as the best answer.)
