dongshao1156 2010-05-16 21:48

Problem with URL and email regular expressions when searching text

I'm having problems with regular expressions that I got from regexlib. I am trying to run preg_replace() on some text and want to replace/remove email addresses and URLs (http/https/ftp).

The code I have is:

$sanitiseRegex = array(
    'email' => /'^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$/',
    'http' => '/^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$/',        
);

$replace = array(
    'xxxxx',
    'xxxxx'
);

$sanitisedText = preg_replace($sanitiseRegex, $replace, $text);

However, I am getting the following error: Unknown modifier '/', and $sanitisedText is null.

Can anyone see the problem with what I am doing or why the regex is failing?

Thanks

1 answer

  • donglu8779 2010-05-16 21:49

    For a start, your email string is opened incorrectly:

    'email' => /'^([a-zA-Z0-9_\-\.
    // should be
    'email' => '/^([a-zA-Z0-9_\-\.
    

    The other problem is that you use / both as a character to match and as the delimiter at the start/end of your URL regex, without escaping it inside the pattern. The simplest solution is to use a different character to delimit the regex, e.g.:

    'http' => '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$@'
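
The other route, keeping / as the delimiter and escaping every literal slash as \/, might look like the sketch below. The pattern is shortened to scheme and host for readability, and $sample is a hypothetical input, not something from the question.

```php
<?php
// Same "/" delimiter as in the question, but each literal "/" is written "\/".
// Shortened to scheme + host only; $sample is a made-up input for illustration.
$escaped = '/^(http|https|ftp)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}/';
$sample  = 'ftp://files.example.org/pub';

// preg_match() returns 1 on a match and fills $m[0] with the matched text.
$matched = preg_match($escaped, $sample, $m);
echo $matched, ' ', $m[0], "\n";
```

Escaping works, but every slash needs attention, which is why switching the delimiter tends to be less error-prone for URL patterns.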
    

    What is happening is that PHP takes '^(http|https|ftp)\:' as the whole regex, then starts looking for modifiers. The first character after the 'end' of the regex is another '/', which is not a valid modifier, hence the error message.
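
A minimal sketch of that behaviour, assuming a hypothetical input string $url and a pattern shortened to scheme and host. The @ in front of the first preg_match() call is PHP's error-suppression operator and is there only to hide the warning; it is unrelated to the @ used as a delimiter.

```php
<?php
// Hypothetical sample input; any URL would do.
$url = 'http://example.com/path';

// "/" as delimiter: the pattern ends at the first unescaped "/" after "\:",
// so the following "/" is read as a modifier and the pattern fails to compile.
$broken = '/^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}/';
$brokenResult = @preg_match($broken, $url); // "@" here only hides the warning

// "@" as delimiter: "/" inside the pattern is now an ordinary character.
$fixed = '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}@';
$fixedResult = preg_match($fixed, $url);

var_dump($brokenResult); // false: the pattern could not be compiled
var_dump($fixedResult);  // 1: the pattern compiled and matched
```

The false return is also why preg_replace() hands back null in the question: a pattern that cannot be compiled makes the whole call fail.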

    EDIT: A quick change that might fix the issue of the URL pattern not matching. You could try the following instead:

    'http' => '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?(/[a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~]*)?$@'
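
Putting the pieces together, a sketch of the whole sanitiser could look like this. One assumption goes beyond the answer: the ^ and $ anchors are dropped, because with them each pattern only matches when the entire text is a single email address or URL, which does not fit a search-and-replace over longer text. The sample $text is hypothetical, and a single replacement string is used for both patterns.

```php
<?php
// Assumption (not part of the answer): "^" and "$" are removed so the
// patterns can match inside a longer text rather than the whole string.
$sanitiseRegex = array(
    // Opening quote fixed; "/" is a safe delimiter here, the pattern has no "/".
    'email' => '/([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/',
    // "@" as delimiter, because the pattern itself contains "/".
    'http'  => '@(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?(/[a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~]*)?@',
);

// Hypothetical input for illustration.
$text = 'Contact me at john.doe@example.com or visit http://www.example.com/page?id=1 today.';

// A single replacement string is applied to every pattern in the array.
$sanitisedText = preg_replace($sanitiseRegex, 'xxxxx', $text);

echo $sanitisedText, "\n"; // Contact me at xxxxx or visit xxxxx today.
```

These patterns are still the regexlib originals, so they inherit the same limits (for example, top-level domains of two to three letters in the URL pattern); treat them as a starting point rather than a complete matcher.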
    
    Accepted by the asker as the best answer.