2013-07-23 16:20 · 43 views

PHP regex keyword matching

I have a text field where the user will enter comma separated keywords or key phrases, and the server will then use these values to check multiple bodies of text for matches.

Basically, I need to match an exact phrase that may contain spaces, case-insensitively, within a body of text.

I can match keywords easily, by generating the following regex:

Example keywords: peanut, butter, jelly

Regex generated: /peanut|butter|jelly/i

However, phrases containing spaces do not work, even if I replace the spaces in the given values with \s.

Example: peanut butter, jelly sandwich, delicious

Regex: /peanut\sbutter|jelly\ssandwich|delicious/i

What would be a correct regex to match the phrases exactly, case-insensitively, using PHP's preg_match?



This is what I am doing:

$keywordsArray = array_map( 'trim', explode( ',', $keywords ) );
$keywordsArrayEscaped = array_map( 'preg_quote', $keywordsArray );
$keywordsRegex = '/' . implode( '|', $keywordsArrayEscaped ) . '/i';

This generates the expressions described above (without replacing spaces with \s, since that didn't work).

Following that, I simply call preg_match( $keywordsRegex, $text );


2 answers

  • Accepted
    dongzhang7961 dongzhang7961 2013-07-23 16:31

    I don't see why it wouldn't work with spaces or \s. It should. But to answer the question you asked in general terms, the way to match exact phrases in a regex is to surround them with \Q and \E:

    /\Q<phrase 1>\E|\Q<phrase 2>\E|\Q<phrase 3>\E/

    That's normally used for text that contains escapes or regex metacharacters. You really shouldn't need that for spaces.
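As a rough illustration of this answer, the sketch below builds the pattern both ways: with \Q...\E around each phrase, and with preg_quote() given the '/' delimiter as its second argument. The input string and variable names are hypothetical, modelled on the question's example, and the \Q...\E variant assumes no phrase contains "\E" or the "/" delimiter.

```php
<?php
// Hypothetical input, taken from the example in the question.
$keywords = 'peanut butter, jelly sandwich, delicious';
$phrases  = array_map('trim', explode(',', $keywords));

// Variant 1: wrap each phrase in \Q...\E so PCRE treats it literally.
// Assumption: no phrase contains the sequence "\E" or the "/" delimiter.
$quoted  = array_map(function ($p) { return '\Q' . $p . '\E'; }, $phrases);
$regexQE = '/' . implode('|', $quoted) . '/i';

// Variant 2: preg_quote(), passing '/' so the delimiter is escaped as well.
$escaped = array_map(function ($p) { return preg_quote($p, '/'); }, $phrases);
$regexPQ = '/' . implode('|', $escaped) . '/i';

// A literal space in the pattern matches a space in the text.
$text = 'I had a Peanut Butter sandwich.';
var_dump(preg_match($regexQE, $text)); // int(1)
var_dump(preg_match($regexPQ, $text)); // int(1)
```

Both variants match here, which is consistent with the answer's point that plain spaces need no special handling. If matching still fails in practice, the text being searched may be worth inspecting (for example, for non-breaking spaces or line breaks between the words), though that is a guess rather than something the question confirms.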

  • dongzaobei0942 dongzaobei0942 2013-07-23 16:40

    The only issue I could really find with your code is that you aren't passing the third argument, which receives the match results:

    preg_match($keywordsRegex, $text, $results);
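A minimal sketch of what the third argument provides, using a hypothetical pattern and text modelled on the question. Note that the argument is optional: preg_match() returns 1 or 0 either way, so omitting it would not by itself stop phrases from matching.

```php
<?php
// Hypothetical pattern and text, modelled on the question's example.
$keywordsRegex = '/peanut butter|jelly sandwich|delicious/i';
$text = 'A Jelly Sandwich can be delicious.';

// preg_match() stops at the first match; $results[0] holds the matched text.
$found = preg_match($keywordsRegex, $text, $results);
// $found === 1, $results[0] === 'Jelly Sandwich'

// preg_match_all() collects every match, which helps when you need to know
// which of the phrases occurred in the text.
$count = preg_match_all($keywordsRegex, $text, $all);
// $count === 2, $all[0] === ['Jelly Sandwich', 'delicious']
```

preg_match_all() is a swapped-in alternative here, not something either answer proposed; it is shown only because a keyword check often needs every hit rather than just the first.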