doudi4137 2015-08-31 07:19

preg_match: cannot find a substring with trailing special characters

I have a function that uses preg_match to check whether a substring occurs in another string. Today I realized that if the substring has trailing special characters, such as regular expression metacharacters (. \ + * ? [ ^ ] $ ( ) { } = ! < > | : -) or @, preg_match can't find the substring even though it is there.

This works and prints "A match was found.":

$find = "website scripting";
$string =  "PHP is the website scripting language of choice.";

if (preg_match("/\b" . $find . "\b/i", $string)) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}

But this doesn't; it prints "A match was not found.":

$find = "website scripting @";
$string =  "PHP is the website scripting @ language of choice.";

if (preg_match("/\b" . $find . "\b/i", $string)) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}

I have tried preg_quote, but it doesn't help.

Thank you for any suggestions!

Edit: Word boundary is required, that's why I use \b. I don't want to find "phone" in "smartphone".

2 answers

  • dre26973 2015-08-31 07:24

    Instead of \b, you can use lookarounds to check that the characters on either side of the search term are not word characters:

    $find = "website scripting @";
    $string =  "PHP is the website scripting @ language of choice.";
    
    if (preg_match("/(?<!\\w)" . preg_quote($find, '/') . "(?!\\w)/i", $string)) {
        echo "A match was found.";
    } else {
        echo "A match was not found.";
    }
    

    See IDEONE demo

    Result: A match was found.

    Note the doubled backslash before w in (?<!\\w) and (?!\\w): inside a double-quoted PHP string the backslash is itself an escape character, so it is doubled to make sure a literal backslash reaches the regex engine.
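
As a small illustration of the escaping note above, the following sketch compares the same pattern fragment written in double and single quotes. The variable names are chosen here for illustration and are not part of the original answer.

```php
<?php
// In double-quoted strings PHP processes escapes such as \\ and \t
// before the regex engine ever sees the pattern.
$doubled = "(?<!\\w)";  // backslash written explicitly as \\
$single  = '(?<!\w)';   // single quotes: only \\ and \' are escapes

var_dump($doubled === $single); // bool(true)

// An escape that PHP does recognize shows why doubling is the safe habit:
var_dump(strlen("\t")); // int(1), a tab character
var_dump(strlen('\t')); // int(2), a backslash followed by "t"
```

Writing regex patterns in single quotes avoids most of this bookkeeping.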

    The preg_quote function is necessary because the search term can contain regex special characters, which must be escaped to be matched literally. Its second argument, '/', makes it escape the pattern delimiter as well.
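
For context on why the original pattern fails: \b only matches between a word character and a non-word character. In "scripting @ language" the @ is followed by a space, so both sides of that position are non-word characters and \b cannot match there. Below is a minimal sketch of the lookaround approach wrapped in a helper; the function name contains_phrase is hypothetical and does not come from the answer.

```php
<?php
// Hypothetical helper (the name is an assumption made for this example).
// Returns true when $find occurs in $string with no word character
// directly before or after it.
function contains_phrase(string $find, string $string): bool
{
    // preg_quote escapes regex metacharacters; '/' is passed because
    // it is the pattern delimiter.
    $pattern = '/(?<!\w)' . preg_quote($find, '/') . '(?!\w)/i';
    return preg_match($pattern, $string) === 1;
}

$string = "PHP is the website scripting @ language of choice.";

// The \b version: there is no word boundary between "@" and the space.
var_dump(preg_match('/\b' . preg_quote("website scripting @", '/') . '\b/i', $string)); // int(0)

var_dump(contains_phrase("website scripting @", $string)); // bool(true)
var_dump(contains_phrase("phone", "I own a smartphone.")); // bool(false)
```

The last call shows that the requirement from the question still holds: "phone" is not found inside "smartphone".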

    UPDATE

    Alternatively, you can build the regex so that a word boundary is added only on the sides where the keyword begins or ends with a word character, but the performance will be worse compared with the approach above. Here is sample code:

    $string = "PHP is the website scripting @ language of choice.";

    $find = "website scripting @";
    // Escape regex metacharacters; '/' is passed because it is the pattern delimiter.
    $pattern = preg_quote($find, '/');
    if (preg_match('/\w$/u', $find)) {   // Keyword ends with a word character: add a trailing word boundary
        $pattern .= '\\b';
    }
    if (preg_match('/^\w/u', $find)) {   // Keyword starts with a word character: add a leading word boundary
        $pattern = '\\b' . $pattern;
    }

    if (preg_match("/" . $pattern . "/ui", $string)) {
        echo "A match was found.";
    } else {
        echo "A match was not found.";
    }
    

    See another IDEONE demo
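
The two approaches do not always agree. When the keyword begins or ends with a non-word character, the conditional-boundary version places no check on that side, while the lookaround version still rejects an adjacent word character. The sketch below restates both as functions to show one such case; both function names are hypothetical.

```php
<?php
// Both function names are assumptions made for this comparison.
function match_lookaround(string $find, string $string): bool
{
    return preg_match('/(?<!\w)' . preg_quote($find, '/') . '(?!\w)/ui', $string) === 1;
}

function match_conditional_boundary(string $find, string $string): bool
{
    $pattern = preg_quote($find, '/');
    if (preg_match('/\w$/u', $find)) {  // keyword ends with a word character
        $pattern .= '\b';
    }
    if (preg_match('/^\w/u', $find)) {  // keyword starts with a word character
        $pattern = '\b' . $pattern;
    }
    return preg_match('/' . $pattern . '/ui', $string) === 1;
}

// "@home" starts with a non-word character and is preceded by "r".
var_dump(match_lookaround("@home", "user@home"));           // bool(false)
var_dump(match_conditional_boundary("@home", "user@home")); // bool(true)
```

Which result is wanted depends on whether a keyword such as "@home" should count as found when it is glued to a preceding word.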
