douwen9343 2018-09-24 19:22
Views: 107
Accepted

preg_replace does not remove all whitespace characters from a string

I've got the following code, which should compare two strings after stripping all whitespace. Here is a simplified version of the function:

function not_same($type, $org_str1, $str2) {

    $str1 = preg_replace('/\s+/', '', $org_str1);
    $str2 = preg_replace('/\s+/', '', $str2);

    $tries = [];
    $tries[] = ["str1" => $str1, "str2" => $str2, "encoded1" => urlencode($str1), "encoded2" => urlencode($str2)];        

    if($str1 == $str2) {
        return true;
    } else {
        return false;
    }

}

I'm using this to check whether a computer's processor matches the model recorded in my database: $org_str1 is the CPU my client reports for the computer it is running on, and $str2 is the CPU my database says that model should have.

Sometimes these strings contain unneeded spaces, so I remove all of the whitespace before comparing, so that only the text itself is compared.

Now some computers are being reported as having the wrong CPU: the match fails because some whitespace is not removed.

In this specific case, I'm trying to compare the strings Client: Celeron® N3050 vs Server: Celeron® N3050. My server logs what is actually being compared each time, and for this client the log shows Client: Celeron® N3050 vs Server: Celeron®N3050

I tried copying and pasting this whitespace into a str_replace() call, but that did not solve the issue. I then got the idea of logging the strings with urlencode(), which lets me see exactly what this mysterious whitespace character is, but I am still at a loss as to how to fix the issue.

The strings after urlencode() are Client: Celeron%C2%AE%C2%A0N3050 vs Server: Celeron%C2%AEN3050

As you can see, there is still a whitespace character in my client string, encoded to %C2%A0. Why does preg_replace not get rid of this whitespace, and how can I programmatically remove it?

1 answer

  • dongwen7423 2018-09-24 19:34

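    \xC2\xA0 is the UTF-8 byte sequence for U+00A0, the Unicode non-breaking space. Without the u modifier the pattern works on single bytes, and \s does not match that sequence. Add the u modifier to your regex so that the subject is treated as UTF-8.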

    $raw = urldecode('Celeron%C2%AE%C2%A0N3050');
    
    var_dump(
        preg_replace('/\s+/', '', $raw),
        preg_replace('/\s+/u', '', $raw),
        urlencode($raw),
        urlencode(preg_replace('/\s+/u', '', $raw))
    );
    

    Output:

    string(16) "Celeron® N3050"
    string(14) "Celeron®N3050"
    string(24) "Celeron%C2%AE%C2%A0N3050"
    string(18) "Celeron%C2%AEN3050"
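
Applied to the comparison in the question, the fix could look like the sketch below. The function names strip_unicode_whitespace() and same_ignoring_whitespace() are hypothetical and do not come from the original code. The null check covers the case where preg_replace() fails because the input is not valid UTF-8; returning the input unchanged in that case is an assumption about the desired behaviour, not something the original post specifies.

```php
<?php
// Hypothetical helper (not part of the original post): removes all
// whitespace, including the non-breaking space U+00A0, from a UTF-8 string.
function strip_unicode_whitespace(string $s): string
{
    // With the u modifier the subject is treated as UTF-8, so \s can match
    // the two-byte sequence \xC2\xA0 as a single whitespace character.
    $result = preg_replace('/\s+/u', '', $s);

    // preg_replace() returns null on error, for example when $s is not
    // valid UTF-8. Assumption: in that case keep the input unchanged.
    return $result ?? $s;
}

// Hypothetical replacement for the comparison in the question.
function same_ignoring_whitespace(string $a, string $b): bool
{
    return strip_unicode_whitespace($a) === strip_unicode_whitespace($b);
}

$client = urldecode('Celeron%C2%AE%C2%A0N3050'); // contains U+00A0
$server = urldecode('Celeron%C2%AEN3050');

var_dump(same_ignoring_whitespace($client, $server)); // bool(true)
```

The strict === comparison is used here so that numeric-looking strings are not loosely compared as numbers.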
    