douwen9343 2018-09-24 19:22
Views: 107
Accepted

preg_replace does not remove all whitespace characters from a string

I have the following code, which should compare two strings after stripping all whitespace. Here is a simplified version of the function:

function not_same($type, $org_str1, $str2) {

    $str1 = preg_replace('/\s+/', '', $org_str1);
    $str2 = preg_replace('/\s+/', '', $str2);

    $tries = [];
    $tries[] = ["str1" => $str1, "str2" => $str2, "encoded1" => urlencode($str1), "encoded2" => urlencode($str2)];        

    if($str1 == $str2) {
        return true;
    } else {
        return false;
    }

}

I use this to check whether a computer's processor matches the model recorded in my database: $org_str1 is the CPU my client reports for the computer it is running on, and $str2 is the CPU my database says that model should have.

Sometimes these strings contain unneeded spaces, so I remove all whitespace before comparing, so that only the text itself is compared.

Now computers are being reported as having the wrong CPU: the match fails because some whitespace is not removed.

In this specific case, I'm trying to compare the string Client: Celeron® N3050 vs Server: Celeron® N3050. I'm logging each time what is actually being compared on my server, on my client it says it is comparing Client: Celeron® N3050 vs Server: Celeron®N3050

I tried copying and pasting this whitespace into a str_replace() call, but that did not solve the issue. I then logged the strings with urlencode(), which lets me see exactly what this mysterious whitespace character is, but I am still at a loss as to how to fix it.

The strings after urlencode() are Client: Celeron%C2%AE%C2%A0N3050 vs Server: Celeron%C2%AEN3050

As you can see, there is still a whitespace character in my client string, encoded to %C2%A0. Why does preg_replace not get rid of this whitespace, and how can I programmatically remove it?

1 answer

  • dongwen7423 2018-09-24 19:34

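    \xC2\xA0 is the UTF-8 encoding of U+00A0, the Unicode non-breaking space. Without the u modifier, the pattern is matched byte by byte, and by default \s matches neither of those two bytes. Add the u modifier to your regex so that the pattern and subject are treated as UTF-8.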

    $raw = urldecode('Celeron%C2%AE%C2%A0N3050');
    
    var_dump(
        preg_replace('/\s+/', '', $raw),
        preg_replace('/\s+/u', '', $raw),
        urlencode($raw),
        urlencode(preg_replace('/\s+/u', '', $raw))
    );
    

    Output:

    string(16) "Celeron® N3050"
    string(14) "Celeron®N3050"
    string(24) "Celeron%C2%AE%C2%A0N3050"
    string(18) "Celeron%C2%AEN3050"
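
Applied to the comparison in the question, the fix might look like the sketch below. The names strip_whitespace() and same_ignoring_whitespace() are hypothetical and not taken from the original code. One thing to keep in mind: with the u modifier, preg_replace() returns null when the subject is not valid UTF-8, so the sketch assumes that falling back to the byte-wise pattern is acceptable in that case.

```php
<?php
// Hypothetical helper (not from the original post): strip all whitespace,
// including Unicode whitespace such as U+00A0 (bytes \xC2\xA0).
function strip_whitespace(string $s): string
{
    // The u modifier makes PCRE treat the pattern and subject as UTF-8,
    // so \s also matches the non-breaking space.
    $clean = preg_replace('/\s+/u', '', $s);

    // With /u, preg_replace() returns null if $s is not valid UTF-8.
    // Assumption: a byte-wise fallback is good enough for such input.
    if ($clean === null) {
        $clean = preg_replace('/\s+/', '', $s) ?? $s;
    }

    return $clean;
}

// Hypothetical name: true when both strings are equal once whitespace is removed.
function same_ignoring_whitespace(string $a, string $b): bool
{
    return strip_whitespace($a) === strip_whitespace($b);
}

// The two values from the question, rebuilt from their urlencode() form.
$client = urldecode('Celeron%C2%AE%C2%A0N3050');
$server = urldecode('Celeron%C2%AEN3050');

var_dump(same_ignoring_whitespace($client, $server)); // bool(true)
```

Checking for null matters because a null result would otherwise be compared as an empty string, which could make two unrelated malformed strings look equal.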
    