Comparing strings with accents in PHP

I'm having problems comparing two strings that contain accents. This is my case:

The first string is: Master
The second string is: Máster Diseño Producción

Then, I need to remove the word Máster from the second string, because it's contained in the first string.

I have created a function to clean each string:

function sanear_string($cadena)
{
    $cadena = trim($cadena);

    $cadena = str_replace(
        array('á', 'à', 'ä', 'â', 'ª', 'Á', 'À', 'Â', 'Ä'),
        array('a', 'a', 'a', 'a', 'a', 'A', 'A', 'A', 'A'),
        $cadena
    );

    $cadena = str_replace(
        array('é', 'è', 'ë', 'ê', 'É', 'È', 'Ê', 'Ë'),
        array('e', 'e', 'e', 'e', 'E', 'E', 'E', 'E'),
        $cadena
    );

    $cadena = str_replace(
        array('í', 'ì', 'ï', 'î', 'Í', 'Ì', 'Ï', 'Î'),
        array('i', 'i', 'i', 'i', 'I', 'I', 'I', 'I'),
        $cadena
    );

    $cadena = str_replace(
        array('ó', 'ò', 'ö', 'ô', 'Ó', 'Ò', 'Ö', 'Ô'),
        array('o', 'o', 'o', 'o', 'O', 'O', 'O', 'O'),
        $cadena
    );

    $cadena = str_replace(
        array('ú', 'ù', 'ü', 'û', 'Ú', 'Ù', 'Û', 'Ü'),
        array('u', 'u', 'u', 'u', 'U', 'U', 'U', 'U'),
        $cadena
    );

    $cadena = str_replace(
        array('ñ', 'Ñ', 'ç', 'Ç'),
        array('n', 'N', 'c', 'C',),
        $cadena
    );

    // This part removes any unusual characters
    $cadena = str_replace(
        array("\\", "¨", "º", "-", "~",
            "#", "@", "|", "!", "\"",
            "·", "$", "%", "&", "/",
            "(", ")", "?", "'", "¡",
            "¿", "[", "^", "`", "]",
            "+", "}", "{", "¨", "´",
            ">", "<", ";", ",", ":",
            ".", " "),
        '',
        $cadena
    );


    return $cadena;
}

This takes care of the accent problem. Now I can use strpos to compare both strings: if the result is > 0, I know the word is contained. But I need some more help. Thanks in advance.


3 answers

dtng5978 2014-05-21 12:00

As usual when dealing with charset problems, you need to be extra careful about the character counts between multibyte strings and plain ASCII strings.

Your biggest problem here is that you remove some predefined characters from the cleaned string. That breaks the character-count coherence between the sanitized string and the original, which makes the removal much harder.
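
To illustrate the counts involved, here is a minimal sketch. It assumes the source file is saved as UTF-8 with precomposed accented letters; the variable names are only for illustration:

```php
<?php
$word = "Máster";
$text = "Diseño Máster";

// strlen() counts bytes; mb_strlen() counts characters.
echo strlen($word), "\n";             // 7, because "á" takes two bytes in UTF-8
echo mb_strlen($word, 'UTF-8'), "\n"; // 6

// strpos() returns a byte offset; mb_strpos() returns a character offset.
echo strpos($text, $word), "\n";                // 8, the two-byte "ñ" shifts the offset
echo mb_strpos($text, $word, 0, 'UTF-8'), "\n"; // 7
```

Feeding a byte offset to mb_substr, or a character offset to substr, is what makes the cut land in the wrong place once a multibyte character appears before the match.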

I'll use a modified version of your sanitizing function:

function sanitize($cadena) {
    $cadena = str_replace(
        array('á', 'à', 'ä', 'â', 'ª', 'Á', 'À', 'Â', 'Ä'),
        array('a', 'a', 'a', 'a', 'a', 'A', 'A', 'A', 'A'),
        $cadena
    );

    $cadena = str_replace(
        array('é', 'è', 'ë', 'ê', 'É', 'È', 'Ê', 'Ë'),
        array('e', 'e', 'e', 'e', 'E', 'E', 'E', 'E'),
        $cadena
    );

    $cadena = str_replace(
        array('í', 'ì', 'ï', 'î', 'Í', 'Ì', 'Ï', 'Î'),
        array('i', 'i', 'i', 'i', 'I', 'I', 'I', 'I'),
        $cadena
    );

    $cadena = str_replace(
        array('ó', 'ò', 'ö', 'ô', 'Ó', 'Ò', 'Ö', 'Ô'),
        array('o', 'o', 'o', 'o', 'O', 'O', 'O', 'O'),
        $cadena
    );

    $cadena = str_replace(
        array('ú', 'ù', 'ü', 'û', 'Ú', 'Ù', 'Û', 'Ü'),
        array('u', 'u', 'u', 'u', 'U', 'U', 'U', 'U'),
        $cadena
    );

    $cadena = str_replace(
        array('ñ', 'Ñ', 'ç', 'Ç'),
        array('n', 'N', 'c', 'C',),
        $cadena
    );


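    // mb_strtolower is multibyte-safe; plain strtolower only handles single-byte characters
    return mb_strtolower($cadena, 'UTF-8');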
}

The remove_word function follows:

function remove_word($haystack , $needle) {
    // sanitize input strings
    $haystack_san = sanitize($haystack);
    $needle_san = sanitize($needle);

    // Check for character loss
    if (mb_strlen($haystack_san, 'UTF-8') != mb_strlen($haystack, 'UTF-8') || mb_strlen($needle_san, 'UTF-8') != mb_strlen($needle, 'UTF-8')) {
        // Here for debugging purposes. You may want to drop it in production.
        echo "Lost some chars on the way. Aborting.\n";
        echo "     haystack: $haystack (".mb_strlen($haystack, "UTF-8").")\n";
        echo " haystack_san: $haystack_san (".mb_strlen($haystack_san, "UTF-8").")\n";
        echo "       needle: $needle (".mb_strlen($needle, "UTF-8").")\n";
        echo "   needle_san: $needle_san (".mb_strlen($needle_san, "UTF-8").")\n";
        return;
    }

    // Check if $needle is found in $haystack.
    // mb_strpos returns a character offset, which is what mb_substr expects
    // (strpos would return a byte offset).
    if (($pos = mb_strpos($haystack_san, $needle_san, 0, 'UTF-8')) !== false) {
        // Get the string before the word
        $new = mb_substr($haystack, 0, $pos, 'UTF-8');
        // If applicable, get the string after
        if (mb_strlen($haystack, 'UTF-8') - $pos - mb_strlen($needle, 'UTF-8') > 0)
            $new .= mb_substr($haystack, $pos + mb_strlen($needle, 'UTF-8'), NULL, 'UTF-8');
        // Return it
        return $new;
    }

    // If the word wasn't found, return $haystack as-is
    return $haystack;
}

echo remove_word("Hola, Máster Diseño Producción", "Master");
// "Hola,  Diseño Producción"

Note that:

This assumes your strings are UTF-8.
The code relies on the mb_* functions to handle multibyte characters.
This only removes the first occurrence of the word (you may call remove_word until the string no longer changes if you want to remove all occurrences).
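
The last note can also be handled inside a single function instead of calling remove_word repeatedly. The following is a minimal sketch, assuming UTF-8 input and PHP 7 or later. The names fold_accents and remove_all_words are hypothetical and not part of the code above, and the accent map is deliberately shortened:

```php
<?php
// Hypothetical helper: fold a few accented letters to ASCII, one character
// for one character, then lowercase. Assumes UTF-8 input.
function fold_accents(string $text): string
{
    $map = array(
        'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u', 'ñ' => 'n',
        'Á' => 'A', 'É' => 'E', 'Í' => 'I', 'Ó' => 'O', 'Ú' => 'U', 'Ñ' => 'N',
    );
    return mb_strtolower(strtr($text, $map), 'UTF-8');
}

// Hypothetical helper: remove every accent-insensitive, case-insensitive
// occurrence of $needle from $haystack.
function remove_all_words(string $haystack, string $needle): string
{
    $needle_folded = fold_accents($needle);
    $needle_len = mb_strlen($needle_folded, 'UTF-8');
    if ($needle_len === 0) {
        return $haystack; // nothing to remove
    }

    while (true) {
        $folded = fold_accents($haystack);
        // Offsets only carry over if folding kept the character count intact.
        if (mb_strlen($folded, 'UTF-8') !== mb_strlen($haystack, 'UTF-8')) {
            return $haystack;
        }
        $pos = mb_strpos($folded, $needle_folded, 0, 'UTF-8');
        if ($pos === false) {
            return $haystack; // no occurrence left
        }
        // Cut the match out of the original (still accented) string.
        $haystack = mb_substr($haystack, 0, $pos, 'UTF-8')
            . mb_substr($haystack, $pos + $needle_len, null, 'UTF-8');
    }
}

echo remove_all_words("Máster Diseño Master Producción", "Master"), "\n";
```

As with remove_word, the surrounding spaces are left in place, so you may want to collapse repeated spaces and trim the result afterwards.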

This answer was selected by the asker as the best answer.
