How to detect malformed UTF-8 strings in PHP?

The iconv function sometimes gives me this notice:

Notice: iconv() [function.iconv]: Detected an incomplete multibyte character in input string in [...]

Is there a way to detect illegal characters in a UTF-8 string before passing the data to iconv?

dongshi1880 2011-07-17 11:41
First, note that it is not possible to detect whether text belongs to a specific undesired encoding. You can only check whether a string is valid in a given encoding.

You can use the UTF-8 validity check built into preg_match (see the PHP Manual), available since PHP 4.3.5. With the u modifier, an invalid string never matches: the call returns 0 or false depending on the PHP version, with no further detail in the return value:

$isUTF8 = (preg_match('//u', $string) === 1);
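As a rough sketch of how that check might be wrapped for reuse, the snippet below defines a helper named is_valid_utf8. The helper name and the sample byte strings are assumptions made for illustration; they do not come from the original answer.

```php
<?php
// Hypothetical helper (name chosen for illustration): returns true only
// when $string is well-formed UTF-8.
function is_valid_utf8(string $string): bool
{
    // An empty pattern with the u modifier matches any valid UTF-8 subject.
    // On malformed input the call does not return 1 (it returns false or 0,
    // depending on the PHP version), so compare strictly against 1.
    return preg_match('//u', $string) === 1;
}

var_dump(is_valid_utf8("caf\xC3\xA9")); // bool(true): "é" as the two bytes C3 A9
var_dump(is_valid_utf8("caf\xE9"));     // bool(false): lone E9 byte, as in ISO-8859-1

// preg_last_error() tells why the most recent preg_* call failed; for
// malformed subjects it is expected to equal PREG_BAD_UTF8_ERROR.
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR);
```

Comparing against 1 rather than testing truthiness keeps the result a plain boolean whether the failing call returns 0 or false.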

Another possibility is mb_check_encoding (see the PHP Manual):

$validUTF8 = mb_check_encoding($string, 'UTF-8');
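To make the behaviour concrete, here is a small sketch that runs mb_check_encoding over a few hand-built byte strings, including a truncated sequence of the kind that triggers the "incomplete multibyte character" notice. The sample values and the $results array are illustrative assumptions, and the mbstring extension must be loaded.

```php
<?php
// Sample byte strings (illustrative assumptions, not from the original post).
$samples = [
    'valid euro sign'    => "\xE2\x82\xAC", // U+20AC as a complete 3-byte sequence
    'truncated sequence' => "\xE2\x82",     // the same character with its last byte missing
    'stray continuation' => "abc\x80",      // continuation byte with no lead byte before it
];

$results = [];
foreach ($samples as $label => $bytes) {
    // true when every byte sequence in $bytes is well-formed UTF-8
    $results[$label] = mb_check_encoding($bytes, 'UTF-8');
    printf("%-20s %s\n", $label, $results[$label] ? 'valid' : 'malformed');
}
```

Only the first sample is reported as valid; the other two are flagged as malformed.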

Another function you can use is mb_detect_encoding (see the PHP Manual):

$validUTF8 = (mb_detect_encoding($string, 'UTF-8', true) !== false);

It's important to set the strict parameter (the third argument) to true.

Additionally, iconv (see the PHP Manual) lets you transliterate or drop invalid sequences on the fly. (However, if iconv encounters such a sequence, it raises a notice; this behavior cannot be changed.)

echo 'TRANSLIT : ', iconv("UTF-8", "ISO-8859-1//TRANSLIT", $string), PHP_EOL;
echo 'IGNORE : ', iconv("UTF-8", "ISO-8859-1//IGNORE", $string), PHP_EOL;

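You can suppress the notice with @ and compare the length of the result with the length of the input; if they differ, iconv dropped invalid bytes:

$valid = (strlen($string) === strlen(@iconv('UTF-8', 'UTF-8//IGNORE', $string)));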
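Putting the pieces together, one possible approach is to validate the input first and only then hand it to iconv, so the notice from the question never fires. The sketch below is an assumption-laden illustration: the function name utf8_to_charset, its default target charset, and the choice to return null on failure are all invented here, and it relies on the mbstring and iconv extensions.

```php
<?php
// Hypothetical wrapper (an assumption for illustration): converts UTF-8 text
// to another charset, refusing malformed input instead of raising a notice.
function utf8_to_charset(string $string, string $target = 'ISO-8859-1'): ?string
{
    // Validate first so iconv() never sees an incomplete multibyte sequence.
    if (!mb_check_encoding($string, 'UTF-8')) {
        return null; // the caller decides how to handle malformed input
    }

    // iconv() can still fail when a character has no equivalent in the
    // target charset; @ hides that notice and the failure is reported as null.
    $converted = @iconv('UTF-8', $target, $string);

    return $converted === false ? null : $converted;
}

var_dump(utf8_to_charset("caf\xC3\xA9") === "caf\xE9"); // bool(true): converted to ISO-8859-1
var_dump(utf8_to_charset("caf\xC3"));                   // NULL: truncated sequence rejected
```

Returning null keeps the decision with the caller, who can log the bad input, fall back to //IGNORE, or reject the request.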

Check the examples on the iconv manual page as well.

You have not shared the source code where the notice is resulting from. You should add it if you want a more concrete suggestion.

