UTF-8, digits and regular expressions

This is what I've found in the Kohana3 validator rules:

public static function digit($str, $utf8 = FALSE)
{
    if ($utf8 === TRUE)
    {
        return (bool) preg_match('/^\pN++$/uD', $str);
    }
    else
    {
        return (is_int($str) AND $str >= 0) OR ctype_digit($str);
    }
}

Can someone give an example where passing the $utf8 parameter as true or false gives different results (to be precise: false positives for $utf8 == false)?

From what I remember, digits are ASCII-safe characters, and no other UTF-8 characters can be confused with them.

PS: to be more specific, is it possible to fool this check by passing something that would not look like a number in UTF-8 but would still pass the check with $utf8 == false?

dtp791357 2012-11-08 23:22
I gave the second part of your question a bit more alcohol, and my conclusion is that you can't hide an ASCII digit in a UTF-8 sequence. An ASCII digit must be a byte in the range 0x30..0x39, i.e. in the bit range 00110000..00111001.

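Multi-byte UTF-8 encodings use byte prefixes such as

11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

so every byte of a multi-byte sequence has its high bit set. The ASCII representation of a digit starts with a 0 bit, and therefore can't match at any position:

11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
00110000 00110000 00110000 00110000
▲        ▲        ▲        ▲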

So it's impossible for a multi-byte character to match in Latin-1/ASCII mode and also satisfy \pN in /u mode; only the ASCII digits themselves pass both checks. This ignores invalid encodings, of course.
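
To make the argument above concrete, here is a minimal sketch that runs both branches of the quoted check over a few sample strings. The helper names digit_ascii, digit_utf8 and has_ascii_digit_byte are made up for illustration and are not part of Kohana; digit_ascii is a string-only simplification of the original $utf8 == false branch. It assumes PHP 7 or later with the ctype extension and PCRE Unicode support.

```php
<?php
// Hypothetical helper: string-only version of the $utf8 === FALSE branch.
function digit_ascii(string $str): bool
{
    return ctype_digit($str);
}

// Hypothetical helper: the $utf8 === TRUE branch from the question.
function digit_utf8(string $str): bool
{
    return (bool) preg_match('/^\pN++$/uD', $str);
}

// Hypothetical helper: true if any single byte of $str lies in the
// ASCII digit range 0x30..0x39.
function has_ascii_digit_byte(string $str): bool
{
    foreach (str_split($str) as $byte) {
        $code = ord($byte);
        if ($code >= 0x30 && $code <= 0x39) {
            return true;
        }
    }
    return false;
}

$samples = [
    'ascii digits'        => '123',
    'arabic-indic digits' => "\u{0663}\u{0664}", // U+0663 U+0664
    'fullwidth digit'     => "\u{FF11}",         // U+FF11
    'roman numeral'       => "\u{2167}",         // U+2167, a letter-like number
    'letters'             => 'abc',
];

foreach ($samples as $label => $s) {
    printf(
        "%-20s ascii=%s utf8=%s digit-byte=%s\n",
        $label,
        var_export(digit_ascii($s), true),
        var_export(digit_utf8($s), true),
        var_export(has_ascii_digit_byte($s), true)
    );
}
```

In this sketch the non-ASCII numerals are accepted only by the /u branch, and none of their bytes falls in 0x30..0x39, so the difference between the two modes goes in one direction: $utf8 == true accepts more, while $utf8 == false is not fooled by them.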
This answer was selected by the asker as the best answer.
