dousui6488 2012-01-25 14:10

Unable to understand the difference between two preg_match patterns

In the original code (a Drupal core module), a previous developer commented out this line:

if (preg_match('/[^\x{80}-\x{F7} a-z0-9@_.\'-]/i', $name)) {

and instead, added:

if (preg_match('/[^\x{80}-\x{F7} a-z0-9@_.\'-]/iu', $name)) {

Can you help me understand the difference between these two? What does the u modifier do? In the PHP docs I found:

u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.

So I guess the previous developer had problems with interpreting special characters or something. I'm a bit puzzled; please advise.


  • doulin6761 2012-01-25 14:37

    The modifier is needed to process UTF-8 encoded input properly. A pattern like \xC1 should match the Unicode character U+00C1 (Á). When you encode Á in UTF-8 you get the two bytes \xC3\x81, so without the modifier the single byte \xC1 doesn't match. The "u" modifier makes the regex engine treat both the pattern and the subject as UTF-8, so it does match.

    Basically, when you work with UTF-8 encoded text, this is what will happen:

    <?php
    var_dump(preg_match('/\xC1/u', 'Á'));
    // => int(1), matches
    
    var_dump(preg_match('/\xC1/', 'Á'));
    // => int(0), doesn't match
    ?>
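
The byte-level view behind this can be checked directly. The following is only a small sketch, assuming the source file itself is saved as UTF-8 so that the literal 'Á' is stored as two bytes; the variable name $char is just for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so 'Á' is the two bytes C3 81.
$char = 'Á';

var_dump(bin2hex($char));                   // string(4) "c381"
var_dump(strlen($char));                    // int(2), strlen() counts bytes

// Without u, the pattern has to spell out the raw bytes to match.
var_dump(preg_match('/\xC3\x81/', $char));  // int(1)

// '.' consumes one byte without u and one code point with u.
var_dump(preg_match('/^.$/', $char));       // int(0)
var_dump(preg_match('/^.$/u', $char));      // int(1)
```

So without the modifier the engine sees two unrelated bytes, and with it the engine sees the single character U+00C1.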
    

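    In your case the first regular expression (without u) works on bytes: the negated class [^\x80-\xF7 ...] never matches any part of non-ASCII UTF-8 encoded text, because every byte of a multi-byte UTF-8 sequence lies in the range 0x80 - 0xF7. The second expression (with u) works on code points: it matches any Unicode character outside the range U+0080 - U+00F7 that is not one of the listed ASCII characters, so the if condition becomes true for names containing Cyrillic, Greek, Arabic, Hebrew, ...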
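
To see the difference concretely, here is a rough sketch that runs both patterns from the question over a few sample names. The names and the variable names ($withoutU, $withU, $results) are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Sketch only: sample names are invented; the file is assumed to be UTF-8.
$withoutU = '/[^\x{80}-\x{F7} a-z0-9@_.\'-]/i';   // subject scanned byte by byte
$withU    = '/[^\x{80}-\x{F7} a-z0-9@_.\'-]/iu';  // subject scanned code point by code point

$names = ['john_doe', 'José', 'Иван', 'a#b'];
$results = [];

foreach ($names as $name) {
    // preg_match() returns 1 if a character outside the allowed set is found.
    $results[$name] = [preg_match($withoutU, $name), preg_match($withU, $name)];
    printf("%s: without u = %d, with u = %d\n",
        $name, $results[$name][0], $results[$name][1]);
}
// john_doe: without u = 0, with u = 0
// José: without u = 0, with u = 0   (é is U+00E9, inside U+0080 - U+00F7)
// Иван: without u = 0, with u = 1   (Cyrillic is outside U+0080 - U+00F7)
// a#b: without u = 1, with u = 1    ('#' is not in the allowed ASCII set)
```

Since the code in the question is if (preg_match(...)), a result of 1 means the branch is taken for that name.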
