douhezi2285 2017-09-24 08:37

Using a UTF-8 regular expression with preg_match in PHP

I need a regular expression for German words containing ä, ü, etc.

When I test this regex on this website https://regex101.com/

/^\p{L}+$/u

everything works. On my server, however, I upload a CSV file and want to parse the words from it. When I call preg_match with the word "Benedikt"

preg_match("/^[\p{L}]+$/u", $attributes[0])

I get false. The CSV is encoded as UTF-8. When I convert it to ANSI, the match succeeds, but ä, ü, etc. are no longer displayed correctly, so I think the file should stay in UTF-8. But why does the call return false?


2 answers

  • douyong6585 2017-09-24 18:23

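    The problem occurs because your CSV file starts with a UTF-8 BOM (byte order mark, the bytes EF BB BF). The BOM ends up at the start of the first field, and because it is not a letter, the pattern ^[\p{L}]+$ cannot match that field. If you remove the BOM, the regex works perfectly. I have confirmed it with this code: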

    <html>
    <head>
    <meta charset="utf-8" /> 
    </head>
    <body>
    <?php
    function remove_utf8_bom($text)
    {
        $bom = pack('H*','EFBBBF');
        $text = preg_replace("/^$bom/", '', $text);
        return $text;
    }
    
    $csvContents = remove_utf8_bom(file_get_contents('udfser_new.csv'));
    $lines = str_getcsv($csvContents, "\n"); // parse the rows
    
    foreach ($lines as $line) {
        $row = str_getcsv($line, ";"); // parse the fields of one row
    
        $firstName = $row[0];
        $lastName = $row[1];
        echo 'First name: ' . $firstName . ' - Matches regex: ' . (preg_match("/^[\p{L}]+$/u", $firstName) ? 'yes' : 'no') . '<br>';
        echo 'Last name: ' . $lastName . ' - Matches regex: ' . (preg_match("/^[\p{L}]+$/u", $lastName) ? 'yes' : 'no') . '<br>';
    }
    ?>
    </body>
    </html>
    

    The regex matches the text successfully, and the ü in Glückmann is shown correctly on the page.
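To see the effect in isolation, here is a minimal sketch that does not need the CSV file. It simulates a first field that still carries the BOM and runs the same pattern on it before and after stripping. The helper name strip_utf8_bom is hypothetical (it is not part of the code above) and compares raw bytes instead of using preg_replace. The sketch assumes the script itself is saved as UTF-8 without a BOM.

```php
<?php
// Hypothetical helper (an assumption, not from the answer above):
// removes a leading UTF-8 BOM by comparing the first three raw bytes.
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

// Simulated first CSV field as it looks when the file starts with a BOM.
$withBom = "\xEF\xBB\xBF" . 'Benedikt';

// Dumping the first three bytes as hex makes the invisible BOM visible.
echo bin2hex(substr($withBom, 0, 3)), "\n"; // efbbbf

// U+FEFF is a format character, not a letter, so \p{L} does not match it.
var_dump(preg_match('/^\p{L}+$/u', $withBom));                    // int(0)
var_dump(preg_match('/^\p{L}+$/u', strip_utf8_bom($withBom)));    // int(1)

// Strings without a BOM pass through unchanged, umlauts included.
var_dump(preg_match('/^\p{L}+$/u', strip_utf8_bom('Glückmann'))); // int(1)
```

Note that preg_match returns 0 for "no match" and false only on an error, so checking the return value with === can help tell a failed match apart from a broken pattern or invalid input.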


    This answer was selected by the asker as the best answer.