How to use the PHP preg_replace function to convert Unicode code points to actual characters / HTML entities?

I want to convert a set of Unicode code points in string format to actual characters and/or HTML entities (either result is fine).

For example, if I have the following string assignment:

$str = '\u304a\u306f\u3088\u3046';

I want to use the preg_replace function to convert those Unicode code points to actual characters and/or HTML entities.

As per other Stack Overflow posts I saw for similar issues, I first attempted the following:

$str = '\u304a\u306f\u3088\u3046';
$str2 = preg_replace('/\u[0-9a-f]+/', '&#x$1;', $str);

However, whenever I attempt to do this, I get the following PHP error:

Warning: preg_replace() [function.preg-replace]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u

I tried all sorts of things like adding the u flag to the regex or changing /\u[0-9a-f]+/ to /\x{[0-9a-f]+}/, but nothing seems to work.

Also, I've looked at all sorts of other relevant pages/posts I could find on the web related to converting Unicode code points to actual characters in PHP, but either I'm missing something crucial, or something is wrong because I can't fix the issue I'm having.

Can someone please offer me a concrete solution on how to convert a string of Unicode code points to actual characters and/or a string of HTML entities?

doulan3966 2014-01-05 07:26
From the PHP manual:

Single and double quoted PHP strings have special meaning of backslash. Thus if \ has to be matched with a regular expression \\, then "\\\\" or '\\\\' must be used in PHP code.

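First of all, your regular expression uses only one backslash (\), so PCRE receives the escape sequence \u, which it does not support; that is the source of the warning. As explained in the PHP manual, you need to write \\\\ in the PHP string to match a literal backslash (with some exceptions).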
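The two layers of escaping can be seen by printing the pattern strings. This is only a minimal sketch; the variable names are made up for illustration and are not part of the original question.

```php
<?php
// In a single-quoted PHP literal, '\\' collapses to one backslash.
$twoSlashes  = '/\\u/';   // PCRE receives /\u/  : an unsupported escape
$fourSlashes = '/\\\\u/'; // PCRE receives /\\u/ : literal backslash, then u

echo $twoSlashes, PHP_EOL;  // prints /\u/
echo $fourSlashes, PHP_EOL; // prints /\\u/

$str = '\u304a';

// The @ operator only hides the compilation warning for this demonstration.
$badResult  = @preg_match($twoSlashes, $str);  // false: the pattern fails to compile
$goodResult = preg_match($fourSlashes, $str);  // 1: the literal "\u" was found
```

Note that '/\u/' written with a single backslash yields the same string as '/\\u/', because \u is not an escape sequence inside single quotes.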

Second, you are missing the capturing group in your original expression. preg_replace() searches the given string for matches to the supplied pattern and returns the string with each whole match replaced by the replacement string. Inside the replacement, $1 refers to the text matched by the first capturing group, so without a group there is nothing for $1 to refer to.

The updated regular expression with proper escaping and correct capturing groups would look like:

$str2 = preg_replace('/\\\\u([0-9a-f]+)/i', '&#x$1;', $str);

Output (the resulting string is &#x304a;&#x306f;&#x3088;&#x3046;, which a browser renders as):

おはよう
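Putting the pieces together, a minimal runnable sketch might look like the following. The call to html_entity_decode() is an addition that is not part of the original answer; it is one possible way to get the actual UTF-8 characters rather than entities.

```php
<?php
$str = '\u304a\u306f\u3088\u3046';

// Step 1: turn each \uXXXX escape into a hexadecimal HTML entity.
$entities = preg_replace('/\\\\u([0-9a-f]+)/i', '&#x$1;', $str);
echo $entities, PHP_EOL; // &#x304a;&#x306f;&#x3088;&#x3046;

// Step 2 (optional): decode the entities into real UTF-8 characters.
$chars = html_entity_decode($entities, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $chars, PHP_EOL;
```

Step 1 alone is enough when the string is destined for an HTML page; step 2 is useful when the characters themselves are needed, for example before storing the text.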

Expression: \\\\u([0-9a-f]+)

\\\\ - matches a literal backslash

u - matches the literal u character

( - beginning of the capturing group

[0-9a-f]+ - character class with a quantifier -- matches a digit (0 - 9) or a letter (from a - f), one or more times

) - end of capturing group

i modifier - used for case-insensitive matching

Replacement: &#x$1;

& - literal ampersand character (&)

# - literal pound character (#)

x - literal character x

$1 - contents of the first capturing group -- in this case, the strings of the form 304a etc.

; - literal semicolon that terminates the HTML entity
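One caveat about the + quantifier: it also swallows any hexadecimal-looking characters that directly follow an escape. The sketch below assumes the input uses JSON-style escapes, which always have exactly four hex digits, and compares the two quantifiers. The {4} variant and the sample string are illustrative additions, not part of the original answer.

```php
<?php
// "abc" is ordinary text here, but a, b and c are also valid hex digits.
$str = '\u3042abc';

// Greedy +: the group captures "3042abc", producing a bogus entity.
$greedy = preg_replace('/\\\\u([0-9a-f]+)/i', '&#x$1;', $str);
echo $greedy, PHP_EOL; // &#x3042abc;

// Exactly four digits: only "3042" is captured, and "abc" is left alone.
$fixed = preg_replace('/\\\\u([0-9a-f]{4})/i', '&#x$1;', $str);
echo $fixed, PHP_EOL;  // &#x3042;abc

// Decoding the fixed result yields the character U+3042 followed by "abc".
$decoded = html_entity_decode($fixed, ENT_QUOTES | ENT_HTML5, 'UTF-8');
```

If the escapes in the input can be longer or shorter than four digits, this assumption does not hold and the + version may be the better fit.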

RegExr Demo.