PHP - html_entity_decode does not decode everything

I am parsing an HTML page. At some point I am getting the text between a div and using html_entity_decode to print that text.

The problem is that the page contains characters such as this star ★ and shapes like ⬛︎, ◄, ◉, etc. I have checked, and these characters are not encoded as entities on the source page; they appear literally, just as you see them here.

The page is using charset="UTF-8"

So, when I use

html_entity_decode($string, ENT_QUOTES, 'UTF-8');

The star, for example, is "decoded" to â˜

$string is being obtained by using

document.getElementById("id-of-div").innerText

I would like to decode them correctly. How do I do that in PHP?

NOTE: I have tried htmlspecialchars_decode($string, ENT_QUOTES); and it produces the same result.

1 answer

dongqian0763 2014-01-05 21:55

I've tried to reproduce your issue with this simple bit of PHP:

<?php
  // Make sure our client knows we're sending UTF-8
  header('Content-Type: text/plain; charset=utf-8');
  $string = "The page contains characters like this star ★ or others like shapes like ⬛︎, ◄, ◉, etc. Here are some entities: This is a &quot;test&quot;.";
  echo 'String: ' . $string . "\n";
  echo 'Decoded: ' . html_entity_decode($string, ENT_QUOTES, 'UTF-8');

As expected, the output is:

String: The page contains characters like this star ★ or others like shapes like ⬛︎, ◄, ◉, etc. Here are some entities: This is a &quot;test&quot;.
Decoded: The page contains characters like this star ★ or others like shapes like ⬛︎, ◄, ◉, etc. Here are some entities: This is a "test".

If I change the charset in the header to iso-8859-1, I see this:

String: The page contains characters like this star â˜… or others like shapes like â¬›ï¸Ž, â—„, â—‰, etc. Here are some entities: This is a &quot;test&quot;.
Decoded: The page contains characters like this star â˜… or others like shapes like â¬›ï¸Ž, â—„, â—‰, etc. Here are some entities: This is a "test".

So, I'd say that your issue is a display issue. The "interesting" characters are left completely untouched by html_entity_decode, as you'd expect, because they are literal UTF-8 characters rather than entities. It's just that whatever code you've got, or whatever you're using to look at your output, is incorrectly using iso-8859-1 to display them.
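
To check this yourself, here is a minimal sketch. It assumes the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escapes); the sample string and variable names are made up for illustration. It confirms that the star's UTF-8 bytes survive html_entity_decode unchanged, then simulates a viewer that reads those same bytes as Windows-1252, which is how browsers treat a page labelled iso-8859-1.

```php
<?php
// Hypothetical input: a literal UTF-8 star (bytes E2 98 85) plus one real entity.
$string  = "star \xE2\x98\x85 and a &quot;quoted&quot; word";
$decoded = html_entity_decode($string, ENT_QUOTES, 'UTF-8');

// Only the &quot; entities change; the star's three bytes pass through as-is.
var_dump($decoded === "star \xE2\x98\x85 and a \"quoted\" word");

// The decoded string is still well-formed UTF-8.
var_dump(mb_check_encoding($decoded, 'UTF-8'));

// Simulate a viewer that wrongly treats each of the star's bytes as one
// Windows-1252 character: E2 -> U+00E2, 98 -> U+02DC, 85 -> U+2026.
$misread = mb_convert_encoding("\xE2\x98\x85", 'UTF-8', 'Windows-1252');
var_dump($misread === "\u{00E2}\u{02DC}\u{2026}");
```

If all three checks print bool(true), the data is intact and the fix belongs on the display side: make sure the response declares UTF-8, for example with the header() call shown in the script above.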

This answer was accepted by the asker as the best answer.
