iso-8895-1至xml可接受的UTF-8

I am parsing text/html from web pages into an xml feed, the text/html is encoded iso-8895-1 while the XML feed must be UTF-8. I have used html entities, but am having to manually replace loads of characters, here is what I have so far (still not parsing all text)

$desc = str_replace(array("
", "", "
"),"",$desc);
    $desc = str_replace(array("’","‘","”","“"),"'",$desc);
  $desc = str_replace("£","&pound;",$desc);
    $desc = str_replace("é","&eacute;",$desc);
    $desc = str_replace("²","2",$desc);
    $desc = str_replace(array("-","•"),"&dash;",$desc);
$desc = htmlentities($desc, ENT_QUOTES, "UTF-8");

写回答
好问题 0 提建议
关注问题
分享
邀请回答
编辑收藏删除结题
收藏举报

2条回答默认最新

关注

码龄粉丝数原力等级 --

被采纳

被点赞

采纳率
dpka7974 2010-10-21 10:31
关注
Use iconv(). It will allow you to use native characters in UTF-8 as well - no need for HTML entities.

$data = iconv("ISO-8859-1", "UTF-8", $text);

when doing encoding from UTF-8 to another character set, use IGNORE or TRANSLIT to drop or transliterate non-translatable characters.

alternatively, the mb_* functions shown by @Gumbo will work as well.
本回答被题主选为最佳回答 , 对您是否有帮助呢?

解决无用
评论打赏
分享
举报
编辑

预览
轻敲空格完成输入
显示为

卡片

标题

链接
评论

按下Enter换行，Ctrl+Enter发表内容

查看更多回答(1条)

编辑

预览

报告相同问题？

关注问题

如何使用SoapClient发送ISO-8859-1 XML文件？ php xml
2017-03-21 12:02

回答 1 已采纳 Like pointed by Constantin GALBENU, it cannot be done using SoapClient. I had to use cURL instead,
整个RSS从iso-8859-1进入UTF-8 php
2017-04-06 02:07

回答 1 已采纳 Answer found. I loaded the feed with: $feed = file_get_contents(' .... '); and encoded it with:
来自xml的php utf-8解码返回问号 php xml
2013-04-30 12:11

回答 1 已采纳 Okay the following is now a bit rough/verbose, especially as you already tried so much. Just try t
php xml gbk转utf8,PHP和处理UTF-8XML的外来字符
2021-05-04 18:50

漆园吏的博客根据元标记,正在擦除的文档是utf-8问题是有些数据包含了外来字符,我找不到一种可靠地将它们转换为XML/UTF-8友好实体的方法,以下错误是我通过阅读找到的,理想情况下,我希望有一个始终有效的解决方案。示例1工作正常,...
如何将这个UTF-8转义字符串从亚马逊MWS响应转换为正确的UTF-8？ mongodb php
2014-12-29 11:11

回答 2 已采纳 SimpleXML does not decode the hex entities and understand the result as UTF-8, because that's not
PHP - 如何使用html标记解析xml中的链接？ html php xml
2012-09-18 02:00

回答 2 已采纳 I hope this is helpful. I enjoy using xpath to cut through the XML I get back from SimpleXML: &l
php curl显示不可读的标志 php
2014-12-19 14:15

回答 1 已采纳 It looks like an encoding issue. Try adding CURLOPT_ENCODING. Example: curl_setopt($ch, CURLOPT_
php的curl怎么设置utf-8,PHP：将curl_exec输出转换为UTF8
2021-04-20 04:30

聂述龙的博客通过Gumbo和Pekka的建议,我写了curl_exec_utf8/** The same as curl_exec except tries its best to convert the output to utf8 **/function curl_exec_utf8($ch) {$data = curl_exec($ch);if (!is_string($data)) ...
解析CURL XML响应PHP php xml
2016-07-31 00:29

回答 1 已采纳 You can try this $string = '<result> <contact id="90676"> <Group_Tag name="Sequenc
替换节点XMLDOM和PHP php xml
2016-07-02 03:17

回答 1 已采纳 $node = $xpath->query("/$racine/annonce[@id = '$annonce_no']"); does not give you a DOM node, i
错误{“找不到名称空间名称''的元素'faultstring'。第6行，第126位。“}尝试从C＃连接nusoap Web服务时 c# php
2019-04-30 20:44

回答 1 已采纳 For reason I do not know, there is two definition for function serialize() with same body in two d
php中header设置常见文件类型的content-type
2020-10-23 16:53

字符集编码则确保内容以正确的字符编码显示，常见的编码有UTF-8、ISO-8859-1等。接下来我们逐一列举一些常见的文件类型及其对应的Content-Type值： 1. HTML页面：通常使用的Content-Type是"text/html"，编码通常...
php xml特殊字符处理,php – 使用特殊字符解析XML(UTF-8)
2021-04-23 17:19

weixin_39946364的博客我开始使用一些看起来像这样的XML(简化)：但是在我使用simplexml_load_string解析它后,特殊字符(i)变为：Ã-这显然是非常糟糕的.有没有办法防止这种情况发生？我知道XML很好,当保存为.txt并在浏览器中查看时,字符很...
Goutte-masterWeb抓取器PHP类.zip
2019-07-11 03:01

$header[] ="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"; $header[] ="Accept-Language: en-us,en;q=0.5"; $header[] ="Pragma:";//browsers keep this blank. curl_setopt($this->curl,...
php utf8(无bom),php中utf8 与utf-8 与utf8 无BOM
2021-03-31 00:26

天生双下巴的博客一、在php和html中设置编码，请尽量统一写成“UTF-8”,这才是标准写法，而utf-8只是在window中不区分大小写的写法而已，其次，大部分情况简写成“UTF8”或“utf8”程序也可以识别，但在ie浏览器就不识别它了，所以，...
没有解决我的问题, 去提问

iso-8895-1至xml可接受的UTF-8

2条回答 默认 最新

2条回答默认最新