dongqing6755 2012-12-09 06:14
Views: 78
Status: answer accepted

JPEG IPTC data in PHP does not display UTF-8 characters correctly

When I read the IPTC data from an image via PHP, UTF-8 accented characters do not display properly.

For example: é, ø and ü

With the Content-Type header's charset set to UTF-8, I get a question mark in a black diamond (�) instead of the character. If no Content-Type is set, I get a dash character: —

The following is the code being used to read the IPTC block:

$file = '/path/to/image.jpg';
getimagesize($file, $info);
$iptc = iptcparse($info['APP13']);

I have also tried uploading the exact same image to a WordPress installation on the same server, and it properly strips the accented character and replaces it with its basic Latin equivalent. I don't mind if this is the end result; I would just like to read the characters properly.

Any ideas on how to get the complete and correct data from the image?

2 answers

  • doubao6936 2013-02-15 15:56
    Answering a bit late, but since I had the same problem displaying special characters such as č, š and ž (which appear in the Slovenian alphabet), I may as well answer for future reference.

    The solution is actually not related to PHP but to the encoding of the IPTC data itself. By default, most software that can write IPTC data stores it in plain ASCII. At first I used Adobe Bridge, which displays all special characters correctly while you are tagging your images, but when you parse that data in PHP the special characters do not come through. (I would have to check this part again, but the main catch is that two different encodings are involved: one used to store the IPTC data in the image, and one used by a program to display that data, or something along these lines.)
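
To see which case an image falls into, you can look at what the file itself declares. This is only a minimal sketch: `iptcDeclaredCharset` is a hypothetical helper name, and it assumes the array comes from `iptcparse()`, which exposes the IPTC CodedCharacterSet record (1:90) under the key `1#090`. The sample arrays are stand-ins so the snippet runs without an image.

```php
<?php
// Hypothetical helper: report the character set declared in IPTC record 1:90
// (CodedCharacterSet), which iptcparse() exposes under the key "1#090".
function iptcDeclaredCharset(array $iptc): ?string
{
    if (!isset($iptc['1#090'][0])) {
        return null; // nothing declared, so the encoding is unknown
    }
    // ESC % G is the ISO 2022 escape sequence that announces UTF-8.
    return $iptc['1#090'][0] === "\x1b%G" ? 'UTF-8' : 'other';
}

// Stand-in arrays shaped like iptcparse() output (not read from a real file).
$tagged   = ['1#090' => ["\x1b%G"], '2#120' => ["Caf\xc3\xa9"]];
$untagged = ['2#120' => ["Caf\xe9"]];

var_dump(iptcDeclaredCharset($tagged));
var_dump(iptcDeclaredCharset($untagged));
```

If the helper returns `null` or `'other'`, the bytes in the remaining tags are probably not UTF-8, which would explain the replacement characters in the browser.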

    To solve the problem I used a program called ExifTool, which is an amazing piece of software that lets you manage almost any data in your images.

    Then I used it to convert all IPTC encodings to UTF-8, and from then on I just had to retag the images that had corrupt characters (which Adobe Bridge displays correctly but obviously does not save in the correct encoding).

    The command to accomplish this on all images in a folder is shown here, where /path/to/images is a placeholder for your own folder:

    exiftool -tagsfromfile @ -iptc:all -codedcharacterset=utf8 /path/to/images
    

    You may also want to download ExifTool GUI if you are not comfortable working from the command line.

    I haven't found any better program that could accomplish this same task faster.
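
If rewriting the files is not an option, a PHP-side fallback is also possible. This is a rough sketch rather than a definitive fix: `iptcToUtf8` is a hypothetical helper, it needs the mbstring extension, and the `ISO-8859-1` default is only a guess about what the tagging software wrote (for Slovenian text a different source encoding would be needed).

```php
<?php
// Hypothetical helper: return $value as UTF-8.
// $assumedSource is a guess about the original encoding; adjust it to match
// the software that wrote the IPTC data.
function iptcToUtf8(string $value, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8 (plain ASCII also passes)
    }
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}

// Stand-in for iptcparse() output: "Caf" followed by a Latin-1 encoded é.
$iptc = ['2#120' => ["Caf\xe9"]];

// Convert every value of every tag, leaving valid UTF-8 untouched.
foreach ($iptc as $tag => $values) {
    $iptc[$tag] = array_map('iptcToUtf8', $values);
}

echo $iptc['2#120'][0], PHP_EOL;
```

The validity check comes first so that files already converted to UTF-8 are not encoded twice.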

    This answer was selected by the asker as the best answer.