dongqing6755 2012-12-09 06:14

JPEG IPTC data in PHP does not display UTF-8 characters correctly

When I read the IPTC data from an image via PHP, UTF-8 accented characters are not displayed properly.

For example: é, ø and ü

With the Content-Type header's charset set to UTF-8, I get a question mark in a black diamond (�) instead of the character. If no charset is set, I get a dash character: —

The following is the code being used to read the IPTC block:

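$file = '/path/to/image.jpg';
// getimagesize() stores the JPEG APPn segments in $info; IPTC data is in APP13.
getimagesize($file, $info);
$iptc = isset($info['APP13']) ? iptcparse($info['APP13']) : false;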

I have also tried uploading the exact same image to a WordPress installation on the same server, and it properly strips each accented character and replaces it with its basic Latin equivalent. I don't mind if this is the end result; I would just like to read the characters properly.

Any ideas on how to get the complete and correct data from the image?

2 answers

  • doubao6936 2013-02-15 15:56

    I'm answering a bit late, but since I had the same problem displaying special characters such as č, š and ž (which appear in the Slovenian alphabet), I may as well answer for future reference.

    The solution to this problem is not actually related to PHP, but to the encoding of the IPTC data. By default, most software that can write IPTC data stores it in plain ASCII. At first I used Adobe Bridge, which displays all special characters correctly while you tag your images, but once you parse that data in PHP the special characters do not come through. (I would have to check this part again, but the main catch is that two different encodings are involved: one used to store the IPTC data in the image, and one used by a program to display that data, or something along these lines.)
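
    If rewriting the files is not an option, the mismatch can also be handled on the PHP side. The following is only a minimal sketch, not what this answer did: `iptc_to_utf8` is a hypothetical helper, and the `Windows-1252` fallback is an assumption about how the tagging software stored the text (your files may use a different legacy encoding). It relies on the mbstring extension and on the IPTC CodedCharacterSet dataset (1:90), which `iptcparse()` exposes under the key `1#090` and which holds ESC % G when the file declares UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): return a copy of the iptcparse()
// result with application-record (2#xxx) strings converted to UTF-8.
// $fallback is an ASSUMPTION about how the tagging software stored the text.
function iptc_to_utf8(array $iptc, string $fallback = 'Windows-1252'): array
{
    // Dataset 1:90 (CodedCharacterSet) holds ESC % G when the file declares UTF-8.
    $declaredUtf8 = ($iptc['1#090'][0] ?? '') === "\x1B%G";

    foreach ($iptc as $key => $values) {
        if (strncmp((string) $key, '2#', 2) !== 0) {
            continue; // leave envelope-record entries such as 1#090 untouched
        }
        foreach ($values as $i => $value) {
            // Keep text that is declared as UTF-8 or already valid UTF-8.
            if ($declaredUtf8 || mb_check_encoding($value, 'UTF-8')) {
                continue;
            }
            $iptc[$key][$i] = mb_convert_encoding($value, 'UTF-8', $fallback);
        }
    }
    return $iptc;
}

// Example with hand-made data shaped like iptcparse() output.
// "\xE9" is "é" in Windows-1252 / ISO-8859-1 and is not valid UTF-8 on its own.
$sample = ['2#120' => ["Caf\xE9"], '2#080' => ["Ren\xC3\xA9"]];
$fixed  = iptc_to_utf8($sample);
echo bin2hex($fixed['2#120'][0]), "\n"; // 436166c3a9 ("Café" as UTF-8 bytes)
```

    With a real image you would pass in the result of `iptcparse($info['APP13'])`. If the output still looks wrong, the fallback encoding is probably not the one your software used.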

    To solve the problem I used a program called ExifTool, an amazing piece of software that lets you manage almost any metadata in your images.

    Then I used it to convert all IPTC encodings to UTF-8. From then on I only had to retag the images that had corrupt characters (which Adobe Bridge displays correctly but evidently does not save in the correct encoding).

    The command to accomplish this on all images in a folder is shown below. The folder goes last; /path/to/images is a placeholder:

    exiftool -tagsfromfile @ -iptc:all -codedcharacterset=utf8 /path/to/images
    

    You may also want to download ExifTool GUI if you are not comfortable working from the command line.

    I haven't found another program that accomplishes this task faster.
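
    To get a rough idea of which images in a folder carry IPTC text that is not valid UTF-8 (for example before running the conversion), something like the sketch below might help. Both function names are hypothetical, `/path/to/images` is a placeholder, and the check assumes the mbstring extension is available. It cannot spot characters that were already replaced by a literal "?" when the tags were saved, and binary datasets may produce false positives.

```php
<?php
// Hypothetical helper: true if any IPTC application-record (2#xxx) value
// is not valid UTF-8.
function iptc_needs_retagging(array $iptc): bool
{
    foreach ($iptc as $key => $values) {
        if (strncmp((string) $key, '2#', 2) !== 0) {
            continue; // skip envelope-record entries such as 1#090
        }
        foreach ($values as $value) {
            if (!mb_check_encoding($value, 'UTF-8')) {
                return true;
            }
        }
    }
    return false;
}

// Hypothetical helper: list the JPEG files directly inside $dir whose
// IPTC text is not valid UTF-8.
function find_images_to_retag(string $dir): array
{
    $bad = [];
    if (!is_dir($dir)) {
        return $bad;
    }
    foreach (scandir($dir) ?: [] as $name) {
        $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));
        if ($ext !== 'jpg' && $ext !== 'jpeg') {
            continue;
        }
        $file = rtrim($dir, '/') . '/' . $name;
        $info = [];
        if (@getimagesize($file, $info) === false || !isset($info['APP13'])) {
            continue; // unreadable image or no IPTC block
        }
        $iptc = iptcparse($info['APP13']);
        if ($iptc !== false && iptc_needs_retagging($iptc)) {
            $bad[] = $file;
        }
    }
    return $bad;
}

// Placeholder path; prints an empty array if the folder does not exist.
print_r(find_images_to_retag('/path/to/images'));
```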

    This answer was selected as the best answer by the asker.