PHP uploaded file names: Japanese character encoding

When uploading a file with a Japanese name, some characters cause problems. On a Windows system, I want to save the file under its name as uploaded, so I have to use mb_convert_encoding($name, "SJIS", "AUTO");, which works fine in most cases.

However, some characters, like ① in 0423図表①, disappear completely in the result. It seems that the file name is already "wrong" when it is uploaded: it looks like "0423å³è¡¨â .pptx" in UTF-8, and if I change the header charset with

header('Content-Type: text/html; charset=SJIS');

it looks like

 "0423ﾃ･ﾂ崢ｳﾃｨﾂ｡ﾂｨﾃ｢ﾂ堕.pptx"

I am not sure what I can do in this case. I tried to replace the ① character but I cannot even find it with strpos() before or after the encoding conversion.

duan1930 2016-05-13 05:43
To qualify my answer (to the downvoter):

Q: I have heard that UTF-8 does not support some Japanese characters. Is this correct?

A: There is a lot of misinformation floating around about the support of Chinese, Japanese and Korean (CJK) characters. The Unicode Standard supports all of the CJK characters from JIS X 0208, JIS X 0212, JIS X 0221, or JIS X 0213, for example, and many more. This is true no matter which encoding form of Unicode is used: UTF-8, UTF-16, or UTF-32.

Unicode supports over 80,000 CJK characters right now, and work is underway to encode further additions. The International Standard ISO/IEC 10646 and the Unicode Standard are completely synchronized in repertoire and content. And that means that Unicode has the same repertoire as GB 18030, since that also is synchronized with ISO 10646 — although with a different ordering and byte format.

From: The Unicode Consortium.

My Answer:

Rather than strpos, use mb_stripos from the PHP multibyte string functions to find and replace characters. This should help your script detect and translate the non-Latin characters.
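As a minimal sketch of the multibyte search suggested here, the snippet below looks for ① in the sample name from the question. It assumes the name arrives as valid UTF-8 and that the script itself is saved as UTF-8; it uses mb_strpos (the case-sensitive sibling of mb_stripos), and the "(1)" fallback is purely illustrative.

```php
<?php
// Sample file name from the question, assumed to be valid UTF-8.
$name = "0423図表①.pptx";

// Multibyte-aware search: the result is a character offset.
$charPos = mb_strpos($name, "①", 0, 'UTF-8');

// Byte-oriented search: the result is a byte offset into the same string.
$bytePos = strpos($name, "①");

// Hypothetical fallback: swap the circled digit for plain ASCII.
$safe = str_replace("①", "(1)", $name);

echo $charPos, " ", $bytePos, " ", $safe, PHP_EOL;
```

If neither search finds the character, the name is probably not in the encoding the script assumes, which is worth checking before any replacement is attempted.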

If the uploaded file name ($_FILES['var']['name']) is already incorrect in the PHP script (from output such as print_r($_FILES)) then you need to ensure you are correctly encoding the HTML form with accept-charset='UTF-8' (or SJIS, etc.). I would hope you're already well ahead of me on this.
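One way to check whether the name is already damaged when it reaches the script is to inspect its bytes. The sketch below is an assumption-laden illustration, not a confirmed diagnosis of the question: is_utf8_name is a hypothetical helper, and the "double encoding" it simulates (UTF-8 bytes misread as ISO-8859-1) is only one possible explanation for output such as "0423å³è¡¨â .pptx".

```php
<?php
// Hypothetical helper: true when the raw bytes are well-formed UTF-8.
function is_utf8_name(string $name): bool
{
    return mb_check_encoding($name, 'UTF-8');
}

// Sample name from the question; this source file is assumed to be saved as UTF-8.
$good = "0423図表①.pptx";

// The two kanji as Shift_JIS bytes, which do not form valid UTF-8.
$sjis = mb_convert_encoding("図表", 'SJIS', 'UTF-8');

// Simulate double encoding: UTF-8 bytes misread as ISO-8859-1, then re-encoded as UTF-8.
$mojibake = mb_convert_encoding($good, 'UTF-8', 'ISO-8859-1');

// Undo it by mapping each code point back to a single byte.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');

// Dump the raw bytes so the encodings can be compared by eye.
echo bin2hex($good), PHP_EOL, bin2hex($mojibake), PHP_EOL;
var_dump(is_utf8_name($sjis), $repaired === $good);
```

Note that a double-encoded name is still well-formed UTF-8, so mb_check_encoding alone cannot detect it; comparing the hex dump against the expected bytes is the more reliable check.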

It may also be advisable to set a few defaults at the top of your PHP page, again using the PHP mb_ functions:

mb_internal_encoding('UTF-8'); // or whatever character set works for you
mb_http_output('SJIS');
mb_regex_encoding('UTF-8');
// mb_http_input() only reports the detected HTTP input encoding and cannot set it, so it is omitted here.
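With the internal encoding set, the conversion from the question can be written with an explicit source encoding instead of "AUTO". The sketch below targets 'SJIS-win', mbstring's name for the Windows variant of Shift_JIS, on the assumption that the file name arrives as UTF-8. The thread does not confirm this as the fix, but ① is a vendor extension character that the Windows variant covers, so it is a plausible thing to try.

```php
<?php
mb_internal_encoding('UTF-8');

// Sample name from the question, assumed to arrive as valid UTF-8.
$name = "0423図表①.pptx";

// Convert for a Windows file system using the Windows variant of Shift_JIS.
$forWindows = mb_convert_encoding($name, 'SJIS-win', 'UTF-8');

// Convert back to confirm that no character was lost on the way.
$roundTrip = mb_convert_encoding($forWindows, 'UTF-8', 'SJIS-win');

echo bin2hex($forWindows), PHP_EOL;
var_dump($roundTrip === $name);
```

A round trip that returns the original string is a cheap way to confirm that the target encoding can represent every character in the name before the file is saved.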

Out of interest:

http://www.unicode.org/reports/tr37/

and

http://david.latapie.name/blog/shift-jis-utf-8/