dongpo7467 2010-04-30 01:05
浏览 31
已采纳

当htmlentities设置了“UTF-8”时,图像无法上传

I have a form that, among other things, accepts an image for upload and sticks it in the database. Previously I had a function filtering the POSTed data that was basically:

function processInput($stuff) {
    $formdata = $stuff;
    $formdata = htmlentities($formdata, ENT_QUOTES);
    return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
}

When, in an effort to fix some weird entities that weren't getting converted properly I changed the function to (all that has changed is I added that 'UTF-8' bit in htmlentities):

function processInput($stuff) {
        $formdata = $stuff;
        $formdata = htmlentities($formdata, ENT_QUOTES, 'UTF-8'); //added UTF-8
        return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
    }

And now images will not upload.

What would be causing this? Simply removing the 'UTF-8' bit allows images to upload properly but then some of the MS Word entities that users put into the system show up as gibberish. What is going on?

**EDIT: Since I cannot do much to change the code on this beast I was able to slap a bandaid on by using htmlspecialchars() rather than htmlentities() and that seems to at least leave the image data untouched while converting things like quotes, angle brackets, etc. bobince's advice is excellent but in this case I cannot now spend the time needed to fix the messy legacy code in this project. Most stuff I deal with is object oriented and framework based but now I see first hand what people mean when they talk about "spaghetti code" in PHP.

  • 写回答

1条回答 默认 最新

  • doujiao3998 2010-04-30 01:47
    关注
    function processInput($stuff) {
        $formdata = $stuff;
        $formdata = htmlentities($formdata, ENT_QUOTES);
        return "'" . mysql_real_escape_string(stripslashes($formdata)) . "'";
    }
    

    This function represents a basic misunderstanding of string processing, one common to PHP programmers.

    SQL-escaping, HTML-escaping and input validation are three separate functions, to be used at different stages of your script. It makes no sense to try to do them all in one go; it will only result in characters that are ‘special’ to any one of the processes getting mangled when used in the other parts of the script. You can try to tinker with this function to try to fix mangling in one part of the app, but you'll break something else.

    Why are images being mangled? Well, it's not immediately clear via what path image data is going from a $_FILES temporary upload file to the database. If this function is involved at any point though, it's going to completely ruin the binary content of an image file. Backslashes removed and HTML-escaped... no image could survive that.

    1. mysql_real_escape_string is for escaping some text for inclusion in a MySQL string literal. It should be used always-and-only when making an SQL string literal with inserted text, and not globally applied to input. Because some things that come in in the input aren't going immediately or solely to the database. For example, if you echo one of the input values to the HTML page, you'll find you get a bunch of unwanted backslashes in it when it contains characters like '. This is how you end up with pages full of runaway backslashes.

      (Even then, parameterised queries are generally preferable to manual string hacking and mysql_real_escape_string. They hide the details of string escaping from you so you don't get confused by them.)

    2. htmlentities is for escaping text for inclusion in an HTML page. It should be used always-and-only in the output templating bit of your PHP. It is inappropriate to run it globally over all your input because not everything is going to end up in an HTML page or solely in an HTML page, and most probably it's going to go to the database first where you absolutely don't want a load of < and & rubbish making your text fail to search or substring reliably.

      (Even then, htmlspecialchars is generally preferable to htmlentities as it only encodes the characters that really need it. htmlentities will add needless escaping, and unless you tell it the right encoding it'll also totally mess up all your non-ASCII characters. htmlentities should almost never be used.)

    3. As for stripslashes... well, you sometimes need to apply that to input, but only when the idiotic magic_quotes_gpc option is turned on. You certainly shouldn't apply it all the time, only when you detect magic_quotes_gpc is on. It is long deprecated and thankfully dying out, so it's probably just as good to bomb out with an error message if you detect it being turned on. Then you could chuck the whole processInput thing away.

    To summarise:

    • At start time, do no global input processing. You can do application-specific validation here if you want, like checking a phone number is just numbers, or removing control characters from text or something, but there should be no escaping happening here.

    • When making an SQL query with a string literal in it, use SQL-escaping on the value as it goes into the string: $query= "SELECT * FROM t WHERE name='".mysql_real_escape_string($name)."'";. You can define a function with a shorter name to do the escaping to save some typing. Or, more readably, parameterisation.

    • When making HTML output with strings from the input or the database or elsewhere, use HTML-escaping, eg.: <p>Hello, <?php echo htmlspecialchars($name); ?>!</p>. Again, you can define a function with a short name to do echo htmlspecialchars to save on typing.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥50 yalmip+Gurobi
  • ¥20 win10修改放大文本以及缩放与布局后蓝屏无法正常进入桌面
  • ¥15 angular开发过程中,想要读取模型文件,即图1的335行,会报404错误(如图2)。但我的springboot里配置了静态资源文件,如图3。且在该地址下我有模型文件如图4,请问该问题该如何解决呢?
  • ¥15 itunes恢复数据最后一步发生错误
  • ¥15 关于#windows#的问题:2024年5月15日的win11更新后资源管理器没有地址栏了顶部的地址栏和文件搜索都消失了
  • ¥100 H5网页如何调用微信扫一扫功能?
  • ¥15 讲解电路图,付费求解
  • ¥15 有偿请教计算电磁学的问题涉及到空间中时域UTD和FDTD算法结合的
  • ¥15 vite打包后,页面出现h.createElement is not a function,但本地运行正常
  • ¥15 Java,消息推送配置