dsfsda121545 2012-06-22 23:30

PHP security function meant to filter out malicious code is stripping out legitimate characters

I have a security function which is part of a script. It is supposed to keep malicious code submitted through an input form from being executed. It works without a problem with the normal characters A-Z, but it rejects inputs containing characters such as á, ñ, ö, etc.

What can I do so that form inputs with these characters are not rejected? Here is the function:

function add_special_chars($string, $no_quotes = FALSE)
{
  $patterns = array(
      "/(?i)javascript:.+>/",
      "/(?i)vbscript:.+>/",
      "/(?i)<img.+onload.+>/",
      "/(?i)<body.+onload.+>/",
      "/(?i)<layer.+src.+>/", 
      "/(?i)<meta.+>/", 
      "/(?i)<style.+import.+>/",
      "/(?i)<style.+url.+>/"
  );


    $string = str_ireplace("&amp;","&",$string);

    if (!$no_quotes) $string = str_ireplace("&#039;","'",$string);

    $string = str_ireplace('&quot;','"',$string);
    $string = str_ireplace('&lt;','<',$string);
    $string = str_ireplace('&gt;','>',$string);
    $string = str_ireplace('&nbsp;',' ',$string);

  foreach ($patterns as $pattern)
  {
     if(preg_match($pattern, $string))
     {
        $string = strip_tags($string);
     }
  }      



  $string = preg_replace('#(&\#*\w+)[\x00-\x20]+;#u', "$1;", $string);
  $string = preg_replace('#(&\#x*)([0-9A-F]+);*#iu', "$1$2;", $string);

  $string = html_entity_decode($string, ENT_COMPAT, LANG_CODEPAGE);

  $string = preg_replace('#(<[^>]+[\x00-\x20\"\'\/])(on|xmlns)[^>]*>#iUu', "$1>", $string);

  $string = preg_replace('#([a-z]*)[\x00-\x20\/]*=[\x00-\x20\/]*([\`\'\"]*)[\x00-\x20\/]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iUu', '$1=$2nojavascript...', $string);
  $string = preg_replace('#([a-z]*)[\x00-\x20\/]*=[\x00-\x20\/]*([\`\'\"]*)[\x00-\x20\/]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iUu', '$1=$2novbscript...', $string);
  $string = preg_replace('#([a-z]*)[\x00-\x20\/]*=[\x00-\x20\/]*([\`\'\"]*)[\x00-\x20\/]*-moz-binding[\x00-\x20]*:#Uu', '$1=$2nomozbinding...', $string);
  $string = preg_replace('#([a-z]*)[\x00-\x20\/]*=[\x00-\x20\/]*([\`\'\"]*)[\x00-\x20\/]*data[\x00-\x20]*:#Uu', '$1=$2nodata...', $string);

  $string = preg_replace('#(<[^>]+[\x00-\x20\"\'\/])style[^>]*>#iUu', "$1>", $string);

  $string = preg_replace('#</*\w+:\w[^>]*>#i', "", $string);

  do
  {
     $original_string = $string;
     $string = preg_replace('#</*(applet|meta|xml|blink|link|embed|object|iframe|frame|frameset|ilayer|layer|bgsound|title|base)[^>]*>#i', "", $string);
  }
  while ($original_string != $string);   

    return $string;
}

UPDATE: I found that the following line seems to be causing the problem, but I'm not sure why:

 $string = preg_replace('#(<[^>]+[\x00-\x20\"\'\/])style[^>]*>#iUu', "$1>", $string);
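
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the `u` modifier on that pattern makes PCRE treat the subject as UTF-8, and `preg_replace()` returns NULL when the subject is not valid UTF-8 (for example, ISO-8859-1 bytes for á or ñ). The sketch below is one way to check this. The helper name `check_utf8_regex` is hypothetical and not part of the original script.

```php
<?php
// Hypothetical diagnostic helper (not part of the original script).
// Runs the suspect pattern and reports whether PCRE rejected the subject
// because it was not valid UTF-8.
function check_utf8_regex($string)
{
    $result = preg_replace('#(<[^>]+[\x00-\x20\"\'\/])style[^>]*>#iUu', "$1>", $string);

    if ($result === null && preg_last_error() === PREG_BAD_UTF8_ERROR) {
        return 'invalid UTF-8';
    }
    return 'ok';
}

$latin1 = "Jos\xE9";      // "José" as ISO-8859-1 bytes: not valid UTF-8
$utf8   = "Jos\xC3\xA9";  // "José" as UTF-8 bytes

echo check_utf8_regex($latin1), "\n";
echo check_utf8_regex($utf8), "\n";
```

If the first call reports invalid UTF-8 for real form input, the encoding of the incoming data is worth checking, as is the value of `LANG_CODEPAGE` passed to `html_entity_decode()`. It may also be that an earlier pattern with the `u` modifier is the first to fail.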

1 answer

  • dongma1666 2012-06-23 00:13

    This is a bad idea. The worst part of your function is the html_entity_decode() call halfway through, which entirely undermines the first half of the function. An attacker can simply encode the quote marks and angle brackets, and the function will build the payload for them. strip_tags() is a joke and is not a good way to protect against XSS. The main problem with this function is that it is far too simple. HTML Purifier is a far larger and more thorough library that does a much better job, but it isn't perfect.
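
    The point about decoding after filtering can be illustrated in isolation. This is a minimal sketch, not a step-by-step reproduction of the function in the question. The input string is an assumed example, and 'UTF-8' is assumed for the charset.

    ```php
    <?php
    // Entity-encoded markup contains no literal "<" or ">", so tag-based
    // filtering has nothing to remove.
    $input    = '&#60;script&#62;alert(1)&#60;/script&#62;';
    $filtered = strip_tags($input);

    // Decoding afterwards turns the harmless-looking text into live markup.
    $decoded = html_entity_decode($filtered, ENT_COMPAT, 'UTF-8');

    var_dump($filtered === $input);                      // filtering changed nothing
    var_dump($decoded === '<script>alert(1)</script>');  // payload rebuilt by decoding
    ```

    Any check that runs before the decode step only ever sees the encoded form, so it cannot judge what the decoded string will contain.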

    You are hardly addressing the most common forms of XSS. XSS is an output problem: you can't expect to pass all input through some magical function and assume it's safe. Whether a value is dangerous depends on how it is used in the output.
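
    Treating XSS as an output problem usually means escaping each value at the point where it is written into the page. Here is a minimal sketch for the HTML text and attribute context only, assuming UTF-8 output. The helper name `escape_html` is hypothetical.

    ```php
    <?php
    // Hypothetical helper: escapes a value for use in HTML text or inside a
    // quoted attribute. Other contexts (URLs, JavaScript, CSS) need their
    // own handling and are not covered here.
    function escape_html($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }

    // Accented characters are left alone; only HTML metacharacters are escaped.
    $name = "Jos\xC3\xA9 <img src=x onerror=alert(1) />";

    echo '<p>Hello, ', escape_html($name), '</p>', "\n";
    ```

    With this approach the stored input stays untouched, and characters such as á or ñ pass through unchanged as long as the string is valid UTF-8.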

    Without actually running your code, I think something like this would bypass it:

    <a href='jav&#x41%3b&#x53%3bcript&#x3a%3balert(1)'>so very broken</a>
    

    or maybe even something simpler:

    <img src=x onerror=alert(1) />
    

    Like I said, this is a gross oversimplification of an extremely complex problem.

    Accepted by the asker as the best answer.
