Is preg_match safe enough for input sanitization?

I am building a new web app in a LAMP environment. I am wondering whether preg_match can be trusted for validating user input (plus prepared statements, of course) for all the text-based fields (i.e. non-HTML fields: phone, name, surname, etc.).

For example, for a classic 'email field', if I check the input like:

$email_pattern = "/^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)" .
    "|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}" .
    "|[0-9]{1,3})(\]?)$/";

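// Fall back to an empty string if the field is missing from the request.
$email = $_POST['email'] ?? '';
// preg_match() needs a string; a crafted request could send an array instead.
if (is_string($email) && preg_match($email_pattern, $email)) {
    // go on, prepare stmt, execute, etc...
} else {
    // email not valid! do nothing except warn the user
}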

Can I sleep easy against SQL/XSS injection?

I write the regexps to be as restrictive as they can be.

EDIT: as already said, I do use prepared statements, and this behavior is just for text-based fields (like phone, emails, name, surname, etc.), so nothing that is allowed to contain HTML (for HTML fields, I use HTML Purifier).

Actually, my mission is to let the input value pass only if it matches my regexp whitelist; else, return it back to the user.

P.S.: I am looking for something without mysql_real_escape_string; the project will probably switch to PostgreSQL in the near future, so I need a validation method that is cross-database ;)

dongzhentiao2326 2010-04-12 14:45
Whether or not a regular expression suffices for filtering depends on the regular expression. If you're going to use the value in SQL statements, the regular expression must in some way disallow ' and ". If you want to use the value in HTML output and are afraid of XSS, you'll have to make sure your regex doesn't allow <, > and ".
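As a small illustration of how much depends on the exact pattern: in PCRE, `$` also matches just before a trailing newline unless the `D` modifier is set or `\z` is used instead. The sketch below uses two hypothetical helpers (the names are assumptions, not anything from the question) to show the difference.

```php
<?php
// Hypothetical helpers for illustration: accept only ASCII letters.

// Loose variant: "$" also matches just before a final "\n".
function is_plain_word_loose(string $input): bool
{
    return preg_match('/^[a-zA-Z]+$/', $input) === 1;
}

// Strict variant: "\z" matches only at the very end of the subject.
function is_plain_word_strict(string $input): bool
{
    return preg_match('/^[a-zA-Z]+\z/', $input) === 1;
}
```

With the input "abc\n", the loose variant accepts the value and the strict one rejects it, so a whitelist that looks airtight can still let an unexpected character through.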

Still, as has been repeatedly said, you do not want to rely on regular expressions, and please by the love of $deity, don't! Use mysql_real_escape_string() or prepared statements for your SQL statements, and htmlspecialchars() for your values when printed in HTML context.

Pick the sanitising function according to its context. As a general rule of thumb, it knows better than you what is and what isn't dangerous.

Edit, to accommodate your edit:

Database

In terms of protection, prepared statements are equivalent to calling mysql_real_escape_string() on every value you put in. (With native prepared statements the values are sent separately from the SQL text rather than escaped into it, but the effect for you is the same.) The differences are a possible performance boost in the prepared statements variant, and that you cannot accidentally forget to escape one of the values. Prepared statements are what secures you against SQL injection, rather than the regex, though. Your regex could be anything and it would make no difference to the prepared statement.
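A minimal sketch of that point, assuming PDO is available (the question does not say which database API is in use). The `users` table, its `email` column, and both function names are made up for illustration, and in-memory SQLite is used only to keep the example self-contained; the same `prepare()`/`execute()` calls work with the MySQL and PostgreSQL PDO drivers.

```php
<?php
// Sketch only: the "users" table and its column are assumed names.
function insert_email(PDO $pdo, string $email): void
{
    // The SQL text contains only a placeholder; the value is bound
    // separately and never spliced into the query string.
    $stmt = $pdo->prepare('INSERT INTO users (email) VALUES (:email)');
    $stmt->execute([':email' => $email]);
}

// Demo database so the sketch can run on its own (needs pdo_sqlite).
function make_demo_db(): PDO
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE users (email TEXT NOT NULL)');
    return $pdo;
}
```

Passing a value such as `x'); DROP TABLE users; --` to `insert_email()` simply stores that text as data, whether or not any regex looked at it first.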

You cannot and should not try to use regexes to accommodate a 'cross-database' architecture. Again, typically the system knows better what is and isn't dangerous for it than you do. Prepared statements are good, and if those are compatible with the change, then you can sleep easy. Without regexes.

If they're not and you must, use an abstraction layer to your database, something like a custom $db->escape() which in your MySQL architecture maps to mysql_real_escape_string() and in your PostgreSQL architecture maps to a respective method for PostgreSQL (I don't know which that would be off-hand, sorry, I haven't worked with PostgreSQL).
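One way such a layer might look, as a sketch rather than the answer's own design: instead of a hand-written `$db->escape()` per database, it swaps in PDO's driver-aware `quote()` method. The `ValueQuoter` interface and `PdoValueQuoter` class are hypothetical names. Note that `quote()` returns the value already wrapped in quotes, and that the old `mysql_*` functions were removed in PHP 7.

```php
<?php
// Hypothetical abstraction: application code depends on this interface only.
interface ValueQuoter
{
    public function quote(string $value): string;
}

// One possible implementation on top of PDO, whose quote() is driver-aware.
final class PdoValueQuoter implements ValueQuoter
{
    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function quote(string $value): string
    {
        $quoted = $this->pdo->quote($value);
        if ($quoted === false) {
            // Some PDO drivers do not support quoting at all.
            throw new RuntimeException('Driver cannot quote values');
        }
        return $quoted;
    }
}
```

Switching databases then means constructing the `PDO` object with a different DSN; the calling code keeps using `quote()` unchanged.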

HTML

HTML Purifier is a good way to sanitise your HTML output (provided you use it in whitelist mode, which is the setting it ships with), but you should only use it where you absolutely need to preserve HTML. Calling purify() is quite costly, because it parses the whole input and rewrites it thoroughly according to a powerful set of rules. So, if you don't need HTML to be preserved, you'll want to use htmlspecialchars(). But then, again, at this point your regular expressions would have nothing to do with your escaping, and could be anything.

Security sidenote

"Actually, my mission is to let the input value pass only if it matches my regexp whitelist; else, return it back to the user."

This may not be true for your scenario, but just as general information: the philosophy of 'returning bad input back to the user' runs the risk of opening you up to reflected XSS attacks. The user is not always the attacker, so when returning things to the user, make sure you escape them all the same. Just something to keep in mind.
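For instance, re-displaying a rejected value inside the form could look roughly like this. `render_email_field()` is a hypothetical helper and the markup is an assumption about how the form is built; the point is only that the value goes through htmlspecialchars() before it reaches the page.

```php
<?php
// Hypothetical helper: re-renders the email field with the rejected value.
function render_email_field(string $submitted): string
{
    // Escape for an HTML attribute context; ENT_QUOTES covers both
    // single and double quotes, ENT_SUBSTITUTE replaces invalid UTF-8.
    $safe = htmlspecialchars($submitted, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    return '<input type="text" name="email" value="' . $safe . '">';
}
```

A submitted value like `"><script>alert(1)</script>` then shows up as harmless text inside the field instead of closing the attribute and injecting a script tag.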

This answer was selected by the asker as the best answer.

Question posted 2010-04-12 14:38, tagged php.