I have constant problems with data where odd characters like  will show up in our database causing everything to break at some point down the line. I need to get a system in place that only allows specific characters through and ignores all of these crazy things that can be pasted from Microsoft Office. Is there something like this built in, or should I start from scratch?
Remove all types of characters
2 answers
doucai6663 2011-07-21 17:29
Well, you can remove all such characters with, for example:

    $text = preg_replace('@[^\d\w\s,.;:]@', '', $text);

Here [^\d\w\s,.;:] is a negated character class: it matches every character that is not in the set you want to keep, and preg_replace deletes those matches. \d, \w and \s cover digits, word characters (letters, digits, and underscore) and whitespace, so \d is redundant next to \w. Amend the set with any other characters you do want to keep.

However, that is the wrong approach. You should instead ensure that your entire application uses and processes UTF-8 from the ground up, so that you can store and handle those characters correctly. Building an ASCII-only or ISO Latin site in this day and age is just weird, and it essentially causes data loss by cutting out characters that people actually use...
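
To make both suggestions in this answer concrete, here is a minimal sketch, assuming PHP 7 or later and input that is supposed to be UTF-8. The function names keep_allowed and is_valid_utf8 are hypothetical helpers made up for illustration, not built-ins. The first is a Unicode-aware variant of the whitelist (it swaps \d\w for the Unicode properties \p{L} and \p{N}, so accented letters survive); the second checks that a string is valid UTF-8 before you store it.

```php
<?php
// Hypothetical helper: strip everything except whitelisted characters.
function keep_allowed(string $text): string
{
    // \p{L} = any Unicode letter, \p{N} = any Unicode number, \s = whitespace.
    // The /u modifier makes PCRE treat both pattern and subject as UTF-8.
    $clean = preg_replace('/[^\p{L}\p{N}\s,.;:]/u', '', $text);

    // preg_replace() returns null on error, e.g. when $text is not valid UTF-8.
    return $clean ?? '';
}

// Hypothetical helper: true only if $text is well-formed UTF-8.
function is_valid_utf8(string $text): bool
{
    // An empty pattern with /u matches any valid UTF-8 subject;
    // preg_match() returns false when the subject is malformed.
    return preg_match('//u', $text) === 1;
}
```

For instance, keep_allowed() would drop the curly quotes that Microsoft Office tends to paste in while leaving a letter such as é alone. Whether you want to strip those quotes or store them properly is exactly the trade-off the answer describes.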