How can I replace all non-word characters in a UTF-8 string?
For ASCII, this works:
$url = preg_replace("/\W+/", " ", $url);
Is there an equivalent for UTF-8?
You can use the Xwd character class, which contains letters, digits, and the underscore:
$url = preg_replace('~\P{Xwd}+~u', ' ', $url);
If you don't want the underscore, you can use Xan instead.
\p{Xwd} (Perl word character) is a predefined character class, and \P{Xwd} is the negation of this class.
The u modifier means that the pattern and the subject string are treated as UTF-8.
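One practical consequence of the u modifier is that preg_replace returns null when the subject is not valid UTF-8. A minimal sketch of checking for that case follows; the sample input and the fallback value are assumptions for illustration only:

```php
<?php
// Hypothetical input containing a byte (\xFF) that is never valid in UTF-8.
$url = "caf\xFF bar";

$clean = preg_replace('~\P{Xwd}+~u', ' ', $url);

if ($clean === null && preg_last_error() === PREG_BAD_UTF8_ERROR) {
    // The subject was rejected as malformed UTF-8.
    // Falling back to an empty string is an arbitrary choice for this sketch.
    $result = '';
} else {
    $result = $clean;
}
```

How to handle the bad input (reject it, re-encode it, or strip the offending bytes) depends on where the string comes from.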
Equivalences:
\p{Xan} <=> [\p{L}\p{N}]
\p{Xwd} <=> [\p{Xan}_]
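To make the difference between the two classes concrete, here is a small sketch that runs both patterns, plus the explicit form of Xan from the equivalences above, on the same string. The sample string is made up for illustration, and the file is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical sample input; any valid UTF-8 string works.
$url = 'Crème brûlée_à la carte! 100%';

// \P{Xwd}: runs of anything that is not a letter, digit, or underscore.
$keepUnderscore = preg_replace('~\P{Xwd}+~u', ' ', $url);

// \P{Xan}: runs of anything that is not a letter or digit,
// so the underscore is replaced as well.
$dropUnderscore = preg_replace('~\P{Xan}+~u', ' ', $url);

// Explicit form of \P{Xan}, built from the general categories L and N.
$explicit = preg_replace('~[^\p{L}\p{N}]+~u', ' ', $url);

var_dump($keepUnderscore, $dropUnderscore, $explicit);
```

The accented letters survive in all three results; only the treatment of the underscore differs between Xwd and Xan. Because the trailing % is also replaced, you may want to trim() the result.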