This kind of problem is easily solved with preg_replace_callback. The idea is to extract each substring between quotes and then edit it in the callback function:
$text = preg_replace_callback('~"[^"]*"~', function ($m) {
return preg_replace('~\s~', '#', $m[0]);
}, $text);
It's the simplest way.
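As a quick check of the callback approach, here is a minimal sketch with a sample input. The function name hashQuotedWhitespace and the sample string are only illustrative assumptions, not part of the question.

```php
<?php
// Hypothetical helper name; it only wraps the callback approach shown above.
function hashQuotedWhitespace(string $text): string
{
    return preg_replace_callback('~"[^"]*"~', function (array $m): string {
        // $m[0] is one quoted substring, quotes included.
        return preg_replace('~\s~', '#', $m[0]);
    }, $text);
}

echo hashQuotedWhitespace('say "hello big world" to "a b" now'), "\n";
// say "hello#big#world" to "a#b" now
```

Whitespace outside the quotes is never touched, because the inner preg_replace only ever sees the quoted substring.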
Doing it with a single pattern and preg_replace is more complicated, but it's possible:
$text = preg_replace('~(?:\G(?!\A)|")[^"\s]*\K(?:\s|"(*SKIP)(*F))~', '#', $text);
Pattern details:
(?:
\G (?!\A) # the position where the previous match ended (but not the start of the string)
|
" # or the opening double quote
)
[^"\s]* # characters that are neither double quotes nor whitespace
\K # discard everything matched so far from the match result
(?:
\s # a whitespace
|
" # or the closing quote
(*SKIP)(*F) # force the pattern to fail and skip past the quote position
# (this way, the closing quote isn't taken for an opening quote
# by the second branch of the first group.)
)
This approach uses the \G anchor to ensure that every matched whitespace character is between quotes.
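To see the single-pattern version at work, here is a small sketch on a sample string. The function name and the input are assumptions made for illustration only.

```php
<?php
// Hypothetical helper name; the pattern is the one given above, unchanged.
function hashQuotedWhitespaceSinglePattern(string $text): string
{
    $pattern = '~(?:\G(?!\A)|")[^"\s]*\K(?:\s|"(*SKIP)(*F))~';

    // Each match is a single whitespace character located between two quotes.
    return preg_replace($pattern, '#', $text);
}

echo hashQuotedWhitespaceSinglePattern('say "hello big world" to "a b" now'), "\n";
// say "hello#big#world" to "a#b" now
```

After the opening quote is found, every following match has to start exactly where the previous one ended (\G), which is what keeps the replacement inside the quoted part.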
Edge cases:
- An orphan opening quote: in this case, all whitespace from the last quote to the end of the string is replaced. If you want, you can change this behavior by adding a lookahead that checks whether the closing quote exists:
~(?:\G(?!\A)|"(?=[^"]*"))[^"\s]*\K(?:\s|"(*SKIP)(*F))~
- Quoted substrings can contain escaped double quotes that have to be ignored: in this case you have to describe escaped characters explicitly, like this:
~(?:\G(?!\A)|")[^"\s\\\\]*+(?:\\\\\S[^"\s\\\\]*)*+(?:\\\\?\K\s|"(*SKIP)(*F))~
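The two edge cases can be tried out with a short sketch like the following. The variable names and sample strings are illustrative assumptions; the patterns are the ones listed above, written as PHP single-quoted strings.

```php
<?php
// Patterns copied from the text above (variable names are hypothetical).
$basic          = '~(?:\G(?!\A)|")[^"\s]*\K(?:\s|"(*SKIP)(*F))~';
$requireClosing = '~(?:\G(?!\A)|"(?=[^"]*"))[^"\s]*\K(?:\s|"(*SKIP)(*F))~';
$escapeAware    = '~(?:\G(?!\A)|")[^"\s\\\\]*+(?:\\\\\S[^"\s\\\\]*)*+(?:\\\\?\K\s|"(*SKIP)(*F))~';

// Orphan opening quote: the last quote is never closed.
$orphan = 'a "b c" d "e f';
$orphanBasic  = preg_replace($basic, '#', $orphan);          // a "b#c" d "e#f
$orphanStrict = preg_replace($requireClosing, '#', $orphan); // a "b#c" d "e f

// Escaped quote inside a quoted substring (backslash followed by a quote).
$escaped = 'say "a \" b" c d';
$escapedResult = preg_replace($escapeAware, '#', $escaped);  // say "a#\"#b" c d

echo $orphanBasic, "\n", $orphanStrict, "\n", $escapedResult, "\n";
```

With the lookahead, the orphan quote is simply never accepted as an opening quote, so the text after it is left alone.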
Another strategy, suggested by @revo: check whether the number of quotes remaining after a position is odd, using a lookahead:
\s(?=[^"]*+(?:"[^"]*"[^"]*)*+")
It is a short pattern, but it can be slow on long strings, since for each whitespace position the lookahead has to scan the string up to the last quote.
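A minimal sketch of the lookahead strategy, assuming the quotes in the input are balanced. The function name and sample string are illustrative only.

```php
<?php
// Hypothetical helper name; the pattern is the lookahead version quoted above.
function hashQuotedWhitespaceLookahead(string $text): string
{
    // A whitespace character is inside quotes when an odd number of
    // double quotes remains to its right.
    $pattern = '~\s(?=[^"]*+(?:"[^"]*"[^"]*)*+")~';

    return preg_replace($pattern, '#', $text);
}

echo hashQuotedWhitespaceLookahead('say "hello big world" to "a b" now'), "\n";
// say "hello#big#world" to "a#b" now
```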