$ echo abcabcabc | perl -ne 'print $1 if /(a.*?c)$/'
abcabcabc
# What? Did non-greedy become greedy?
Non-greedy means it'll match the fewest characters possible at the current location such that the entire pattern matches. After matching a at position 0, bcabcab is the least that .*? can match at position 1 while still satisfying the rest of the pattern.
"abcabcabc" =~ /a.*?c$/ in detail:
- At pos 0, a matches 1 char (a).
- At pos 1, .*? matches 0 chars (empty string).
- At pos 1, c fails to match. Backtrack!
- At pos 1, .*? matches 1 char (b).
- At pos 2, c matches 1 char (c).
- At pos 3, $ fails to match. Backtrack!
- At pos 1, .*? matches 2 chars (bc).
- At pos 3, c fails to match. Backtrack!
- ...
- At pos 1, .*? matches 7 chars (bcabcab).
- At pos 8, c matches 1 char (c).
- At pos 9, $ matches 0 chars (empty string). Match successful!
"abcabcabc" =~ /a.*c$/ in detail (for contrast):
- At pos 0, a matches 1 char (a).
- At pos 1, .* matches 8 chars (bcabcabc).
- At pos 9, c fails to match. Backtrack!
- At pos 1, .* matches 7 chars (bcabcab).
- At pos 8, c matches 1 char (c).
- At pos 9, $ matches 0 chars (empty string). Match successful!
Tip: Avoid patterns with two instances of a non-greediness modifier. Unless you are using them as an optimization, there's a good chance they can match something you don't want them to match. This is relevant here because patterns implicitly start with \G(?s:.*?)\K (unless cancelled out by a leading ^, \A or \G).
What you want is one of the following:
/a[^a]*c$/
/a[^c]*c$/
/a[^ac]*c$/
You could also use one of the following:
/a(?:(?!a).)*c$/s
/a(?:(?!c).)*c$/s
/a(?:(?!a|c).)*c$/s
It would be inefficient and unreadable to use these latter three in this situation, but they will work with boundaries that are longer than one character.