Preface:
For the sake of explanation, I decided to clarify your "labels" by preceding them with a %. This can be any reserved symbol or other pattern that helps clear up what is a label and what is text:
/**
* @param variable_a %label:This is variable: a %required:true
* @param variable_b %required:false %pattern:/[a-zA-Z:]/
*/
Problem:
The problem with capturing repetitive patterns in regular expressions is that you can't have an unknown number of capture groups (i.e. you either need to run a global match that returns a variable number of matches, or capture a fixed number of groups in each match):
@param (?# find a param)
\s* (?# whitespace)
(\w+) (?# capture the variable)
\s* (?# whitespace)
(?: (?# start non capturing group)
%(\w+): (?# capture the label)
([^%\n]+) (?# capture the text)
)+ (?# repeat the non-capturing group)
In this example, I put the label/text capturing code in a non-capturing group that repeats one or more times. This allows us to match the whole string; however, only the last label/text pair is captured (since we only have 3 groups: variable, label, and text).
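To see that behavior concretely, here is a minimal sketch in Python. The language choice and the names DOC and REPEATED are mine, not part of your setup; the pattern is the one above written on a single line.

```python
import re

# Sample docblock from the preface (the % prefix on labels is this answer's convention).
DOC = """/**
* @param variable_a %label:This is variable: a %required:true
* @param variable_b %required:false %pattern:/[a-zA-Z:]/
*/"""

# Same structure as the pattern above: a repeated non-capturing group
# wrapping the two capture groups (label, text).
REPEATED = re.compile(r"@param\s*(\w+)\s*(?:%(\w+):([^%\n]+))+")

first = REPEATED.search(DOC)

# The whole first @param line is matched...
print(first.group(0))
# ...but groups 2 and 3 only keep the final iteration of the repeated group.
print(first.groups())  # -> ('variable_a', 'required', 'true')
```

The %label pair on the first line is matched but not retained, which is exactly the limitation described above.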
Straightforward Solution:
Instead, you can match the whole string first and then parse the label/text string after the fact:
(?# match the whole string)
@param (?# find a param)
\s* (?# whitespace)
(\w+) (?# capture the variable)
\s* (?# whitespace)
(.*) (?# capture the labels/texts)
(?# parse the label/text string)
% (?# the start of a label)
(\w+) (?# capture label)
: (?# end of label)
([^%]+) (?# capture text)
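The two patterns above can be wired together roughly like this. It is a sketch in Python under my own assumptions: the names PARAM, PAIR, and parse_params are hypothetical, and stripping trailing whitespace from each text is an extra step I added, not something the patterns do themselves.

```python
import re

# Sample docblock from the preface.
DOC = """/**
* @param variable_a %label:This is variable: a %required:true
* @param variable_b %required:false %pattern:/[a-zA-Z:]/
*/"""

# Step 1: match each @param line, capturing the variable and the raw label/text string.
PARAM = re.compile(r"@param\s*(\w+)\s*(.*)")
# Step 2: pull every label/text pair out of that raw string.
PAIR = re.compile(r"%(\w+):([^%]+)")


def parse_params(doc):
    """Return {variable: {label: text}} using the two-step approach."""
    result = {}
    for variable, rest in PARAM.findall(doc):
        # strip() is an assumption on my part: the text capture keeps the
        # whitespace that separates it from the next %label.
        result[variable] = {label: text.strip() for label, text in PAIR.findall(rest)}
    return result


print(parse_params(DOC))
```

Each step has a fixed number of capture groups, so nothing is lost: the outer loop handles the variables and the inner findall handles however many label/text pairs follow.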
Awesome Solution:
Finally, we can use some regular expression magic to do a global match of all label/text combinations. This means we will have a fixed set of 3 capture groups (variable, label, text) and a variable number of matches. I think this one is best to show and then explain, so here is the crazy awesome regex magic:
(?: (?# start non-capturing group)
@param (?# find a param)
\s* (?# whitespace)
(\w+) (?# capture the variable)
\s* (?# whitespace)
| (?# OR)
\G (?# start back over from our last match)
) (?# end non-capturing group)
%(\w+): (?# capture the label)
([^%\n]+) (?# capture the text)
This one revolves around the PCRE magic of \G, which matches at the position where the last match ended. So we start a non-capturing group that contains the "prefix" of a @param definition. This will either match and capture the variable OR start over from the end of our last match. Then we match/capture one label/text group. The next time the pattern is applied, we start where we left off: the variable capture group will be blank (since the variable doesn't exist that far into the string, you'll have to use logic to know which variable you are on), and we capture another label/text group. This repeats until we hit a new line, since I said a text can't contain % or \n. Then the next match attempt will find a new variable defined by @param. I think this will be your best option; it just takes some more logic on your end.
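As for the "more logic on your end", here is one way it could look. Note the caveat: this sketch is in Python, whose standard re module does not support \G, so it is an emulation rather than the regex above. The @param prefix is made optional, and a match without the prefix is only accepted if it starts exactly where the previous match ended, which is the job \G does in PCRE. The names TOKEN and parse_labels are hypothetical, and stripping whitespace from the text is my own addition.

```python
import re

# Sample docblock from the preface.
DOC = """/**
* @param variable_a %label:This is variable: a %required:true
* @param variable_b %required:false %pattern:/[a-zA-Z:]/
*/"""

# Optional @param prefix, then exactly one label/text pair.
# (Stand-in for "(?: @param ... | \G )", since Python's re has no \G.)
TOKEN = re.compile(r"(?:@param\s*(\w+)\s*)?%(\w+):([^%\n]+)")


def parse_labels(doc):
    """Return {variable: {label: text}}, tracking which variable we are on."""
    result = {}
    current = None   # the variable the upcoming label/text pairs belong to
    last_end = None  # where the previous match ended (the \G position)
    for match in TOKEN.finditer(doc):
        variable, label, text = match.groups()
        if variable is not None:
            # A new @param prefix: switch to this variable.
            current = variable
            result[current] = {}
        elif match.start() != last_end:
            # No prefix and not contiguous with the previous match:
            # \G would not have matched here, so drop the current variable.
            current = None
        if current is not None:
            result[current][label] = text.strip()
        last_end = match.end()
    return result


print(parse_labels(DOC))
```

In an engine that does support \G, the contiguity check disappears and only the "remember the last non-blank variable" part of the loop remains.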