I am trying to group words of 4 or more characters with words of 3 or fewer characters using preg_match_all() in PHP. This is for a keyword search function where users can enter things like "An elephant", and I cannot have results come back that match only "An".
Therefore, instead of splitting the keywords on spaces (e.g. "An", "elephant"), I need to attach each keyword of three or fewer characters to the next or previous keyword (e.g. "An elephant", "History of").
To accomplish this I am trying to use conditional subpatterns, but I am not sure I am on the right track.
Here's the best I've got so far:
(\s\S{1,3}\s*)?(?(1)\S+)
However, this also seems to produce a whole bunch of empty matches. Can someone please point me in the right direction?
For "History of elephants", I want it to produce two matches: "History of" and "elephants".
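A minimal way to reproduce the behaviour, assuming the pattern is passed to preg_match_all() unchanged with / delimiters (the variable names $pattern, $input and $matches are illustrative only):

```php
<?php
// Illustrative reproduction; $pattern and $input are assumed names.
$pattern = '/(\s\S{1,3}\s*)?(?(1)\S+)/';
$input   = 'History of elephants';

preg_match_all($pattern, $input, $matches);

// $matches[0] holds the full matches. Wherever the optional group (1)
// does not participate, the conditional has nothing to require, so the
// whole pattern succeeds on an empty string at that position.
var_dump($matches[0]);

// Wanted instead: "History of" and "elephants"
```

Because every part of the pattern is optional, it can succeed without consuming any characters, which would account for the empty entries in the result.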
I cannot simply omit the "stop words" because they are important in this case. The real-life use case is searching for course titles such as "Calculus A" and in that case "A" is important.