I'm trying to figure out a way to use regular expressions to find duplicate words on a webpage. I'm completely clueless, and I apologise in advance if I'm using incorrect terminology.
So far I've found the following regular expressions, which work well but only on words that are consecutive (e.g. hello hello), not on words that appear in different parts of the webpage or are separated by another word (e.g. hello food hello):
\b(\w+)(\s+\1\b)*
\b(\w+(?:\s*\w*))\s+\1\b
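
The behaviour described above can be reproduced with a short script. This is only a minimal sketch, assuming Python's `re` module as the test harness (the post does not say which regex engine is in use); the names `PATTERN_A`, `PATTERN_B` and `consecutive_repeats` are made up for illustration, and the sample strings are the two examples from the post.

```python
import re

# The two patterns from the post, unchanged.
PATTERN_A = re.compile(r"\b(\w+)(\s+\1\b)*")
PATTERN_B = re.compile(r"\b(\w+(?:\s*\w*))\s+\1\b")


def consecutive_repeats(text):
    """Hypothetical helper: words that PATTERN_A finds repeated back to back."""
    # PATTERN_A also matches every single word, because its repeat group is
    # optional, so keep only matches where the repeat group took part.
    return [m.group(1) for m in PATTERN_A.finditer(text) if m.group(2)]


print(consecutive_repeats("hello hello"))            # ['hello']
print(consecutive_repeats("hello food hello"))       # []
print(bool(PATTERN_B.search("hello hello")))         # True
print(bool(PATTERN_B.search("hello food hello")))    # False
```

Both patterns need the repeated text to follow the first occurrence directly, with only whitespace in between, which is why the second sample string produces no match.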
I would be super grateful to anyone who can help. I realise I might not be in the right place, since I'm basically a noob.