I'm using HTML Purifier to create a text-only version of my site. I now need to rewrite every a href so that it points to the text-only URL, e.g. 'www.example.com/aboutus' becomes 'www.example.com/text/aboutus'.
Initially I tried a simple str_replace on the domain (I use a global variable for the domain), but the problem is that links to files also get replaced, e.g. 'www.example.com/document.pdf' becomes 'www.example.com/text/document.pdf', which breaks the link because no file exists at that path.
Is there a regular expression that replaces domain with domain/text only where the URL does not contain a given string (such as '.pdf')?
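To make the question concrete, here is a rough sketch of the kind of thing I have in mind, using preg_replace with a negative lookahead. The function name `to_text_links`, the list of file extensions, and the sample markup are all assumptions made up for illustration; I have not checked it against every kind of link on the site.

```php
<?php
// Hypothetical helper: insert /text/ after the domain in href values,
// but skip URLs whose path ends in one of the listed file extensions.
// The extension list is an assumption; adjust it to the files actually served.
function to_text_links(string $html, string $domain): string
{
    $pattern = '~(href=["\'](?:https?://)?' . preg_quote($domain, '~') . ')/'
             . '(?!text/)'  // already a text-only link, leave it alone
             . '(?![^"\'?#]*\.(?:pdf|docx?|xlsx?|zip|jpe?g|png|gif)["\'?#])'  // file link, leave it alone
             . '~i';

    return preg_replace($pattern, '$1/text/', $html);
}

// Usage with made-up markup:
$html = '<a href="http://www.example.com/aboutus">About</a> '
      . '<a href="http://www.example.com/document.pdf">PDF</a>';

echo to_text_links($html, 'www.example.com'), "\n";
```

With this sketch the page link is rewritten and the PDF link is left as it was. A link to the bare domain with no trailing slash would not be touched, so that case would need separate handling.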
Thanks for any pointers you might be able to give me :)