I have articles and keywords stored inside MySQL. The site will preprocess the new articles to find how many matching keywords there are and then update a table which stores the relevant keywords related to the article. This will then be used on the front-end by highlighting keywords within the article and will link users to articles with the same matching keywords.
My concern here is how to do this processing efficiently. My idea is: when processing new articles, it finds the ngrams of the text (up to 3- or 4-gram) and then search each against the keywords table in the MySQL database. This may end up being a slow mess, I haven't tried. But maybe I'm approaching this the wrong way?
Any resources on how to do this efficiently would be awesome. Language used here is primarily PHP.