I have a class that uses regular expressions for Natural Language Processing, and the time it spends processing the large amount of data it is fed does not look promising.
I'm looking into scaling it out, i.e. having some way of doing the work in parallel, which I have no experience with yet.
I was hoping someone could explain what I am getting myself into, and the pros and cons of doing this in PHP. Also, if you could point me to good resources on scaling in general, or better yet scaling in PHP, that would be great. Thanks.
EDIT:
foreach ($sentences as $sentence) {
    // for each sentence check if a keyword or any of its synonyms
    // appear together with any sentiment applicable to the keyword
    foreach ($this->keywords as $keyword => $synonyms) {
        foreach ($this->sentiments[$keyword] as $sentiment => $weight) {
            $match = $this->check($sentence, $synonyms, $sentiment);
        }
    }
}
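For context, one common way to parallelise a loop like this in PHP is to fork worker processes, each handling a chunk of the sentences. This is only a minimal sketch, not the class's actual code: it assumes the CLI `pcntl` extension is available, and the worker count of 4 is an arbitrary choice.

```php
// Hypothetical sketch: split $sentences into chunks and process each chunk
// in its own worker process (requires the pcntl extension, CLI only).
$workers = 4; // arbitrary worker count, tune to the number of CPU cores
$chunks  = array_chunk($sentences, (int) ceil(count($sentences) / $workers));

foreach ($chunks as $chunk) {
    $pid = pcntl_fork();
    if ($pid === 0) {
        // child process: run the keyword/sentiment checks on its chunk
        foreach ($chunk as $sentence) {
            // ... same nested keyword/sentiment loop as above ...
        }
        exit(0); // child is done
    }
}

// parent: wait for all workers to finish
while (pcntl_waitpid(0, $status) > 0);
```

Note that forked children do not share memory with the parent, so results have to be passed back through pipes, files, a queue, or a database.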
// regex part of the code
// escape each keyword so regex metacharacters in it are treated literally
$keywords = implode('|', array_map(fn ($k) => preg_quote($k, '/'), $keywords));
// match the sentiment and a keyword in either order within the sentence;
// the original "(.*|\s)" was redundant, since ".*" already covers whitespace
$pattern = "/\b$sentiment\b.*?\b($keywords)\b|\b($keywords)\b.*?\b$sentiment\b/i";
preg_match_all($pattern, $sentence, $matches);
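For illustration, here is how the pattern behaves on a sample sentence; the sentence, keywords, and sentiment below are made up for the example and are not from my data:

```php
$sentence  = 'The battery life is really great on this phone';
$keywords  = implode('|', array_map(fn ($k) => preg_quote($k, '/'), ['battery', 'screen']));
$sentiment = 'great';
$pattern   = "/\b$sentiment\b.*?\b($keywords)\b|\b($keywords)\b.*?\b$sentiment\b/i";
preg_match_all($pattern, $sentence, $matches);
// $matches[0] holds the matched keyword-sentiment span
// ("battery ... great" via the second alternative, since the keyword
// appears before the sentiment in this sentence)
```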