drmg17928 2011-06-11 04:23

Need guidance on scaling out in PHP

I have a class that uses regular expressions for natural language processing, and the time it spends processing the large amount of data it is fed does not look promising.

I'm looking into scaling it out so that the work can be done in parallel, which I have no experience with yet.

I was hoping someone could explain what I am getting myself into, and the pros and cons of doing this in PHP. Good resources on scaling in general, or better yet on scaling in PHP, would also be welcome. Thanks.

EDIT:

foreach ($sentences as $sentence) { 
  // for each sentence check if a keyword or any of its synonyms
  // appear together with any sentiment applicable to the keyword
  foreach ($this->keywords as $keyword => $synonyms) {              
    foreach ($this->sentiments[$keyword] as $sentiment => $weight) {
      $match = $this->check($sentence, $synonyms, $sentiment);
    }
  }
}

// regex part of the code
$keywords = implode('|', $keywords);
$pattern = "/(\b$sentiment\b(.*|\s)\b($keywords)\b|\b($keywords)\b(.*|\s)\b$sentiment\b)/i";

preg_match_all($pattern, $sentence, $matches);

  • duancashi1362 2011-06-11 05:11

    PHP may not be a great choice for this type of application. It's a rather high-level language, and that comes with overhead that may slow down any significant processing.

    If you want to stick with PHP, you can do it with some sort of job-management application. There are existing tools you could use, such as Gearman or even Hadoop. You break your data into chunks and feed them to the application as jobs. With those tools you can scale your processing across one or more servers.
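
    A minimal sketch of the chunking idea, not a definitive implementation. The helper names `makeJobs` and `processJob`, the JSON payload format, and the sample data are all assumptions made for illustration. The loop at the bottom runs the jobs one after another and only stands in for a real queue; with a tool like Gearman, each payload would be submitted as a job and `processJob` would run inside a worker process, possibly on another server.

```php
<?php
// Hypothetical helper: split the input into fixed-size chunks, one per job.
function makeJobs(array $sentences, int $chunkSize): array
{
    $jobs = [];
    foreach (array_chunk($sentences, $chunkSize) as $chunk) {
        // Serialise each chunk so it can travel as a job payload (assumed format: JSON).
        $jobs[] = json_encode($chunk);
    }
    return $jobs;
}

// Hypothetical worker body: counts the sentences in one chunk where a keyword
// and the sentiment word both appear, in either order.
function processJob(string $payload, array $keywords, string $sentiment): int
{
    // Escape the words so regex metacharacters in the data cannot break the pattern.
    $quoted = array_map(function ($k) { return preg_quote($k, '/'); }, $keywords);
    $alt = implode('|', $quoted);
    $s = preg_quote($sentiment, '/');

    // Build the pattern once per job rather than once per sentence.
    $pattern = "/\b{$s}\b.*\b({$alt})\b|\b({$alt})\b.*\b{$s}\b/i";

    $hits = 0;
    foreach (json_decode($payload, true) as $sentence) {
        $hits += (int) preg_match($pattern, $sentence);
    }
    return $hits;
}

// Sequential stand-in for a job queue: each payload is independent,
// so the iterations could run on separate workers.
$sentences = [
    'The battery is great',
    'Screen looks fine',
    'Terrible battery life',
    'Nothing to report',
];
$total = 0;
foreach (makeJobs($sentences, 2) as $payload) {
    $total += processJob($payload, ['battery', 'power'], 'great');
}
echo $total, PHP_EOL;
```

    Because the jobs share no state, the per-chunk results can simply be summed (or otherwise merged) once every job has finished. The chunk size is a tuning knob: larger chunks mean less queueing overhead per sentence, smaller ones spread the load more evenly.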

    If you use Amazon Web Services, you may want to look at Elastic MapReduce and see if it fits your needs.

    This answer was accepted by the asker.