douzhuiqing1151 2009-05-26 09:27

Asynchronous processing or message queues in PHP (CakePHP) [closed]

I am building a website in CakePHP that processes files uploaded through an XML-RPC API and through a web frontend. Files need to be scanned by ClamAV, thumbnails need to be generated, et cetera. This is all resource-intensive work that takes some time, and the user should not have to wait for it. So, I am looking into asynchronous processing with PHP in general and CakePHP in particular.

I came across the MultiTask plugin for CakePHP that looks promising. I also came across various message queue implementations such as dropr and beanstalkd. Of course, I will also need some kind of background process, probably implemented using a Cake Shell of some kind. I saw MultiTask using PHP_Fork to implement a multithreaded PHP daemon.

I need some advice on how to fit all these pieces together in the best way.

  • Is it a good idea to have a long-running daemon written in PHP? What should I watch out for?
  • What are the advantages of external message queue implementations? The MultiTask plugin does not use an external message queue. It rolls its own using a MySQL table to store tasks.
  • What message queue should I use? dropr? beanstalkd? Something else?
  • How should I implement the backend processor? Is a forking PHP daemon a good idea or just asking for trouble?

My current plan is either to use the MultiTask plugin or to edit it to use beanstalkd instead of its own MySQL table implementation. Jobs in the queue can simply consist of a task name and an array of parameters. The PHP daemon would watch for incoming jobs and pass them out to one of its child threads. These would simply execute the CakePHP Task with the given parameters.

Any opinion, advice, comments, gotchas or flames on this?

4 answers

  • dongshuohuan5291 2009-06-02 16:22

    I've had excellent results with BeanstalkD and a back end written in PHP that retrieves jobs and then acts on them. I wrapped the actual job runner in a bash script that restarts it whenever it exits (unless I do an 'exit(UNIQNUM);', in which case the script checks for that code and actually exits). That way, each restart of the PHP script clears any memory that may have been used, and it can start afresh after every 25/50/100 jobs it runs.
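
A minimal sketch of the kind of restart wrapper described above. It is not the author's actual script: the sentinel value `97`, the `RESTART_DELAY` variable, and the `worker.php` name are all assumptions made for illustration. The loop reruns the worker until it exits with the sentinel code, so every restart begins with a fresh PHP process.

```shell
#!/usr/bin/env bash
# Restart a worker command until it exits with a designated "stop" code.
# STOP_CODE plays the role of UNIQNUM in the text; 97 is an arbitrary choice.
STOP_CODE=97

run_until_stop() {
    local stop_code="$1"
    shift
    local status
    while true; do
        status=0
        "$@" || status=$?          # run the worker, remember its exit code
        if [ "$status" -eq "$stop_code" ]; then
            return 0               # worker asked for a real shutdown
        fi
        # Any other exit (batch finished, crash) leads to a restart after a
        # short pause, which avoids a tight loop if the worker fails at once.
        sleep "${RESTART_DELAY:-1}"
    done
}

# Hypothetical invocation; worker.php would process N jobs and then exit:
# run_until_stop "$STOP_CODE" php worker.php
```

With this shape, the PHP worker only has to count the jobs it has handled and call `exit()` once it reaches its limit, using the sentinel code when it should stay down.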

    A couple of the advantages of using it are that you can set priorities and delays on a BeanstalkD job: "run this at a lower priority, but don't start for 10 seconds". I've also queued a number of jobs up at the same time (run this now, in 5 seconds, and again after 30 seconds).
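
To make the priority and delay options concrete, here is a small sketch that formats a `put` command in beanstalkd's text protocol, where a job is submitted as `put <pri> <delay> <ttr> <bytes>` followed by the body. The function name `bs_put_cmd`, the job body, and the specific numbers are assumptions for illustration. In beanstalkd a smaller priority number is more urgent, so "lower priority" means a larger value.

```shell
#!/usr/bin/env bash
# Format a beanstalkd "put" command: priority, delay (seconds), time-to-run
# (seconds), body length in bytes, then the job body. Lines end in CRLF.
bs_put_cmd() {
    local pri="$1"
    local delay="$2"
    local ttr="$3"
    local data="$4"
    local LC_ALL=C      # make ${#data} count bytes rather than characters
    printf 'put %d %d %d %d\r\n%s\r\n' "$pri" "$delay" "$ttr" "${#data}" "$data"
}

# Hypothetical usage: queue the same job to run now, in 5 s and in 30 s,
# at a lower-than-urgent priority. Sending the text to a server (for example
# by piping it to a TCP client pointed at beanstalkd's default port 11300)
# is left out here.
# for delay in 0 5 30; do
#     bs_put_cmd 2048 "$delay" 60 'resize 42'
# done
```

In practice a PHP client library would issue the same command; the sketch only shows what the priority and delay fields look like on the wire.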

    With the appropriate network configuration (and running it on an IP address accessible to the rest of your network), you can also run a beanstalkd daemon on one server and have it polled from a number of other machines, so if a large number of tasks are being generated, the work can be split between servers. If a particular set of tasks needs to run on a particular machine, I've created a 'tube' named after that machine's hostname, which should be unique within our cluster, if not globally (useful for file uploads). I found it worked perfectly for image resizing, often writing the finished smaller images to the file system before the web page that refers to them had even requested their URLs.

    I'm actually about to start writing a series of articles on this very subject for my blog (including some techniques for code that I've already pushed several million live requests through) - My URL is linked from my user profile here, on Stackoverflow.

    (I've written a series of articles on the subject of Beanstalkd and queuing of jobs)

    This answer was selected as the best answer by the question author.