douzhuiqing1151 2009-05-26 09:27
浏览 168
已采纳

PHP中的异步处理或消息队列(CakePHP)[关闭]

I am building a website in CakePHP that processes files uploaded though an XML-RPC API and though a web frontend. Files need to be scanned by ClamAV, thumbnails need to be generated, et cetera. All resource intensive work that takes some time for which the user should not have to wait. So, I am looking into asynchronous processing with PHP in general and CakePHP in particular.

I came across the MultiTask plugin for CakePHP that looks promising. I also came across various message queue implementations such as dropr and beanstalkd. Of course, I will also need some kind of background process, probably implemented using a Cake Shell of some kind. I saw MultiTask using PHP_Fork to implement a multithreaded PHP daemon.

I need some advice on how to fit all these pieces together in the best way.

  • Is it a good idea to have a long-running daemon written in PHP? What should I watch out for?
  • What are the advantage of external message queue implementations? The MultiTask plugin does not use an external message queue. It rolls it's own using a MySQL table to store tasks.
  • What message queue should I use? dropr? beanstalkd? Something else?
  • How should I implement the backend processor? Is a forking PHP daemon a good idea or just asking for trouble?

My current plan is either to use the MultiTask plugin or to edit it to use beanstald instead of it's own MySQL table implementation. Jobs in the queue can simply consist of a task name and an array of parameters. The PHP daemon would watch for incoming jobs and pass them out to one of it's child threads. The would simply execute the CakePHP Task with the given parameters.

Any opinion, advice, comments, gotchas or flames on this?

  • 写回答

4条回答 默认 最新

  • dongshuohuan5291 2009-06-02 16:22
    关注

    I've had excellent results with BeanstalkD and a back-end written in PHP to retrieve jobs and then act on them. I wrapped the actual job-running in a bash-script to keep running if even if it exited (unless I do a 'exit(UNIQNUM);', when the script checks it and will actually exit). In that way, the restarted PHP script clears down any memory that may have been used, and can start afresh every 25/50/100 jobs it runs.

    A couple of the advantages of using it is that you can set priorities and delays into a BeanstalkD job - "run this at a lower priority, but don't start for 10 seconds". I've also queued a number of jobs up at the some time (run this now, in 5 seconds and again after 30 secs).

    With the appropriate network configuration (and running it on an accessible IP address to the rest of your network), you can also run a beanstalkd deamon on one server, and have it polled from a number of other machines, so if there are a large number of tasks being generated, the work can be split off between servers. If a particular set of tasks needs to be run on a particular machine, I've created a 'tube' which is that machine's hostname, which should be unique within our cluster, if not globally (useful for file uploads). I found it worked perfectly for image resizing, often returning the finished smaller images to the file system before the webpage itself that would refer to it would refer to the URL it would be arriving at.

    I'm actually about to start writing a series of articles on this very subject for my blog (including some techniques for code that I've already pushed several million live requests through) - My URL is linked from my user profile here, on Stackoverflow.

    (I've written a series of articles on the subject of Beanstalkd and queuing of jobs)

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(3条)

报告相同问题?

悬赏问题

  • ¥50 如何用脚本实现输入法的热键设置
  • ¥20 我想使用一些网络协议或者部分协议也行,主要想实现类似于traceroute的一定步长内的路由拓扑功能
  • ¥30 深度学习,前后端连接
  • ¥15 孟德尔随机化结果不一致
  • ¥15 apm2.8飞控罗盘bad health,加速度计校准失败
  • ¥15 求解O-S方程的特征值问题给出边界层布拉休斯平行流的中性曲线
  • ¥15 谁有desed数据集呀
  • ¥20 手写数字识别运行c仿真时,程序报错错误代码sim211-100
  • ¥15 关于#hadoop#的问题
  • ¥15 (标签-Python|关键词-socket)