dsbiw2911188 2014-07-04 18:23

Parallel processing/forking in PHP to speed up checking a large array

I have a PHP script on my website that gives an overview of a domain name the user enters. It does this job quite well, but it is very slow. This is probably because it checks an array of 64 possible domain names and only then moves on to checking the nameservers for A, MX, NS and other records.

What I would like to know is whether it is possible to run multiple threads or child processes for this, so that it checks multiple elements of the array at once and generates the output a lot faster.

I've put an example of my code in a pastebin (to avoid creating a huge, spammy post here): http://pastebin.com/Qq9qKtP9

In Perl I can do something like this:

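  use Parallel::ForkManager;
  my $fork = Parallel::ForkManager->new($threads);  # at most $threads children
  foreach my $item (@items) {    # @items stands in for "Something here"
      $fork->start and next;     # parent: spawn a child, move to the next item
      # ... work on $item happens here, in the child ...
      $fork->finish;             # child: exit when done
  }
  $fork->wait_all_children;      # parent: wait for every child to finish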

That way I can make the loop run in as many processes as needed. Is something similar possible in PHP, or can you think of any other way to speed this up? The main issue is that Cloudflare has a timeout, and the script often takes long enough that Cloudflare cuts off the response.

Thanks

3 answers

  • douxuanma4357 2014-07-04 23:26

    The first thing to do is optimize your code to shorten the execution time as much as possible. For example, instead of making five DNS queries:

      $NS  = dns_get_record($murl, DNS_NS);
      $MX  = dns_get_record($murl, DNS_MX);
      $SRV = dns_get_record($murl, DNS_SRV);
      $A   = dns_get_record($murl, DNS_A);
      $TXT = dns_get_record($murl, DNS_TXT);

    you can call dns_get_record only once and parse out the variables from there:

      $DATA = dns_get_record($murl, DNS_NS | DNS_MX | DNS_SRV | DNS_A | DNS_TXT);
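    As a rough illustration of the "parse out the variables" step, the sketch below groups the combined result by record type. The function name groupDnsRecords is hypothetical; it relies only on the 'type' key that dns_get_record includes in each returned record.

```php
<?php
// Hypothetical helper: split a combined dns_get_record() result into
// one list per record type (e.g. 'NS', 'MX', 'A', 'TXT').
function groupDnsRecords(array $records): array
{
    $grouped = [];
    foreach ($records as $record) {
        // Each record returned by dns_get_record() carries a 'type' key.
        $type = $record['type'] ?? 'UNKNOWN';
        $grouped[$type][] = $record;
    }
    return $grouped;
}

// Usage (performs a real DNS lookup, so it is left commented out):
// $DATA = dns_get_record($murl, DNS_NS | DNS_MX | DNS_SRV | DNS_A | DNS_TXT);
// if ($DATA !== false) {
//     $byType = groupDnsRecords($DATA);
//     $MX = $byType['MX'] ?? [];   // empty array if no MX records came back
// }
```

    Whether the single call is actually faster depends on the platform's resolver, so it is worth timing both versions before committing to one.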

    Instead of outright forking processes to handle several parts concurrently, I'd implement a queue that all of the requests get pushed into. The queue processor would be limited in how many items it can process at once, which avoids a potential DoS if hundreds or thousands of requests hit your site at the same time. Without some sort of limiting mechanism, you'd end up with so many processes that the server might hang.
    As for the processor, in addition to the previously mentioned items, you could try pecl/Gearman as your queue processor. I haven't used it, but it appears to do what you're looking for.
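    The limiting mechanism can be sketched without Gearman as a bounded pool of child processes. This is only an illustration, not a drop-in fix: it assumes the pcntl extension, which is normally available only to CLI scripts and not under a web server, and the name runBounded and its parameters are made up for the example.

```php
<?php
// Hypothetical sketch: run $worker on every job, but never with more than
// $maxChildren child processes alive at once. Requires ext-pcntl (CLI).
function runBounded(array $jobs, callable $worker, int $maxChildren = 4): int
{
    $running  = 0;  // children currently alive
    $finished = 0;  // children that have been reaped

    foreach ($jobs as $job) {
        // At the limit: block until one child exits before starting another.
        if ($running >= $maxChildren && pcntl_wait($status) > 0) {
            $running--;
            $finished++;
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child process: do the work, then exit so it never
            // continues the parent's loop.
            $worker($job);
            exit(0);
        }
        $running++;  // parent process: one more child is running
    }

    // Reap the children that are still running.
    while ($running > 0 && pcntl_wait($status) > 0) {
        $running--;
        $finished++;
    }
    return $finished;
}
```

    Children do not share memory with the parent, so each worker has to write its result somewhere the parent can read it afterwards, such as a file or a cache entry.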

    Another way to optimize this would be to implement a caching system that saves the results for, say, a week (or whatever period suits you). This would cut down the cost of someone looking up the same site repeatedly in a day (or running a script against your site).
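    A minimal file-based version of such a cache might look like the sketch below. The function name cachedLookup, the cache file naming and the one-week default are assumptions for illustration; $lookup stands for whatever function performs the real domain check.

```php
<?php
// Hypothetical sketch: return a cached result for $domain if it is younger
// than $ttlSeconds, otherwise call $lookup and store what it returns.
function cachedLookup(string $domain, callable $lookup, int $ttlSeconds = 604800, ?string $cacheDir = null)
{
    $cacheDir = $cacheDir ?? sys_get_temp_dir();
    // Hash the lower-cased name so the file name is safe and case-insensitive.
    $file = $cacheDir . '/domain_' . sha1(strtolower($domain)) . '.json';

    clearstatcache(true, $file);
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        $cached = json_decode(file_get_contents($file), true);
        if ($cached !== null) {
            return $cached;  // fresh cache hit
        }
    }

    // Cache miss or expired entry: do the slow lookup and save the result.
    $result = $lookup($domain);
    file_put_contents($file, json_encode($result), LOCK_EX);
    return $result;
}
```

    The result has to be JSON-serializable (plain arrays and scalars) for this to work, which is the case for the arrays returned by dns_get_record.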

    Accepted answer (selected as the best answer by the asker).