dsbiw2911188 2014-07-04 18:23

Parallel processing/forking in PHP to speed up checking a large array

I have a PHP script on my website that is designed to give a nice overview of a domain name the user enters. It does this job quite well, but it is very slow. This probably has something to do with the fact that it checks an array of 64 possible domain names and only THEN moves on to checking the nameservers for A records, MX records, NS records and so on.

What I would like to know is whether it is possible to run multiple threads or child processes for this, so that it checks multiple elements of the array at once and generates the output a lot faster.

I've put an example of my code in a pastebin (to avoid creating a huge, spammy post here): http://pastebin.com/Qq9qKtP9

In Perl I can do something like this:

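  use Parallel::ForkManager;

  my $fork = Parallel::ForkManager->new($threads);
  foreach(Something here){
      $fork->start and next;   # parent: fork a child, then move on to the next item
      # ... the child does its work here ...
      $fork->finish;           # child: exit when done
  }
  $fork->wait_all_children;    # parent: wait for the remaining children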

And I could make the loop run in as many processes as needed. Is something similar possible in PHP, or can you think of any other ways to speed this up? The main issue is that Cloudflare has a timeout, and the script often takes long enough that CF cuts off the response.

Thanks


3 answers

  • douxuanma4357 2014-07-04 23:26

    The first thing you want to do is optimize your code to shorten the execution time as much as possible. For example, instead of making five DNS queries:

      $NS = dns_get_record($murl, DNS_NS);
      $MX = dns_get_record($murl, DNS_MX);
      $SRV = dns_get_record($murl, DNS_SRV);
      $A = dns_get_record($murl, DNS_A);
      $TXT = dns_get_record($murl, DNS_TXT);

    You can call dns_get_record just once:

      $DATA = dns_get_record($murl, DNS_NS | DNS_MX | DNS_SRV | DNS_A | DNS_TXT);

    and parse out the variables from there.
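
As a rough illustration of "parse out the variables from there", here is a minimal sketch. It relies on the `type` key that each entry returned by `dns_get_record()` carries; the function names `groupRecordsByType` and `lookupDomain` are hypothetical and not taken from the original script.

```php
<?php
// Hypothetical helper: split a combined dns_get_record() result by record type.
function groupRecordsByType(array $records): array
{
    $grouped = [];
    foreach ($records as $record) {
        // Each entry has a 'type' key such as 'A', 'MX', 'NS', 'SRV' or 'TXT'.
        $grouped[$record['type']][] = $record;
    }
    return $grouped;
}

// Hypothetical wrapper: one combined call instead of five separate ones.
function lookupDomain(string $murl): array
{
    $data = dns_get_record($murl, DNS_NS | DNS_MX | DNS_SRV | DNS_A | DNS_TXT);
    // dns_get_record() returns false on failure; treat that as "no records".
    return groupRecordsByType($data === false ? [] : $data);
}
```

With this in place, something like `$records = lookupDomain($murl); $MX = $records['MX'] ?? [];` would stand in for the individual variables.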

    Instead of outright forking processes to handle several parts concurrently, I'd implement a queue that all of the requests get pushed into. The query processor would be limited in how many items it can process at once, which avoids a potential DoS if hundreds or thousands of requests hit your site at the same time. Without some sort of limiting mechanism, you could end up with so many processes that the server hangs.
    As for the processor, in addition to the previously mentioned items, you could try pecl/Gearman as your queue processor. I haven't used it, but it appears to do what you're looking for.
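
To make the limiting idea concrete, here is a minimal in-memory sketch. It is not Gearman and it does not run anything in parallel; it only shows a processor that takes at most `$limit` jobs off a queue per pass and leaves the rest waiting. The name `processBatch` and the use of `SplQueue` as a stand-in for a real job queue are assumptions for illustration.

```php
<?php
// Hypothetical processor: handle at most $limit queued jobs per pass.
// A real setup would use a persistent queue shared between requests;
// SplQueue is only an in-memory stand-in here.
function processBatch(SplQueue $queue, callable $handler, int $limit): array
{
    $results = [];
    for ($i = 0; $i < $limit && !$queue->isEmpty(); $i++) {
        $job = $queue->dequeue();          // oldest job first (FIFO)
        $results[$job] = $handler($job);   // jobs are assumed to be strings, e.g. domain names
    }
    return $results;                       // anything beyond $limit stays in the queue
}
```

Each call drains only a bounded slice of the queue, so a burst of requests makes the queue longer instead of spawning more work at once.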

    Another way to optimize this would be to implement a caching system that saves the results for, say, a week (or whatever period suits you). This would cut down on repeated lookups of the same site within a day (or by someone running a script against your site).
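
A minimal sketch of such a cache, assuming results can be stored as JSON files in the system temp directory. The function name `cachedLookup`, the file naming scheme and the one-week default are illustrative choices, not part of the original script.

```php
<?php
// Hypothetical helper: return a cached result for $key, or compute and store it.
// $ttl is in seconds; 604800 is one week.
function cachedLookup(string $key, callable $compute, int $ttl = 604800, ?string $dir = null)
{
    $dir = $dir ?? sys_get_temp_dir();
    $file = $dir . '/dnscache_' . md5($key) . '.json';

    clearstatcache(true, $file);           // avoid stale file metadata
    // Serve from cache while the file is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return json_decode(file_get_contents($file), true);
    }

    $value = $compute($key);               // cache miss: do the real lookup
    file_put_contents($file, json_encode($value), LOCK_EX);
    return $value;
}
```

The slow lookup would then be passed in as the callback, for example `cachedLookup($murl, $lookup)` where `$lookup` is whatever callable performs the actual DNS queries.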

    This answer was selected by the asker as the best answer.