dsbiw2911188
2014-07-04 18:23

Parallel processing/forking in PHP to speed up checking a large array

I have a PHP script on my website that is designed to give a nice overview of a domain name the user enters. It does this job quite well, but it is very slow. This probably has something to do with the fact that it checks an array of 64 possible domain names and THEN moves on to checking nameservers for A records, MX records, NS records, etc.

What I would like to know is: is it possible to run multiple threads or child processes for this, so that it checks multiple elements of the array at once and generates the output a lot faster?

I've put an example of my code in a pastebin (to avoid creating a huge and spammy post here): http://pastebin.com/Qq9qKtP9

In Perl I can do something like this:

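  use Parallel::ForkManager;

  my $fork = Parallel::ForkManager->new($threads);
  foreach my $item (@items) {    # @items stands in for "Something here"
      $fork->start and next;     # parent forks a child and moves on
      # ... per-item work runs in the child ...
      $fork->finish;             # child exits
  }
  $fork->wait_all_children;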

That way I could make the loop run in as many processes as needed. Is something similar possible in PHP, or can you think of any other way to speed this up? The main issue is that Cloudflare has a timeout, and the script often takes long enough that CF cuts off the response.

Thanks

3 answers

  • douxuanma4357 2014-07-04 23:26
    Accepted

    The first thing you want to do is optimize your code to shorten the execution time as much as possible. For example, instead of making five DNS queries:

        $NS = dns_get_record($murl, DNS_NS);
        $MX = dns_get_record($murl, DNS_MX);
        $SRV = dns_get_record($murl, DNS_SRV);
        $A = dns_get_record($murl, DNS_A);
        $TXT = dns_get_record($murl, DNS_TXT);

    you can call dns_get_record just once:

        $DATA = dns_get_record($murl, DNS_NS + DNS_MX + DNS_SRV + DNS_A + DNS_TXT);

    and parse out the variables from there.
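
    Parsing the variables out of a single combined result could look roughly like the sketch below. It assumes each row returned by dns_get_record carries a 'type' key such as 'A', 'MX' or 'NS'. The helper name groupRecordsByType and the sample rows are made up for illustration and are not real lookup results.

```php
<?php
// Hypothetical helper: bucket the rows of one dns_get_record() call by record type.
function groupRecordsByType(array $records): array
{
    $grouped = [];
    foreach ($records as $record) {
        // Each row is an associative array; 'type' is e.g. 'A', 'MX', 'NS'.
        $grouped[$record['type']][] = $record;
    }
    return $grouped;
}

// Illustrative rows shaped like dns_get_record() output (assumed data).
$sample = [
    ['host' => 'example.com', 'type' => 'A',  'ip' => '192.0.2.1'],
    ['host' => 'example.com', 'type' => 'MX', 'pri' => 10, 'target' => 'mail.example.com'],
    ['host' => 'example.com', 'type' => 'NS', 'target' => 'ns1.example.com'],
    ['host' => 'example.com', 'type' => 'NS', 'target' => 'ns2.example.com'],
];

$byType = groupRecordsByType($sample);

// A type with no records is simply absent, so fall back to an empty array.
$NS  = $byType['NS']  ?? [];
$MX  = $byType['MX']  ?? [];
$A   = $byType['A']   ?? [];
$TXT = $byType['TXT'] ?? [];
```

    In the real script, $sample would be replaced by the $DATA array from the combined call. dns_get_record returns false on failure, so something like `$DATA ?: []` is a reasonable guard before grouping.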

    Instead of outright forking processes to handle several parts concurrently, I'd implement a queue that all of the requests would get pushed into. The query processor would be limited as to how many items it could process at once, avoiding the potential DoS if hundreds or thousands of requests hit your site at the same time. Without some sort of limiting mechanism, you'd end up with so many processes that the server might hang.
    As for the processor, in addition to the previously mentioned items, you could try pecl/Gearman as your queue processor. I haven't used it, but it appears to do what you're looking for.

    Another way to optimize this would be to implement a caching system that saves the results for, say, a week (or whatever period suits you). This would cut down on the work done when someone looks up the same site repeatedly in a day (or runs a script against your site).
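
    A minimal sketch of such a cache, assuming results are JSON-serializable and that a directory of small files is an acceptable store. The function name cachedLookup, the cache directory and the stand-in lookup closure are all hypothetical.

```php
<?php
// Hypothetical helper: return the cached value for $key, or compute and store it.
function cachedLookup(string $key, int $ttlSeconds, callable $compute, string $cacheDir)
{
    $file = $cacheDir . '/' . sha1($key) . '.json';

    // Serve from the cache while the file is younger than the TTL.
    if (is_file($file) && (time() - filemtime($file)) < $ttlSeconds) {
        $cached = json_decode(file_get_contents($file), true);
        if ($cached !== null) {
            return $cached;
        }
    }

    $value = $compute($key);
    // LOCK_EX keeps two concurrent requests from interleaving their writes.
    file_put_contents($file, json_encode($value), LOCK_EX);
    return $value;
}

// Demo directory only; a real site would use one fixed, writable path.
$cacheDir = sys_get_temp_dir() . '/domain-cache-' . uniqid();
mkdir($cacheDir, 0700, true);

$calls = 0;
// Stand-in for the real (slow) DNS work.
$lookup = function (string $domain) use (&$calls): array {
    $calls++;
    return ['domain' => $domain, 'checked' => true];
};

$week = 7 * 24 * 60 * 60;
$first  = cachedLookup('example.com', $week, $lookup, $cacheDir);
$second = cachedLookup('example.com', $week, $lookup, $cacheDir); // served from the file
```

    The second call never reaches the lookup closure, which is the whole point: repeat requests for the same domain within the TTL cost one small file read.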

  • doubo3384 2014-07-04 20:38

    I doubt it's a good idea to fork the Apache process from PHP. But if you really want to, there is PCNTL (which is not available in the Apache module).
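
    For a command-line script, a rough PCNTL counterpart to the Perl ForkManager loop might look like the sketch below. It assumes the pcntl extension is loaded; forkEach and the demo closure are hypothetical names, and the children report back through files because they share no memory with the parent.

```php
<?php
// Hypothetical helper: run $work on every item with at most $maxChildren forked at once.
function forkEach(array $items, int $maxChildren, callable $work): int
{
    $running = 0;
    $finished = 0;
    foreach ($items as $item) {
        // Pool is full: block until one child exits before forking another.
        if ($running >= $maxChildren) {
            pcntl_wait($status);
            $running--;
            $finished++;
        }
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            $work($item); // child: do one unit of work...
            exit(0);      // ...then exit so it never continues the parent's loop
        }
        $running++;
    }
    // Reap the children that are still running.
    while ($running > 0) {
        pcntl_wait($status);
        $running--;
        $finished++;
    }
    return $finished;
}

$reaped = null;
if (function_exists('pcntl_fork')) { // CLI builds with pcntl only
    $outDir = sys_get_temp_dir() . '/fork-demo-' . uniqid();
    mkdir($outDir, 0700, true);
    $domains = ['example.com', 'example.net', 'example.org', 'example.edu'];
    $reaped = forkEach($domains, 2, function (string $domain) use ($outDir): void {
        // Stand-in for the real check: write a result file per domain.
        file_put_contents("{$outDir}/{$domain}.txt", (string) strlen($domain));
    });
}
```

    The cap on concurrent children plays the same role as the argument passed to Parallel::ForkManager in the question.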

    You might have more fun with pthreads. Nowadays you can even download a PHP build that claims to be thread-safe.

    And finally, you can use classic non-blocking I/O, which is what I would prefer in the case of PHP.

  • donglian4464 2014-07-06 19:19

    * Never Mind Support !! *

    You never want to create threads (or additional processes for that matter) in direct response to a web request.

    If your frontend is instructed to create 60 threads every time someone requests page.php, and 100 people come along and request page.php at once, you will be asking your hardware to create and execute 6000 threads concurrently, to say nothing of the threads used by operating system services and other software. For obvious reasons, this does not scale, and never will.

    Rather you want to separate out those parts of the application that require additional threads or processes and communicate with this part of the application via some kind of sane RPC. This means that the backend of the application can utilize concurrency via pthreads or forking, using a fixed number of threads or processes, and spreading work as evenly as possible across all available resources. This allows for an influx of traffic; it allows your application to scale.

    I won't write example code, it seems altogether too trivial.
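
    Even so, the "fixed number of workers, work spread as evenly as possible" part can be sketched in a few lines. This shows only the partitioning step, not the RPC layer or the workers themselves. The helper name partitionWork, the worker count of 8 and the candidate domain names are assumptions; the 64 comes from the question.

```php
<?php
// Hypothetical helper: deal items out round-robin to a fixed number of workers.
function partitionWork(array $items, int $workers): array
{
    if ($workers < 1) {
        throw new InvalidArgumentException('need at least one worker');
    }
    $buckets = array_fill(0, $workers, []);
    foreach (array_values($items) as $i => $item) {
        // Round-robin keeps bucket sizes within one item of each other.
        $buckets[$i % $workers][] = $item;
    }
    return $buckets;
}

// 64 candidate names, as in the question; the names themselves are made up.
$domains = [];
for ($i = 1; $i <= 64; $i++) {
    $domains[] = "candidate{$i}.example";
}

// The worker count is fixed by the backend, not by how many requests arrive.
$buckets = partitionWork($domains, 8);

// An uneven case: 10 items over 4 workers.
$uneven = array_map('count', partitionWork(range(1, 10), 4));
```

    Each bucket would then be handed to one long-lived worker, so the number of threads or processes stays constant no matter how many visitors hit the page at once.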
