dougai3418 2018-01-17 23:30
浏览 41

PHP:匹配子网列表中的IP(CIDR)

I have a list of CIDR like this:

192.168.0.1/24
10.0.0.1/32
etc...

The list is growing.
In order to check if an IP fits within one of these CIDR, I perform a loop with following function:

function cidr_match($ip, $range){
    list ($subnet, $bits) = explode('/', $range);
    $ip = ip2long($ip);
    $subnet = ip2long($subnet);
    $mask = -1 << (32 - $bits);
    $subnet &= $mask; // in case the supplied subnet was not correctly aligned
    return ($ip & $mask) == $subnet;
}

Since my CIDR list is growing, I would like to improve that function in order to avoid testing each line of CIDR one by one until it returns true. I would like to get rid of the for loop surrounding my function above.
Is there a way to perform kind of a "pre-check" on the IP I am about to check, so that it does not run the full list sequentially (from top to bottom)?
I want to optimize so that my code would behave this way: give the IP to the function --> the function kinds of "sorts" the list or "finds" the most probable CIDR --> run the check on the IP for the most probable CIDR(s) --> return "true" as soon as possible
A guidance will be appreciated.

  • 写回答

2条回答 默认 最新

  • douluo1330 2018-01-18 00:00
    关注

    Honestly, unless your CIDR range is humongous and you're checking lots of IPs within the same process, you're probably not going to see much in the way of performance gains. However, if that IS the scenario you're looking at, then you could look into trying to squeeze a bit of performance by pre-processing your ranges and IP (perform the ip2long() calls once and store the separated mask/subnet for comparison).

    For example, this is the way you're doing it today, I'm assuming:

    <?php
    // Original style
    $ranges = array(
      "192.168.0.1/32",
      "192.168.0.1/26",
      "192.168.0.1/24",
      "192.168.0.1/16",
      "127.0.0.1/24",
      "10.0.0.1/32",
      "10.0.0.1/24"
    );
    
    
    // Run the check
    $start = microtime(true);
    find_cidr("10.0.0.42", $ranges);
    find_cidr("192.168.0.12", $ranges);
    find_cidr("10.0.0.1", $ranges);
    $end = microtime(true);
    echo "Ran 3 find routines in " . ($end - $start) . " seconds!
    ";
    
    function find_cidr($ip, $ranges)
    {
      foreach($ranges as $range)
      {
        if(cidr_match($ip, $range))
        {
          echo "IP {$ip} found in range {$range}!
    ";
          break;
        }
      }  
    }
    
    function cidr_match($ip, $range){
        list ($subnet, $bits) = explode('/', $range);
        $ip = ip2long($ip);
        $subnet = ip2long($subnet);
        $mask = -1 << (32 - $bits);
        $subnet &= $mask; // in case the supplied subnet was not correctly aligned
        return ($ip & $mask) == $subnet;
    }
    

    On my machine, that runs in about 0.0005 - 0.001 seconds (checking 3 IPs against a small handful of ranges).

    If I write something to pre-process the ranges:

    <?php
    // Slightly-optimized style
    
    $ranges = array(
      "192.168.0.1/32",
      "192.168.0.1/26",
      "192.168.0.1/24",
      "192.168.0.1/16",
      "127.0.0.1/24",
      "10.0.0.1/32",
      "10.0.0.1/24"
    );
    
    $matcher = new BulkCIDRMatch($ranges);
    $start = microtime(true);
    $matcher->FindCIDR("10.0.0.42");
    $matcher->FindCIDR("192.168.0.12");
    $matcher->FindCIDR("10.0.0.1");
    $end = microtime(true);
    echo "Ran 3 find routines in " . ($end - $start) . " seconds!
    ";
    
    
    class BulkCIDRMatch
    {
      private $_preparedRanges = array();
    
      public function __construct($ranges)
      {
        foreach($ranges as $range)
        {
          list ($subnet, $bits) = explode('/', $range);
          $subnet = ip2long($subnet);
          $mask = -1 << (32 - $bits);
          $subnet &= $mask; // in case the supplied subnet was not correctly aligned
    
          $this->_preparedRanges[$range] = array($mask,$subnet);
        }
      }
    
      public function FindCIDR($ip)
      {
        $result = $this->_FindCIDR(ip2long($ip));
        if($result !== null)
        {
          echo "IP {$ip} found in range {$result}!
    ";
        }
        return $result;
      }
    
      private function _FindCIDR($iplong)
      {
        foreach($this->_preparedRanges as $range => $details)
        {
          if(($iplong & $details[0]) == $details[1])
          {
            return $range;
          }
        }
    
        // No match
        return null;
      }
    }
    

    ...then I see faster CHECKING but there's slightly more overhead at the beginning when the class is initialized and it processes and stores all the ranges. So if I timed the OVERALL run with just 3 IPs against a handful of ranges, the "optimized" way is actually a little slower. But if I run 1,000 IPs against 10,000 CIDRs, the "optimized" way will have more noticeable improvements over the original way (at the expense of additional memory usage to store the pre-processed range data).

    So it's all really up to volume and what you're trying to do.

    That said, if you're worried about 0.001 second performance being too slow, then PHP might not be the right language to use for your checks. Or at least you might want to consider writing a custom extension so more of the processing is done in C.

    EDIT: To answer the original question about finding "probable" ranges to check (prior to any sort of conversion from its string form), it's probably not a very reliable thing to try. Ranges can span past their initial octets, so if you start comparing those values (e.g. "I'm looking at 192.168.1.0, so I'm only going to look at ranges starting in 192"), you're not only incurring the performance overhead of the string comparison on each entry (which slows down the overall lookup), but you could potentially miss out on a valid range.

    评论

报告相同问题?

悬赏问题

  • ¥15 基于单片机数字电压表电路组成及框图
  • ¥100 任意维数的K均值聚类
  • ¥15 stamps做sbas-insar,时序沉降图怎么画
  • ¥15 unity第一人称射击小游戏,有demo,在原脚本的基础上进行修改以达到要求
  • ¥15 买了个传感器,根据商家发的代码和步骤使用但是代码报错了不会改,有没有人可以看看
  • ¥15 关于#Java#的问题,如何解决?
  • ¥15 加热介质是液体,换热器壳侧导热系数和总的导热系数怎么算
  • ¥100 嵌入式系统基于PIC16F882和热敏电阻的数字温度计
  • ¥15 cmd cl 0x000007b
  • ¥20 BAPI_PR_CHANGE how to add account assignment information for service line