ds2321 2009-09-28 21:32
浏览 30
已采纳

使用星号PHP脚本匹配电话前缀的最快方法

and thanks in advance for the help.

Background - I am writing a PHP script that needs to figure out the destination the caller is trying to reach. Telephony prefixes are strings that identify a destination. For each call, the program must find the longest prefix that matches the string. For example the number 30561234567 would be matched by 305 but not by 3057 or 304. If 3056 existed it would be the preferred match.

After investigating several data structures, a tree in which each node stores a digit and contains pointers to the other 10 possible choices seems ideal. This could be implemented as an array of arrays, where the script could check 3, find an array there, then check 0 on that new array, find another array and so on until it finds a value instead of an array. This value would be the destination id (the output of the script).

Requirements - Performance is absolutely critical. Time spent checking these prefixes delays the call, and each server has to handle large amounts of calls, so the data structure must be stored in memory. There are approximately 6000 prefixes at the moment.

Problem - The script is run each time the server receives a call, so the data must be held in a cache server of some sort. After checking memcached and APC (Advanced PHP Cache), I decided to use APC because it is [much faster][3] (it is a local memory store)

The problem I have is that the array of arrays can become up to 10 arrays deep, and will be a very complex data structure that, if I add to the cache as an object, will take a long time to de-serialize.

However if I add every single array to the cache separately (with some logical naming structure to be able to find it easily like 3 for array 3, then 30 for array 30, 305 for the array that follows that patch etc...) I will have to fetch different arrays from the cache many times (up to 10 per call), making me hit the cache too often.

Am I going about this the right way? Maybe there is another solution? Or is one of the methods I have proposed superior to the other?

Thank you for you input,

Alex

  • 写回答

6条回答 默认 最新

  • douyun1860 2009-09-28 23:35
    关注

    Here is some sample code for an N-ary tree structure;

    class PrefixCache {
     const EOS = 'eos';
     protected $data;
    
     function __construct() {
      $this->data = array();
      $this->data[self::EOS] = false;
     }
    
     function addPrefix($str) {
      $str = (string) $str;
      $len = strlen($str);
    
      for ($i=0, $t =& $this->data; $i<$len; ++$i) {
       $ch = $str[$i];
    
       if (!isset($t[$ch])) {
        $t[$ch] = array();
        $t[$ch][self::EOS] = false;
       }
    
       $t =& $t[$ch];
      }
    
      $t[self::EOS] = true;
     }
    
     function matchPrefix($str) {
      $str = (string) $str;
      $len = strlen($str);
    
      $so_far = '';
      $best = '';
    
      for ($i=0, $t =& $this->data; $i<$len; ++$i) {
       $ch = $str[$i];
    
       if (!isset($t[$ch]))
        return $best;
       else {
        $so_far .= $ch;
        if ($t[$ch][self::EOS])
         $best = $so_far;
    
        $t =& $t[$ch];     
       }
      }
    
      return false; // string not long enough - potential longer matches remain
     }
    
     function dump() {
      print_r($this->data);
     }
    }
    

    this can then be called as

    $pre = new PrefixCache();
    
    $pre->addPrefix('304');
    $pre->addPrefix('305');
    // $pre->addPrefix('3056');
    $pre->addPrefix('3057');
    
    echo $pre->matchPrefix('30561234567');
    

    which performs as required (returns 305; if 3056 is uncommented, returns 3056 instead).

    Note that I add a terminal-flag to each node; this avoids false partial matches, ie if you add prefix 3056124 it will properly match 3056 instead of returning 305612.

    The best way to avoid reloading each time is to turn it into a service; however, before doing so I would measure run-times with APC. It may well be fast enough as is.

    Alex: your answer is absolutely correct - but not applicable to this question :)

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(5条)

报告相同问题?

悬赏问题

  • ¥100 set_link_state
  • ¥15 虚幻5 UE美术毛发渲染
  • ¥15 CVRP 图论 物流运输优化
  • ¥15 Tableau online 嵌入ppt失败
  • ¥100 支付宝网页转账系统不识别账号
  • ¥15 基于单片机的靶位控制系统
  • ¥15 真我手机蓝牙传输进度消息被关闭了,怎么打开?(关键词-消息通知)
  • ¥15 装 pytorch 的时候出了好多问题,遇到这种情况怎么处理?
  • ¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
  • ¥15 手机接入宽带网线,如何释放宽带全部速度