dougou8639 2014-10-09 13:29
浏览 155
已采纳

Amazon ElastiCache Memcached / Redis:将IP范围映射到国家/地区

I have a MySQL database that has an ip range (start and end, so two columns) and a country code (1 column). The database is used to look up the country based upon an ip address. It works but I want to speed it up more. An idea is to store the data on Amazon ElastiCache using e.g. Redis or Memcache. The problem I have is how would one go with this approach? Redis as well as Memcache uses key - values making it, in my opinion, difficult to store an IP range as well as the country code. What approach would you suggest for using ElastiCache Memcache or Redis?

The country range would be something like:

  • 192.168.1.1 - 192.168.1.100 (Country A)
  • 192.168.2.1 - 192.168.2.50 (Country B)
  • 192.168.1.150 - 192.168.1.200 (Country A)

Now I get the IP address e.g. 192.168.1.160, I need to look this up as fast as possible and return in this case Country A.

Looking forward to your ideas.

Marc

  • 写回答

2条回答 默认 最新

  • drxm5014 2017-05-25 16:58
    关注

    Just saw your question, even it is long ago that you asked, I have a solution proposal using Redis.

    Let's first try modelling the problem with some basic numbers (instead of IPs) and see how it can be solved:

    Range to Country Lookup

       Lookup |   Range    |     Country
      --------|------------+------------------
              |     5      |  begin:Country A
          L1 >>>           |
              |     10     |  end:Country A
              |            |
          L2 >>>           |
              |            |
          L2.1>>>   15     |  begin:Country B
              |            |
              |     20     |  end:Country B
          L3 >>>           |
              |            |
    

    Lookup L1:

    Make a number lookup between [6,10] (here inclusive range). In that case result will be end:Country A => IP Address belongs to Country A. Why we start with 6 will be obvious in L2.

    Lookup L2:

    Lookup a number in range [11, 15] (here inclusive range) . Result will be begin:Country B =>

    • IF Lookup L2.1
      => Looked Up Number points to the range start, i.e. begin:Country B
      => OK: iff IP belongs to begin:Country B value directly

    • ELSE ERROR: IP does not belong to any known range

    Lookup L3:

    Result will be Empty List or Set => ERROR: IP does not belong to any known range

    Insertion is more tricky!

    Inserting ranges must be taken care, because a newly inserted range might break an existing range. Here are the cases of insertion:

       Insert |   Range    |     Country
      --------|------------+------------------
              |     5      |  begin:Country A
              |            |
          I1 >>>    8,9    |  !!! Country C !!!
              |            |
              |     10     |  end:Country A
              |            |
              |            |
          I2 >>>    12,14  |  Country E
              |            |
              |            |
              |     15     |  begin:Country B
              |            |
          I3 >>>    17,21  |  !!! Country D !!!
              |            |
              |     20     |  end:Country B
              |            |
          I4 >>>    22,27  |  Country F
              |            |
    

    Insertion I1:

    Renders Addresses with IPs 6 and 7 (between 5 and 8) to be invalid. => Effectively Country A range shrinks to single IP Address 10.

    Insertion I2:

    OK, no range intersections

    Insertion I3:

    Renders begin of Country B invalid + renders begin of Country D (17..20) invalid

    Insertion I4:

    OK

    Note: Probably you will need to introduce range splitting logic in some of the cases.

    Redis based solution

    I'd propose to use Redis ZSET for that purpose. Here are the observations:

    1. Every IPv4 address can be represented as a 32bit integer besides the dotted decimal string representation.

    2. Redis ZSET guarantees uniqueness of the stored members by additionally ordering them with scores

    3. We can search for ZSET members by using a score range, i.e. ZRANGEBYSCORE command.

    If we use numeric IP as a ZSET score, we are done. Uniqueness of a country is enforced by prepending begin: and end: prefixes for particular range. In real life one country might have multiple IP address ranges, thus you will probably end up encoding the range number into the Country name, like begin:r1:Country A and end:r1:Country A. You could normalize this and introduce indirection. But to keep number of lookups low you'd want to denormalize it and have as much info as possible within single DB access. This is because introducing a new range is much less frequent than doing a lookup so increasing number of lookups will decrease performance.

       Lookup |   Score    |     Country
      --------|------------+------------------
              |     5      |  begin:Country A
          L1 >>>           |
              |     10     |  end:Country A
              |            |
          L2 >>>           |
              |            |
          L2.1>>>   15     |  begin:Country B
              |            |
              |     20     |  end:Country B
          L3 >>>           |
              |            |
    

    What Redis commands to use

    Here are just plain commands without your logic to check for error cases during insertions etc.

    • Adding a new range

      > ZADD ip-to-country 3232235777 "begin:Country A" 3232235876 "end:Country A"
      

      Note: 3232235777 is IPv4 192.168.1.1 represented as unsigned int, same applies to 192.168.1.100.

    • Checking to which range a particular IP belongs

      > ZRANGEBYSCORE ip-to-country 3232235778 +inf WITHSCORES LIMIT 0 1
      

      Note: 3232235778 is IPv4 192.168.1.2 represented as unsigned int and we make a lookup of one element (i.e. LIMIT 0 1) from 192.168.1.8 onwards (i.e. +inf).

    • Checking for Lookup 2.1 looked up IP starts the new range

       > ZSCORE ip-to-country "begin:Country A"
      

      Note: result will be 3232235777

    Complexity Analysis

    Space Complexity: If in worst case we end up with every single IP representing start and end of a Range, we will need O(2*N) space, where N is 2^32. But in real life this number will be much smaller. In some algorithm books you will see that 2^32 is considered a constant factor and thus will be reduced to O(1).

    Runtime Complexity: Redis states that ZRANGEBYSCORE is a O(log(N)+M) operation, where M is the number of elements in LIMIT, i.e. here only 1. If we have at most 2*2^32 scores in worst case than log2(8billion) is around 33 comparisons within Redis set implementation. But in reality I think there would not be more than 2 or 3 thousand ranges, which is around 11 comparisons. Redis states for a KEYS command:

    Redis running on an entry level laptop can scan a 1 million key database in 40 milliseconds.

    All in all your lookup will be blazing fast!

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥20 机器学习能否像多层线性模型一样处理嵌套数据
  • ¥20 西门子S7-Graph,S7-300,梯形图
  • ¥50 用易语言http 访问不了网页
  • ¥50 safari浏览器fetch提交数据后数据丢失问题
  • ¥15 matlab不知道怎么改,求解答!!
  • ¥15 永磁直线电机的电流环pi调不出来
  • ¥15 用stata实现聚类的代码
  • ¥15 请问paddlehub能支持移动端开发吗?在Android studio上该如何部署?
  • ¥20 docker里部署springboot项目,访问不到扬声器
  • ¥15 netty整合springboot之后自动重连失效