dougou8639 2014-10-09 13:29
浏览 155
已采纳

Amazon ElastiCache Memcached / Redis:将IP范围映射到国家/地区

I have a MySQL database that has an ip range (start and end, so two columns) and a country code (1 column). The database is used to look up the country based upon an ip address. It works but I want to speed it up more. An idea is to store the data on Amazon ElastiCache using e.g. Redis or Memcache. The problem I have is how would one go with this approach? Redis as well as Memcache uses key - values making it, in my opinion, difficult to store an IP range as well as the country code. What approach would you suggest for using ElastiCache Memcache or Redis?

The country range would be something like:

  • 192.168.1.1 - 192.168.1.100 (Country A)
  • 192.168.2.1 - 192.168.2.50 (Country B)
  • 192.168.1.150 - 192.168.1.200 (Country A)

Now I get the IP address e.g. 192.168.1.160, I need to look this up as fast as possible and return in this case Country A.

Looking forward to your ideas.

Marc

  • 写回答

2条回答 默认 最新

  • drxm5014 2017-05-25 16:58
    关注

    Just saw your question, even it is long ago that you asked, I have a solution proposal using Redis.

    Let's first try modelling the problem with some basic numbers (instead of IPs) and see how it can be solved:

    Range to Country Lookup

       Lookup |   Range    |     Country
      --------|------------+------------------
              |     5      |  begin:Country A
          L1 >>>           |
              |     10     |  end:Country A
              |            |
          L2 >>>           |
              |            |
          L2.1>>>   15     |  begin:Country B
              |            |
              |     20     |  end:Country B
          L3 >>>           |
              |            |
    

    Lookup L1:

    Make a number lookup between [6,10] (here inclusive range). In that case result will be end:Country A => IP Address belongs to Country A. Why we start with 6 will be obvious in L2.

    Lookup L2:

    Lookup a number in range [11, 15] (here inclusive range) . Result will be begin:Country B =>

    • IF Lookup L2.1
      => Looked Up Number points to the range start, i.e. begin:Country B
      => OK: iff IP belongs to begin:Country B value directly

    • ELSE ERROR: IP does not belong to any known range

    Lookup L3:

    Result will be Empty List or Set => ERROR: IP does not belong to any known range

    Insertion is more tricky!

    Inserting ranges must be taken care, because a newly inserted range might break an existing range. Here are the cases of insertion:

       Insert |   Range    |     Country
      --------|------------+------------------
              |     5      |  begin:Country A
              |            |
          I1 >>>    8,9    |  !!! Country C !!!
              |            |
              |     10     |  end:Country A
              |            |
              |            |
          I2 >>>    12,14  |  Country E
              |            |
              |            |
              |     15     |  begin:Country B
              |            |
          I3 >>>    17,21  |  !!! Country D !!!
              |            |
              |     20     |  end:Country B
              |            |
          I4 >>>    22,27  |  Country F
              |            |
    

    Insertion I1:

    Renders Addresses with IPs 6 and 7 (between 5 and 8) to be invalid. => Effectively Country A range shrinks to single IP Address 10.

    Insertion I2:

    OK, no range intersections

    Insertion I3:

    Renders begin of Country B invalid + renders begin of Country D (17..20) invalid

    Insertion I4:

    OK

    Note: Probably you will need to introduce range splitting logic in some of the cases.

    Redis based solution

    I'd propose to use Redis ZSET for that purpose. Here are the observations:

    1. Every IPv4 address can be represented as a 32bit integer besides the dotted decimal string representation.

    2. Redis ZSET guarantees uniqueness of the stored members by additionally ordering them with scores

    3. We can search for ZSET members by using a score range, i.e. ZRANGEBYSCORE command.

    If we use numeric IP as a ZSET score, we are done. Uniqueness of a country is enforced by prepending begin: and end: prefixes for particular range. In real life one country might have multiple IP address ranges, thus you will probably end up encoding the range number into the Country name, like begin:r1:Country A and end:r1:Country A. You could normalize this and introduce indirection. But to keep number of lookups low you'd want to denormalize it and have as much info as possible within single DB access. This is because introducing a new range is much less frequent than doing a lookup so increasing number of lookups will decrease performance.

       Lookup |   Score    |     Country
      --------|------------+------------------
              |     5      |  begin:Country A
          L1 >>>           |
              |     10     |  end:Country A
              |            |
          L2 >>>           |
              |            |
          L2.1>>>   15     |  begin:Country B
              |            |
              |     20     |  end:Country B
          L3 >>>           |
              |            |
    

    What Redis commands to use

    Here are just plain commands without your logic to check for error cases during insertions etc.

    • Adding a new range

      > ZADD ip-to-country 3232235777 "begin:Country A" 3232235876 "end:Country A"
      

      Note: 3232235777 is IPv4 192.168.1.1 represented as unsigned int, same applies to 192.168.1.100.

    • Checking to which range a particular IP belongs

      > ZRANGEBYSCORE ip-to-country 3232235778 +inf WITHSCORES LIMIT 0 1
      

      Note: 3232235778 is IPv4 192.168.1.2 represented as unsigned int and we make a lookup of one element (i.e. LIMIT 0 1) from 192.168.1.8 onwards (i.e. +inf).

    • Checking for Lookup 2.1 looked up IP starts the new range

       > ZSCORE ip-to-country "begin:Country A"
      

      Note: result will be 3232235777

    Complexity Analysis

    Space Complexity: If in worst case we end up with every single IP representing start and end of a Range, we will need O(2*N) space, where N is 2^32. But in real life this number will be much smaller. In some algorithm books you will see that 2^32 is considered a constant factor and thus will be reduced to O(1).

    Runtime Complexity: Redis states that ZRANGEBYSCORE is a O(log(N)+M) operation, where M is the number of elements in LIMIT, i.e. here only 1. If we have at most 2*2^32 scores in worst case than log2(8billion) is around 33 comparisons within Redis set implementation. But in reality I think there would not be more than 2 or 3 thousand ranges, which is around 11 comparisons. Redis states for a KEYS command:

    Redis running on an entry level laptop can scan a 1 million key database in 40 milliseconds.

    All in all your lookup will be blazing fast!

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥50 导入文件到网吧的电脑并且在重启之后不会被恢复
  • ¥15 (希望可以解决问题)ma和mb文件无法正常打开,打开后是空白,但是有正常内存占用,但可以在打开Maya应用程序后打开场景ma和mb格式。
  • ¥20 ML307A在使用AT命令连接EMQX平台的MQTT时被拒绝
  • ¥20 腾讯企业邮箱邮件可以恢复么
  • ¥15 有人知道怎么将自己的迁移策略布到edgecloudsim上使用吗?
  • ¥15 错误 LNK2001 无法解析的外部符号
  • ¥50 安装pyaudiokits失败
  • ¥15 计组这些题应该咋做呀
  • ¥60 更换迈创SOL6M4AE卡的时候,驱动要重新装才能使用,怎么解决?
  • ¥15 让node服务器有自动加载文件的功能