donkey111111 2014-05-07 21:46
浏览 209
已采纳

快速搜索加密数据?

I've got a requirement to encrypt Personally identifiable information (PII) data in an application DB. The application uses smart searches in the system that use sound like, name roots and part words searches to find name and address quickly.

If we put in encryption on those fields (the PII data encrypted at the application tier), the searches will be impacted by the volume of records because we cant rely on SQL in the normal way and the search engine (in the application) would switch to reading all values, decrypt them and do the searches.

Is there any easy way of solving this so we can always encrypt the PII data and also give our user base the fast search functionality?

We are using a PHP Web/App Tier (Zend Server and a SQL Server DB). The application does not currently use technology like Lucene etc.

Thanks

Cheers

  • 写回答

3条回答 默认 最新

  • dongmale0656 2014-05-07 22:02
    关注

    Encrypting the data also makes it look a great deal like randomized bit strings. This precludes any operations the shortcut searching via an index.

    For some encrypted data, e.g. Social security number, you can store a hash of the number in a separate column, then index this hash field and search for the hash. This has limited utility obviously, and is of no value in searches name like 'ROB%'

    If your database is secured properly may sound nice, but it is very difficult to achieve if the bad guys can break in and steal your servers or backups. And if it is truly as requirement (not just a negotiable marketing driven item), you are forced to comply.

    You may be able to negotiate storing partial data in unencrypted, e.g., first 3 character of last name or such like so that you can still have useful (if not perfect) indexing.

    ADDED

    I should have added that you might be allowed to hash part of a name field, and search on that hash -- assuming you are not allowed to store partial name unencrypted -- you lose usefulness again, but it may still be better than no index at all.

    For this hashing to be useful, it cannot be seeded -- i.e., all records must hash based on the same seed (or no seed), or you will be stuck performing a table scan.

    You could also create a covering index, still encrypted of course, but a table scan could be considerable quicker due to the reduced I/O & memory required.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(2条)

报告相同问题?

悬赏问题

  • ¥15 树莓派与pix飞控通信
  • ¥15 自动转发微信群信息到另外一个微信群
  • ¥15 outlook无法配置成功
  • ¥30 这是哪个作者做的宝宝起名网站
  • ¥60 版本过低apk如何修改可以兼容新的安卓系统
  • ¥25 由IPR导致的DRIVER_POWER_STATE_FAILURE蓝屏
  • ¥50 有数据,怎么建立模型求影响全要素生产率的因素
  • ¥50 有数据,怎么用matlab求全要素生产率
  • ¥15 TI的insta-spin例程
  • ¥15 完成下列问题完成下列问题