donkey111111 2014-05-07 21:46
浏览 209
已采纳

快速搜索加密数据?

I've got a requirement to encrypt Personally identifiable information (PII) data in an application DB. The application uses smart searches in the system that use sound like, name roots and part words searches to find name and address quickly.

If we put in encryption on those fields (the PII data encrypted at the application tier), the searches will be impacted by the volume of records because we cant rely on SQL in the normal way and the search engine (in the application) would switch to reading all values, decrypt them and do the searches.

Is there any easy way of solving this so we can always encrypt the PII data and also give our user base the fast search functionality?

We are using a PHP Web/App Tier (Zend Server and a SQL Server DB). The application does not currently use technology like Lucene etc.

Thanks

Cheers

  • 写回答

3条回答 默认 最新

  • dongmale0656 2014-05-07 22:02
    关注

    Encrypting the data also makes it look a great deal like randomized bit strings. This precludes any operations the shortcut searching via an index.

    For some encrypted data, e.g. Social security number, you can store a hash of the number in a separate column, then index this hash field and search for the hash. This has limited utility obviously, and is of no value in searches name like 'ROB%'

    If your database is secured properly may sound nice, but it is very difficult to achieve if the bad guys can break in and steal your servers or backups. And if it is truly as requirement (not just a negotiable marketing driven item), you are forced to comply.

    You may be able to negotiate storing partial data in unencrypted, e.g., first 3 character of last name or such like so that you can still have useful (if not perfect) indexing.

    ADDED

    I should have added that you might be allowed to hash part of a name field, and search on that hash -- assuming you are not allowed to store partial name unencrypted -- you lose usefulness again, but it may still be better than no index at all.

    For this hashing to be useful, it cannot be seeded -- i.e., all records must hash based on the same seed (or no seed), or you will be stuck performing a table scan.

    You could also create a covering index, still encrypted of course, but a table scan could be considerable quicker due to the reduced I/O & memory required.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(2条)

报告相同问题?

悬赏问题

  • ¥15 关于#python#的问题:求帮写python代码
  • ¥15 LiBeAs的带隙等于0.997eV,计算阴离子的N和P
  • ¥15 关于#windows#的问题:怎么用WIN 11系统的电脑 克隆WIN NT3.51-4.0系统的硬盘
  • ¥15 来真人,不要ai!matlab有关常微分方程的问题求解决,
  • ¥15 perl MISA分析p3_in脚本出错
  • ¥15 k8s部署jupyterlab,jupyterlab保存不了文件
  • ¥15 ubuntu虚拟机打包apk错误
  • ¥199 rust编程架构设计的方案 有偿
  • ¥15 回答4f系统的像差计算
  • ¥15 java如何提取出pdf里的文字?