douhuo3696 2013-01-25 12:46
浏览 30
已采纳

超快速部分文本匹配的数据库/语言选项[关闭]

I am building a project and require a super fast way of supplying an autocomplete feed with results based on a partial text match.

I will be indexing/searching on only one field in a database, though the database row will have additional data I won't be indexing those fields. I will have approx. 25k rows.

Requirements:

  • Must match anywhere in the field (Lorem Ipsum Dolor Sit Amet would be found when starting to type "Lor", "Ipsum", "olor", "Sit Amet")
  • Needs to be extremely quick at returning results in a JSON feed (though the original source of the data doesn't matter too much)
  • Scalable solution for high traffic

I have reviewed a few options...

  • Using MongoDB like such like query in mongoDB
  • ElasticSearch - not sure if a bit overkill for what I need to do, and haven't seen any exaples of matching the partial text as above
  • SQL LIKE query, but imagine this won't be nearly fast enough?

Programming language isn't too much of an issue but Python or PHP would be preferred.

  • 写回答

2条回答 默认 最新

  • duanchuang1935 2013-01-25 13:00
    关注

    As others have mentioned, a full-text index that performs linguistic and syntactic analysis (tokenizing, stemming, case and accent-normalization, etc) will give you the best results. But this won't come without a certain amount of setup and configuration.

    Check out Solr's Suggester component: http://wiki.apache.org/solr/Suggester, and there is a new one - I think it's called AnalyzingSuggester or some such, which is available with Lucene only, I think, so if you want an in-memory solution you could use that (Java only though).

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 用comsol模拟大气湍流通过底部加热(温度不同)的腔体
  • ¥50 安卓adb backup备份子用户应用数据失败
  • ¥20 有人能用聚类分析帮我分析一下文本内容嘛
  • ¥15 请问Lammps做复合材料拉伸模拟,应力应变曲线问题
  • ¥30 python代码,帮调试
  • ¥15 #MATLAB仿真#车辆换道路径规划
  • ¥15 java 操作 elasticsearch 8.1 实现 索引的重建
  • ¥15 数据可视化Python
  • ¥15 要给毕业设计添加扫码登录的功能!!有偿
  • ¥15 kafka 分区副本增加会导致消息丢失或者不可用吗?