Database/language options for super-fast partial text matching [closed]

I am building a project and require a super fast way of supplying an autocomplete feed with results based on a partial text match.

I will be indexing/searching on only one field in the database. Each row will contain additional data, but I won't be indexing those fields. I will have approx. 25k rows.

Requirements:

  • Must match anywhere in the field (Lorem Ipsum Dolor Sit Amet would be found when starting to type "Lor", "Ipsum", "olor", "Sit Amet")
  • Needs to be extremely quick at returning results in a JSON feed (though the original source of the data doesn't matter too much)
  • Scalable solution for high traffic

I have reviewed a few options...

  • Using MongoDB with a LIKE-style query, as in "like query in mongoDB"
  • ElasticSearch - not sure if it's a bit overkill for what I need to do, and I haven't seen any examples of matching partial text as above
  • SQL LIKE query, but I imagine this won't be nearly fast enough?

Programming language isn't too much of an issue but Python or PHP would be preferred.

douniao8687: Have you checked out Solr (lucene.apache.org/solr)? I don't think LIKE is the right tool for the job. MySQL supports FULLTEXT indexes.
(more than 7 years ago)

2 answers



As others have mentioned, a full-text index that performs linguistic and syntactic analysis (tokenizing, stemming, case and accent normalization, etc.) will give you the best results. But this won't come without a certain amount of setup and configuration.

Check out Solr's Suggester component: http://wiki.apache.org/solr/Suggester. There is also a newer one (I think it's called AnalyzingSuggester or some such) which is available with Lucene only, I think, so if you want an in-memory solution you could use that (Java only, though).



This sounds like a typical full-text search problem. Depending on your application and the database the data is in, an in-process Whoosh index might do what you need (Whoosh is to Python roughly what Lucene is to Java).

You're right to say that an SQL LIKE query is going to perform horribly compared to an actual full-text index. MongoDB might not be a very good fit either, though it can be tuned to do roughly what you suggest.
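For reference, this is roughly what the LIKE approach looks like, sketched with Python's built-in `sqlite3` module. The table name `items`, the column `title`, and the helper names are assumptions made for the example, not anything from the question. The leading `%` wildcard is the reason an ordinary index cannot narrow the search, and the query plan reports a scan.

```python
import sqlite3


def build_db(titles):
    """Create an in-memory table holding the single searchable field."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO items (title) VALUES (?)", [(t,) for t in titles]
    )
    conn.commit()
    return conn


def escape_like(term):
    """Escape LIKE wildcards so user input is matched literally."""
    return (term.replace("\\", "\\\\")
                .replace("%", "\\%")
                .replace("_", "\\_"))


def like_search(conn, term, limit=10):
    """Match the term anywhere in the title (case-insensitive for ASCII)."""
    pattern = "%" + escape_like(term) + "%"
    rows = conn.execute(
        "SELECT title FROM items WHERE title LIKE ? ESCAPE '\\' "
        "ORDER BY title LIMIT ?",
        (pattern, limit),
    ).fetchall()
    return [row[0] for row in rows]


def query_plan(conn):
    """Return SQLite's plan description for a leading-wildcard LIKE query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT title FROM items WHERE title LIKE '%olor%'"
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)
```

At 25k rows a scan like this may still be acceptable for light traffic; the concern in the answer is how it behaves per keystroke under high load, which is where a dedicated index pays off.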
