douou5933 2014-02-01 11:58

Sphinx exact/partial matching and ranking

I'm trying to achieve two goals with a single Sphinx request: get results that match any word from the query, and have an exact match ranked in first place. For example, suppose my song search database contains:

  1. Miley Cyrus Ball
  2. Miley Cyrus Wrecking
  3. Miley Cyrus

And two test queries:

  1. Miley Cyrus
  2. Miley Cyrus Wrecking Ball

If I search for "Miley Cyrus" I want to get row #3, and if I search for "Miley Cyrus Wrecking Ball" I want to get #1 or #2. I tried different combinations of matching and ranking modes but still can't get this working with a single request. With SPH_MATCH_EXTENDED2 and SPH_RANK_SPH04, my first test query works fine and returns result #3 in first place, but the second test query returns no results. With SPH_MATCH_ANY I get partially matched results for the second test query (#2 has a slightly higher weight, which seems correct), but the first query returns 3 rows with the same weight, and #1 is on top because of the order in the DB. The only workaround I have for now is making two queries: one for an exact match, and another for a partial match if the first one returns nothing.

Also, from this article I see that all match modes except SPH_MATCH_EXTENDED2 are legacy, so what should I use for partial matching like in the example above once they are removed?


1 answer

  • drgc9632 2014-02-02 18:40

    tl;dr There is only one matching mode: Extended. Don't use any others. If you want to change which documents are included, modify the query itself (e.g. with the quorum operator). You can then pick how documents are ordered using the ranking mode.


    The first thing to realise is that matching and ranking are two distinct topics.

    • Matching decides which documents are present in the results at all, i.e. comparing the query and saying yes/no to the question "does this document match the query?"

    • Ranking is computing a weight, so the best matches can rise to the top by sorting by weight.

    Historically, matching and ranking were combined into one concept: you chose the matching mode (which determined how the query was interpreted), and a suitable ranking calculation was selected automatically.

    This turned out not to be flexible enough, so the two were separated. But lots of people used the old behaviour, so the old matching modes (any/phrase etc.) were maintained for compatibility reasons.

    Internally there is only ONE matching mode: Extended. The older legacy matching modes automatically rewrite the query as needed (changing it to extended query syntax) and pick a particular ranking mode.

    So by keeping extended matching mode, you get to choose the ranking mode yourself. You can then change the matching (by modifying the query) and the ranking behaviour independently.
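To make the "legacy mode = rewritten query + preselected ranker" idea concrete, here is a rough Python sketch. The function name and the mode strings are made up for illustration. The ranker pairings follow what the Sphinx documentation lists for the legacy modes, while the rewritten queries are only an approximation of the idea, not the engine's actual internal rewrite.

```python
def legacy_to_extended(mode, words):
    """Return (extended_query, ranker_name) roughly equivalent to a legacy mode.

    Hypothetical helper for illustration only; not part of any Sphinx API.
    """
    joined = " ".join(words)
    if mode == "all":
        # Space-separated words are an implicit AND in extended syntax.
        return joined, "SPH_RANK_PROXIMITY"
    if mode == "any":
        # OR the words together so that any single word is enough to match.
        return " | ".join(words), "SPH_RANK_MATCHANY"
    if mode == "phrase":
        # Double quotes require the words to appear as an exact phrase.
        return '"{}"'.format(joined), "SPH_RANK_PROXIMITY"
    raise ValueError("unknown legacy mode: {!r}".format(mode))
```

Seen this way, the query string and the ranker name are two separate knobs, and nothing stops you from pairing the "any" style query with a different ranker.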


    I explained all the backstory to show you that if the provided matching modes aren't good enough, you can do the same thing yourself, i.e.

    • You need to choose a particular ranking mode (or even a completely custom one via the ranking expression)

    • AND you may well need to modify the query itself to change the matching behaviour. (Remember, choosing MATCH_ANY changes the query AND selects a ranking mode.)

    So you could rewrite the query to use quorum, e.g.

    "Miley Cyrus Wrecking Ball"/2
    

    Remember to keep Extended match mode. You can then choose a ranking mode independently (setRankingMode), e.g. you can now use SPH_RANK_SPH04, but you still get 'fuzzy' matching behaviour (like you would with match any).

    ... don't forget to try other ranking modes too.
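If the search string comes from user input, the quorum rewrite can be done in application code before the query is sent. The following is a minimal Python sketch under some assumptions: `quorum_query` and its `min_words` parameter are hypothetical names, and instead of escaping special characters it simply drops everything that is not a word character, which is a deliberate simplification.

```python
import re


def quorum_query(user_input, min_words=2):
    """Build an extended-syntax quorum query such as "a b c"/2.

    Hypothetical helper for illustration only; not part of any Sphinx API.
    """
    # Keep only word characters so that operators typed by the user
    # (quotes, slashes, pipes, @ ...) cannot change the query syntax.
    words = re.findall(r"\w+", user_input)
    if not words:
        return ""
    if len(words) == 1:
        # A single keyword needs no quorum operator.
        return words[0]
    # Never ask for more matching words than the query contains.
    threshold = min(min_words, len(words))
    return '"{}"/{}'.format(" ".join(words), threshold)
```

The returned string is what you would pass as the query text, with the match mode left at Extended and the ranker picked separately (for example SPH_RANK_SPH04 via setRankingMode, as above). With the default threshold, "Miley Cyrus Wrecking Ball" becomes `"Miley Cyrus Wrecking Ball"/2`, which all three rows from the question can satisfy, and ordering is then left to the ranker.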

    This answer was selected by the asker as the best answer.
