dongmi1872 2012-01-16 04:25

Sphinx returns inconsistent result sets depending on sort order

I'm trying to implement multilingual indexes for a web application I'm developing. At the moment, records exist in a few languages (English, Malay, and Arabic), but they are not separated into different columns. Only the English stemmer is currently enabled.

Only two indexes are built: a stemmed one and a non-stemmed one. The problem is with the stemmed index: the result set it returns is not consistent and depends on the sort column.

These two queries against the stemmed index each return a different total number of results, although they differ only in sort order.

SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord ASC;

SELECT * FROM test1stemmed WHERE MATCH('@institution universiti') GROUP BY art_id ORDER BY art_title_ord DESC;

However, if the same queries are run on the non-stemmed index, the result counts are equal.

I'm also having the same problem with the Sphinx PHP API:

$sp = new SphinxClient();
$sp->SetServer('localhost', 9312);
$sp->SetMatchMode(SPH_MATCH_EXTENDED);
$sp->SetGroupBy('art_id', SPH_GROUPBY_ATTR, "$sp_sort_column $sort");
$sp->SetLimits($offset, $rows_per_page, 1000);
$sp->Query("$q", 'test1stemmed');

What am I missing?

1 answer

  • douxiu6835 2012-01-17 08:03

    This is something I missed in the documentation: http://sphinxsearch.com/docs/2.0.2/clustering.html

    WARNING: grouping is done in fixed memory and thus its results are only approximate; so there might be more groups reported in total_found than actually present. @count might also be underestimated. To reduce inaccuracy, one should raise max_matches. If max_matches allows to store all found groups, results will be 100% correct.

    So I can work around this by increasing max_matches, but since setting a very large value is undesirable, I would rather fix the query instead.
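
    For reference, the max_matches workaround might look like the sketch below with the PHP API from the question. The query string, sort column, paging values, and the figure of 10000 are placeholders chosen for illustration, not values from my setup, and it assumes the sphinxapi.php client that ships with Sphinx. As far as I understand, searchd will not accept a per-query limit higher than its own max_matches setting in sphinx.conf, so that would need to be raised as well.

    ```php
    <?php
    // Assumes the sphinxapi.php client bundled with Sphinx is on the include path.
    require_once 'sphinxapi.php';

    // Hypothetical values for illustration; substitute the application's own.
    $q              = '@institution universiti';
    $sp_sort_column = 'art_title_ord';
    $sort           = 'ASC';
    $offset         = 0;
    $rows_per_page  = 20;
    $max_matches    = 10000; // assumed large enough to hold every group (was 1000)

    $sp = new SphinxClient();
    $sp->SetServer('localhost', 9312);
    $sp->SetMatchMode(SPH_MATCH_EXTENDED);
    $sp->SetGroupBy('art_id', SPH_GROUPBY_ATTR, "$sp_sort_column $sort");

    // The third argument is the per-query max_matches. The server-side
    // max_matches in sphinx.conf must be at least this large.
    $sp->SetLimits($offset, $rows_per_page, $max_matches);

    $result = $sp->Query($q, 'test1stemmed');
    if ($result === false) {
        die('Query failed: ' . $sp->GetLastError());
    }

    // With enough room for all groups, this count should no longer
    // change when $sort is flipped between ASC and DESC.
    echo "total_found: {$result['total_found']}\n";
    ```

    Running it once with ASC and once with DESC and comparing total_found is a quick way to check whether the chosen limit is high enough.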

    (Selected by the asker as the best answer.)
