douxi4114 2016-05-05 01:08 采纳率: 100%
浏览 40
已采纳

搜索结果的多样性

First of all: Sorry for the long post, I am trying to explain a hard situation in an easy way and, at the same time, trying to give as much information as I can.

I have an algorithm that tries to determine user expectation during a search. There are a couple of way I can use it and I have the same problem with both of them, so, lets say I use it for disambiguation. Well, with a db structure like this one (or any other that allows the work):

post

ID | TITLE
---+----------------------------------------------
1  | Orange developed the first 7G phone
2  | Orange: the fruit of gods
3  | Theory of Colors: Orange
4  | How to prepare the perfect orange juice

keywords

ID | WORD     | ABOUT   
---+----------+---------
1  | orange   | company 
2  | orange   | fruit   
3  | orange   | color   

post_keywords

ID | POST  | KEYWORD
---+-------+---------
1  |   1   |   1 
2  |   2   |   2
3  |   3   |   3
4  |   4   |   2

.

If in a search box, an user make a search for the word "orange", the algorithm would look that orange may refers to the company, the color, or the fruit and, by answering a couple of questions, it tries to determine which the user is looking for. After all that I get an array like this one:

$e = array(
    'fruit' => 0.153257,
    'color' => 0.182332,
    'company' => 0.428191,
);

In this point I know the user is probably looking for information about the fruit (because fruit's value is closer to 0) and if I am wrong my second bet goes for the color. At the bottom of the list, the company.

So, with a Join and ORDER BY FIELD(keywords.id, 2,3,1) I can give the results the (almost) perfect order:

- Orange: the fruit of gods
- How to prepare the perfect orange juice
- Theory of Colors: Orange
- Orange developed the first 7G phone

.

Well... as you can imagine, I wouldn't come for help if everything is so nice. So, the problem is that in is the previous example we have only 4 possible results, so, if the user really was looking for the company he can find this result in the 4th position and everything is okay. But... If we have 200 post about the fruit and 100 post about the color, the first post about the company come in the position 301st.

I am looking for a way to alternate the order (in a predictable and repeatable way) now that I know the user is must likely looking for the fruit, followed by the color and the company at the end. I want to be able to show a post about the fruit in the first position (and possibly the second), followed by a post about the color, followed by the company and start this cycle again until the results ends.

Edit: I'll be happy with a MySQL trick or with an idea to change the approach, but I can't accept third-party solutions.

  • 写回答

2条回答 默认 最新

  • dongyou6768 2016-05-05 08:10
    关注

    You can use variables to provide custom sort field.

    SELECT
      p.*,
      CASE k.about
        WHEN 'company' THEN @sort_company := @sort_company + 1
        WHEN 'color' THEN @sort_color := @sort_color + 1
        WHEN 'fruit' THEN @sort_fruit := @sort_fruit + 1
        ELSE NULL
      END AS sort_order,
    k.about
    FROM post p
      JOIN post_keywords pk ON (p.id = pk.post)
      JOIN keywords k ON (pk.keyword = k.id)
      JOIN (SELECT @sort_fruit := 0, @sort_color := 0, @sort_company := 0) AS vars
    ORDER BY sort_order, FIELD(k.id, 2, 3, 1)
    

    Result will look like this:

    | id | title                                   | sort_order | about   |
    |---:|:----------------------------------------|-----------:|:--------|
    |  2 | Orange: the fruit of gods               |          1 | fruit   |
    |  3 | Theory of Colors: Orange                |          1 | color   |
    |  1 | Orange developed the first 7G phone     |          1 | company |
    |  4 | How to prepare the perfect orange juice |          2 | fruit   |
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 R语言Rstudio突然无法启动
  • ¥15 关于#matlab#的问题:提取2个图像的变量作为另外一个图像像元的移动量,计算新的位置创建新的图像并提取第二个图像的变量到新的图像
  • ¥15 改算法,照着压缩包里边,参考其他代码封装的格式 写到main函数里
  • ¥15 用windows做服务的同志有吗
  • ¥60 求一个简单的网页(标签-安全|关键词-上传)
  • ¥35 lstm时间序列共享单车预测,loss值优化,参数优化算法
  • ¥15 Python中的request,如何使用ssr节点,通过代理requests网页。本人在泰国,需要用大陆ip才能玩网页游戏,合法合规。
  • ¥100 为什么这个恒流源电路不能恒流?
  • ¥15 有偿求跨组件数据流路径图
  • ¥15 写一个方法checkPerson,入参实体类Person,出参布尔值