donglei3370 2013-06-20 20:21
浏览 459
已采纳

MySQL搜索多个关键字并按最佳匹配顺序排列

I currently have a tag system for my blog. Each blog is inserted in the blog table, and the tags are inserted in the tag table. The tag table has a column blog_id to link each tag to a blog item.

So let's say we have:

Blog table:
id - name 
20 - a nice blog post about product x

Tag table:
id - blog_id - tag
12 - 20 - nice
13 - 20 - product x

I have a search function that searches through the tags based on a search string and that works fine.

But I would like to expand the query to search for multiple tags, and order it by the best match. Searching for multiple tags will not be a problem, because I could just explode the search string and loop through it, but ordering it by the best match is something I can not figure out.

So let's say I have 3 blog posts, with each the following tags:
1. sunny, in-ear, earphones, review
2. pear, out-ear, earphones, review
3. pear, in-ear, earphones, review

And a user searches for "pear in-ear earphones", I would like the order of the result to be:
3. (because 3 tags match)
1. (because 2 tags match)
2. (because 1 tags match)

This is what the query looks like:

SELECT `b`.* 
FROM (`blog` b) 
WHERE (
    b.name LIKE '%pear in-ear earphones%' 
    OR 
    b.id IN ( 
        SELECT bt.blog_id 
        FROM blog_tags bt 
        WHERE bt.tag LIKE '%pear in-ear earphones%' 
        AND bt.blog_id = b.id 
        ) 
    ) 
ORDER BY `b`.`date` desc


Who could help me out?
I've looked at "Full Text Search" but that is not an option, because my table is InnoDB.

Thanks in advance!

  • 写回答

2条回答 默认 最新

  • douyue2313 2013-06-20 20:32
    关注

    I personally love using Solr for text matching. You can create these complex formulas that will weight matches in the name higher than matches in the tags or vice versa. It also matches pluralizations too. So if I search butterflies it will find the butterfly matches.

    Here's another query that may help you with order blog tags by frequency. This query will get all of the blog items with at least one match in the tags. It will order by the number of tags that matches

    SELECT *
    FROM blog b
    JOIN (
        SELECT blog_id, COUNT(*) as matches 
        FROM tags 
        WHERE tag in ('pear', 'in-ear', 'earphones') 
        GROUP BY blog_id
    ) t
    ON t.blog_id = b.blog_id
    ORDER BY matches desc
    

    You can add the number of matches for a particular string like this:

    SELECT *,
        t.matches + 
        COALESCE((LENGTH(b.`title`)-LENGTH(REPLACE(b.`title`,'pear','')))/LENGTH('pear'),0) + 
        COALESCE((LENGTH(b.`title`)-LENGTH(REPLACE(b.`title`,'in-ear','')))/LENGTH('in-ear'),0) + 
        COALESCE((LENGTH(b.`title`)-LENGTH(REPLACE(b.`title`,'earphones','')))/LENGTH('earphones'),0) AS total_matches,
    FROM blog b
    LEFT JOIN (
        SELECT blog_id, COUNT(*) as matches 
        FROM tags 
        WHERE tag in ('pear', 'in-ear', 'earphones') 
        GROUP BY blog_id
    ) t
    ON t.blog_id = b.blog_id
    ORDER BY total_matches desc
    ORDER BY 
    

    Just a note this query will probably be pretty slow with all of these matching and things. I still recommend using an indexing software like Solr

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 对于这个复杂问题的解释说明
  • ¥50 三种调度算法报错 有实例
  • ¥15 关于#python#的问题,请各位专家解答!
  • ¥200 询问:python实现大地主题正反算的程序设计,有偿
  • ¥15 smptlib使用465端口发送邮件失败
  • ¥200 总是报错,能帮助用python实现程序实现高斯正反算吗?有偿
  • ¥15 对于squad数据集的基于bert模型的微调
  • ¥15 为什么我运行这个网络会出现以下报错?CRNN神经网络
  • ¥20 steam下载游戏占用内存
  • ¥15 CST保存项目时失败