Matching a user-entered title to a category - large InnoDB database

I have a large InnoDB database with over 2 million products in it. The 'products' table has the following fields: id, title, description, category.

There is also a MyISAM table called 'category' that contains a list of all categories used on the website. It has the following fields: id, name, keywords, parentid.

My question is more about the logic rather than code, but what I am trying to achieve is as follows:

When a user lists a new product on the site, the site should try to work out, as they are typing the description, which category to put the product in (with good accuracy).

I tried this initially by using MySQL MATCH() to match the entered title against a list of keywords in the category table, but this was far from accurate.

A better idea seems to be to match the user-entered title against the titles of products already in the database, group the matches by category, and then sort the categories by group size, largest first. However, on an InnoDB database I obviously can't use fulltext, and with 2 million items I think it would be pretty slow anyway?

How would you do it? I guess it would need to work in a similar way to how Stack Overflow displays similar questions?

1 answer



A fulltext index on 2 million records is a valid option if you are running on a decent server. The initial indexing will take a while, that's for sure, but searches should be reasonably fast. MySQL can take it.

InnoDB supports fulltext indexes as of v5.6.4. You should consider upgrading.
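
As a rough sketch of how the idea from the question (match against existing titles, group by category, rank by group size) could sit on top of such an index: the table and column names below come from the question, while the index name `ft_title`, the function `suggest_categories`, and the `sample_size` and `max_suggestions` parameters are made up for illustration. It also assumes a Python DB-API driver that uses `%s` placeholders (PyMySQL, for example), so treat it as a starting point rather than a finished implementation.

```python
# Sketch only. "products", "title" and "category" come from the question;
# "ft_title", the function name and the default limits are assumptions.

# One-time setup (InnoDB needs MySQL 5.6.4 or later for FULLTEXT).
CREATE_INDEX_SQL = "ALTER TABLE products ADD FULLTEXT INDEX ft_title (title)"

# Inner query: the most relevant existing titles for the entered text.
# Outer query: count those matches per category, largest group first.
SUGGEST_SQL = """
SELECT category, COUNT(*) AS votes
FROM (
    SELECT category
    FROM products
    WHERE MATCH(title) AGAINST (%s IN NATURAL LANGUAGE MODE)
    ORDER BY MATCH(title) AGAINST (%s IN NATURAL LANGUAGE MODE) DESC
    LIMIT %s
) AS top_matches
GROUP BY category
ORDER BY votes DESC
LIMIT %s
"""


def suggest_categories(cursor, title, sample_size=50, max_suggestions=3):
    """Return (category, votes) pairs for the title the user has typed so far.

    `cursor` is any DB-API cursor whose driver uses %s placeholders.
    """
    # The title is passed twice: once for the WHERE filter, once for ORDER BY.
    cursor.execute(SUGGEST_SQL, (title, title, sample_size, max_suggestions))
    return [(row[0], row[1]) for row in cursor.fetchall()]
```

Capping the inner query at `sample_size` rows keeps the grouping cheap and stops weak matches from outvoting the strongest ones; the right cap would need tuning against real data.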

If upgrading is not an option, please see this previous answer of mine where I suggest a workaround.

For your use case, you may want to take a look at the WITH QUERY EXPANSION option:

It works by performing the search twice, where the search phrase for the second search is the original search phrase concatenated with the few most highly relevant documents from the first search. Thus, if one of these documents contains the word “databases” and the word “MySQL”, the second search finds the documents that contain the word “MySQL” even if they do not contain the word “database”
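
To make the two-pass idea concrete, here is a toy illustration in plain Python. It is not how MySQL implements the feature: the word-overlap scoring, the function names, and the `top_k` parameter are all assumptions chosen to keep the example small.

```python
def search(docs, terms):
    """Return indexes of docs sharing at least one term, best overlap first."""
    scored = []
    for i, doc in enumerate(docs):
        overlap = len(terms & set(doc.lower().split()))
        if overlap:
            scored.append((overlap, i))
    # Highest overlap first; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored]


def search_with_expansion(docs, query, top_k=1):
    """Search twice: the second pass adds the words of the top_k first-pass hits."""
    terms = set(query.lower().split())
    first_pass = search(docs, terms)
    expanded = set(terms)
    for i in first_pass[:top_k]:
        expanded |= set(docs[i].lower().split())
    return search(docs, expanded)


docs = ["databases MySQL", "MySQL tuning guide", "garden furniture"]
print(search(docs, {"databases"}))               # [0]
print(search_with_expansion(docs, "databases"))  # [0, 1]
```

The second document never mentions "databases", but it is found on the second pass because "MySQL" was picked up from the best first-pass hit. In MySQL itself the behaviour is requested by writing `WITH QUERY EXPANSION` inside the `AGAINST (...)` clause. Expansion tends to pull in more loosely related rows, so it is worth checking whether it helps or hurts category accuracy on short titles.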

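duanchun1881: Great post, thanks for the ideas. I would very much like to upgrade our server to 5.6, but as the server runs on Debian I can't currently see any decent upgrade path. The .deb file on the MySQL website installs a badly botched version, with all the wrong file locations and layout for a normal Debian MySQL install.
(over 7 years ago)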