dongye9228 2009-12-14 07:34
浏览 24
已采纳

什么是最容易实现的网站搜索应用程序,它支持模糊搜索?

I have a site that needs to search thru about 20-30k records, which are mostly movie and TV show names. The site runs php/mysql with memcache.

Im looking to replace the FULLTEXT with soundex() searching that I currently have, which works... kind of, but isn't very good in many situations.

Are there any decent search scripts out there that are simple to implement, and will provide a decent searching capability (of 3 columns in a table).

  • 写回答

4条回答 默认 最新

  • dongyou9721 2009-12-16 08:29
    关注

    ewemli's answer is in the right direction but you should be combining FULLTEXT and soundex mapping, not replacing the fulltext, otherwise your LIKE queries are likely be very slow.

    create table with_soundex (
      id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
      original TEXT,
      soundex TEXT,
      FULLTEXT (soundex)
    );
    
    insert into with_soundex (original, soundex) values 
    
    ('add some test cases', CONCAT_WS(' ', soundex('add'), soundex('some'), soundex('test'), soundex('cases'))),
    ('this is some text', CONCAT_WS(' ', soundex('this'), soundex('is'), soundex('some'), soundex('text'))),
    ('one more test case', CONCAT_WS(' ', soundex('one'), soundex('more'), soundex('test'), soundex('case'))),
    ('just filling the index', CONCAT_WS(' ', soundex('just'), soundex('filling'), soundex('the'), soundex('index'))),
    ('need one more example', CONCAT_WS(' ', soundex('need'), soundex('one'), soundex('more'), soundex('example'))),
    ('seems to need more', CONCAT_WS(' ', soundex('seems'), soundex('to'), soundex('need'), soundex('more')))
    ('some helpful cases to consider', CONCAT_WS(' ', soundex('some'), soundex('helpful'), soundex('cases'), soundex('to'), soundex('consider')))
    
    select * from with_soundex where match(soundex) against (soundex('test'));
    +----+---------------------+---------------------+
    | id | original            | soundex             |
    +----+---------------------+---------------------+
    |  1 | add some test cases | A300 S500 T230 C000 | 
    |  2 | this is some text   | T200 I200 S500 T230 | 
    |  3 | one more test case  | O500 M600 T230 C000 | 
    +----+---------------------+---------------------+
    
    select * from with_soundex where match(soundex) against (CONCAT_WS(' ', soundex('test'), soundex('some')));
    +----+--------------------------------+---------------------------+
    | id | original                       | soundex                   |
    +----+--------------------------------+---------------------------+
    |  1 | add some test cases            | A300 S500 T230 C000       | 
    |  2 | this is some text              | T200 I200 S500 T230       | 
    |  3 | one more test case             | O500 M600 T230 C000       | 
    |  7 | some helpful cases to consider | S500 H414 C000 T000 C5236 | 
    +----+--------------------------------+---------------------------+
    

    That gives pretty good results (within the limits of the soundex algo) while taking maximum advantage of an index (any query LIKE '%foo' has to scan every row in the table).

    Note the importance of running soundex on each word, not on the entire phrase. You could also run your own version of soundex on each word rather than having SQL do it but in that case make sure you do it both when storing and retrieving in case there are differences between the algorithms (for instance, MySQL's algo doesn't limit itself to the standard 4 chars)

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(3条)

报告相同问题?

悬赏问题

  • ¥50 用易语言http 访问不了网页
  • ¥50 safari浏览器fetch提交数据后数据丢失问题
  • ¥15 matlab不知道怎么改,求解答!!
  • ¥15 永磁直线电机的电流环pi调不出来
  • ¥15 用stata实现聚类的代码
  • ¥15 请问paddlehub能支持移动端开发吗?在Android studio上该如何部署?
  • ¥20 docker里部署springboot项目,访问不到扬声器
  • ¥15 netty整合springboot之后自动重连失效
  • ¥15 悬赏!微信开发者工具报错,求帮改
  • ¥20 wireshark抓不到vlan