donglv9116 2012-02-10 06:43
Answer accepted

Partial matching

Is there a built in function or a function that someone has already written that can match names without being exact?

For example, I have:

Marry
John
Steve
Steven
Stewie

If someone types "stew" the function would return Stewie.
Or if someone types "ry" the function would return Marry.
Or if someone misspells it as "Marries", the function would still return Marry (since it is the most similar of them all).
If "Ste" is supplied it can return false but it doesn't really matter to me.

Does anyone know how to write this sort of function or know of one already written? Seeing as this is probably a common thing, I would assume so.

Thanks.


2 answers

  • dsuw85815 2012-02-10 06:49

    Actually, there are several ways to achieve this:

    Built-in methods

    • levenshtein()
    • similar_text()
    • soundex() and metaphone() (sound-based)

    Methods that are not built in

    • Longest common subsequence (LCS)
    • Letter n-grams (sometimes used for spellchecking)
    • Levenshtein automaton
    • Word lists (just for completeness)

    One of those should help you to solve your problem.
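As an illustration of the letter n-gram idea listed above, here is a minimal sketch that scores two strings by the bigrams they share (the Dice coefficient). The function names `letter_bigrams` and `bigram_similarity` are hypothetical and not part of PHP, and the code assumes single-byte (ASCII) names and PHP 7.3 or later.

```php
<?php
// Hypothetical helper: split a string into overlapping lowercase letter bigrams.
function letter_bigrams(string $s): array
{
    $s = strtolower($s);
    $grams = [];
    for ($i = 0; $i < strlen($s) - 1; $i++) {
        $grams[] = substr($s, $i, 2);
    }
    return $grams;
}

// Dice coefficient over bigrams: 2 * shared / (count A + count B), in the range 0..1.
function bigram_similarity(string $a, string $b): float
{
    $ga = letter_bigrams($a);
    $gb = letter_bigrams($b);
    $total = count($ga) + count($gb);
    if (count($ga) === 0 || count($gb) === 0) {
        return 0.0;
    }
    $shared = 0;
    foreach ($ga as $gram) {
        $pos = array_search($gram, $gb, true);
        if ($pos !== false) {
            $shared++;
            unset($gb[$pos]); // consume the match so repeated bigrams are not double-counted
        }
    }
    return 2 * $shared / $total;
}

// Usage with the names from the question.
$names = ['Marry', 'John', 'Steve', 'Steven', 'Stewie'];
$scores = [];
foreach ($names as $name) {
    $scores[$name] = bigram_similarity('stew', $name);
}
arsort($scores); // highest similarity first
echo array_key_first($scores), "\n"; // prints "Stewie" (score 0.75)
```

A score of 1.0 means the two strings have identical bigram sets, and 0.0 means they share none. Where to put the cut-off for "close enough" is a tuning decision that this sketch leaves open.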

    The drawback of all of these algorithms is that none of them is exact, so whatever you build will be a heuristic solution to the problem.

    There are usually trade-offs between distance-based and sound-based algorithms. Sound-based algorithms are less accurate (roughly 33% accuracy) but fast. Levenshtein is much more accurate but slow, at least in the PHP implementation. Other systems compute Levenshtein faster by a large margin (see e.g. Levenshtein automata, though that automaton algorithm is not built into PHP).
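To make the trade-off concrete, the following sketch runs PHP's built-in `soundex()` and `levenshtein()` over the names from the question. The inputs "Steeve" and "Marries" and all variable names are chosen purely for illustration. It does not measure speed. It only shows why a sound key can be looked up directly while an edit distance has to be computed against every candidate.

```php
<?php
$names = ['Marry', 'John', 'Steve', 'Steven', 'Stewie'];

// Sound approach: precompute one key per name, so a lookup is a single array access.
$bySound = [];
foreach ($names as $name) {
    $bySound[soundex($name)][] = $name;
}
$soundHit  = $bySound[soundex('Steeve')] ?? [];  // S310, the same key as "Steve"
$soundMiss = $bySound[soundex('Marries')] ?? []; // M620, but "Marry" is M600: no match

// Distance approach: the input must be compared against every candidate.
$distances = [];
foreach ($names as $name) {
    $distances[$name] = levenshtein('marries', strtolower($name));
}
asort($distances); // smallest edit distance first

print_r($soundHit);
print_r($soundMiss);
print_r($distances); // "Marry" comes first with a distance of 3
```

In this run the sound key finds "Steve" for "Steeve" but misses "Marry" for "Marries", whereas the edit distance still ranks "Marry" first.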

    As a basic rule of thumb:

    • If you have many unique terms to compare against, don't use similar_text() or levenshtein(); stick with the sound-based algorithms.
    • If you have a fairly small set, use Levenshtein.
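For a small set like the five names in the question, one possible shape for such a function is sketched below. It first tries a case-insensitive substring match and otherwise falls back to the smallest Levenshtein distance. The function name `find_name` and the `$maxDistance` threshold of 3 are assumptions made for this example, and the arrow function requires PHP 7.4 or later.

```php
<?php
// Hypothetical helper: returns the best-matching name, or null if nothing is close enough.
function find_name(string $input, array $names, int $maxDistance = 3): ?string
{
    if ($input === '') {
        return null;
    }

    // Step 1: substring match, accepted only when exactly one name contains the input.
    $hits = array_values(array_filter(
        $names,
        fn(string $n): bool => stripos($n, $input) !== false
    ));
    if (count($hits) === 1) {
        return $hits[0];
    }

    // Step 2: fall back to the smallest edit distance (the first name wins on ties).
    $best = null;
    $bestDistance = $maxDistance + 1;
    foreach ($names as $name) {
        $d = levenshtein(strtolower($input), strtolower($name));
        if ($d < $bestDistance) {
            $best = $name;
            $bestDistance = $d;
        }
    }
    return $best; // stays null when every name is farther than $maxDistance
}

$names = ['Marry', 'John', 'Steve', 'Steven', 'Stewie'];
echo find_name('stew', $names), "\n";    // Stewie (unique substring match)
echo find_name('ry', $names), "\n";      // Marry (unique substring match)
echo find_name('Marries', $names), "\n"; // Marry (edit distance 3)
```

With these settings the ambiguous input "Ste" falls through to the distance step and returns "Steve". Returning null for ambiguous substring matches instead would be an equally reasonable choice.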
    Accepted by the asker as the best answer.