2019-04-23 10:29

Python difflib.SequenceMatcher(): comparing differences between string sequences

Is the underlying algorithm an edit-distance algorithm, a longest-common-substring algorithm, or something else?



  • caozhy, 2 years ago

    class difflib.SequenceMatcher

    This is a flexible class for comparing pairs of sequences of any type, so long as the sequence elements are hashable. The basic algorithm predates, and is a little fancier than, an algorithm published in the late 1980’s by Ratcliff and Obershelp under the hyperbolic name “gestalt pattern matching.” The idea is to find the longest contiguous matching subsequence that contains no “junk” elements (the Ratcliff and Obershelp algorithm doesn’t address junk). The same idea is then applied recursively to the pieces of the sequences to the left and to the right of the matching subsequence. This does not yield minimal edit sequences, but does tend to yield matches that “look right” to people.
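To make the quoted description concrete, here is a minimal sketch of the "longest contiguous block, then recurse left and right" idea, compared against difflib itself. The helper names `longest_common_block`, `gestalt_matches`, and `gestalt_ratio` are hypothetical and written only for illustration; they are not part of difflib, and the sketch ignores junk handling entirely (which is why the comparison passes `autojunk=False` and no `isjunk` function). It is a rough model of the approach, not difflib's actual implementation.

```python
from difflib import SequenceMatcher


def longest_common_block(a, b):
    """Return (i, j, size) such that a[i:i+size] == b[j:j+size] is a longest
    contiguous common block (hypothetical helper, no junk handling)."""
    best_i, best_j, best_size = 0, 0, 0
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                # Strict ">" keeps the earliest block among equally long ones.
                if cur[j] > best_size:
                    best_size = cur[j]
                    best_i, best_j = i - best_size, j - best_size
        prev = cur
    return best_i, best_j, best_size


def gestalt_matches(a, b):
    """Count matched elements: take the longest block, then recurse on the
    pieces to its left and to its right."""
    i, j, size = longest_common_block(a, b)
    if size == 0:
        return 0
    left = gestalt_matches(a[:i], b[:j])
    right = gestalt_matches(a[i + size:], b[j + size:])
    return left + size + right


def gestalt_ratio(a, b):
    """Similarity in [0, 1]: 2 * matches / total length."""
    total = len(a) + len(b)
    return 2.0 * gestalt_matches(a, b) / total if total else 1.0


a, b = "abcd", "bcde"
matcher = SequenceMatcher(None, a, b, autojunk=False)
print(matcher.find_longest_match(0, len(a), 0, len(b)))  # longest block "bcd"
print(matcher.ratio(), gestalt_ratio(a, b))
```

In this example the only common block is "bcd" (3 elements), so both ratios come out as 2 * 3 / 8 = 0.75. So the answer to the question is closer to "recursive longest common contiguous block" than to edit distance: as the quoted passage says, the result is not guaranteed to be a minimal edit sequence.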

