drlma06060 2012-11-10 22:24
浏览 27
已采纳

在Scala中重写PHP的similar_text

In an attempt to rewrite PHP's similar_text algorithm I have tried a few different approaches. All have been moderately successful but ultimately failed.

First attempt: I tried just rewriting it from the PHP source code. C's elegant use of pointers makes the same exact implementation seemingly impossible to make in Scala and be clean.

Second attempt: I tried rewriting it from a Java function someone posted on PHP similar_text() in java. Unfortunately that function doesn't work in Java so nevermind porting it over to Scala.

Third (current) attempt: I'm currently attempting to translate this JavaScript implementation into Scala: http://phpjs.org/functions/similar_text/. I've used it before in JavaScript and it seems to function properly. My translation (below) into Scala is not functioning properly. It gets you within 1 or 2 similarity indexes but it is typically not 100% to the results of it's PHP counterpart.

def similartext(first:String,second:String) : Int = {
  if (first == null || second == null) {
    0
  }

  var pos1:Int = 0
  var pos2:Int = 0
  var max:Int = 0
  var sum:Int = 0
  var l:Int = 0

  val firstLength:Int = first.length
  val secondLength:Int = second.length

  for (p <- 0 until firstLength) {
    for (q <- 0 until secondLength) {
      while(p+l < firstLength && q+l < secondLength && (first.charAt(p+l) == second.charAt(q+l))) {
        if (l > max) {
            println("[" + p + "," + q + "," + l + "]" + first.charAt(p+l) + " | " + second.charAt(q+l))
            max = l
            pos1 = p
            pos2 = q
          }
        l += 1
      }
    }
  }

  sum = max;

  if (sum > 0) {
    if (pos1 > 0 && pos2 > 0) {
      sum += similartext(first.substring(0, pos2), second.substring(0, pos2))
    }

    if ((pos1 + max < firstLength) && (pos2 + max < secondLength)) {
      sum += similartext(first.substring(pos1 + max, (pos1 + max) + (firstLength - pos1 - max)), second.substring(pos2 + max, (pos2 + max) + (secondLength - pos2 - max)))
    }
  }

  sum;
}

Tests:

(Scala)val st = similartext("apple","aple") Yields 3
(PHP)$similar = similar_text("apple","aple"); Yields 4

(Scala)val st = similartext("starbucks","stharducks") Yields 8
(PHP)$similar = similar_text("starbucks","stharducks"); Yields 8

(Scala)val st = similartext("hello earth!","hello world!") Yields 10
(PHP)$similar = similar_text("hello earth!","hello world!"); Yields 8

Does anyone have any ideas on what is going wrong here?

  • 写回答

1条回答 默认 最新

  • 普通网友 2012-11-11 00:44
    关注

    Here's a hint: look very closely at line 28 of the JavaScript version—particularly the last character of the line. That's where your implementation differs. (You also don't reset l to zero for every pair of indices, but that's not the most important problem.)

    Here's a var-free Scala version, by the way:

    def similarText(x: String, y: String): Int = {
      val indices = for {
        (s, p) <- x.tails.zipWithIndex
        (t, q) <- y.tails.zipWithIndex
        l = ((s zip t) takeWhile Function.tupled(_ == _)).size
      } yield (p, q, l)
      val (pos1, pos2, max) = indices.maxBy(_._3)
    
      if (max == 0) max else max +
        similarText(x take pos1, y take pos2) +
        similarText(x drop (pos1 + max), y drop (pos2 + max))
    }
    

    This is fairly off-the-cuff—I'm sure you could make it more concise and efficient pretty easily.

    And for extra credit: there's a bug in the JavaScript version—try for example "aabcd" and "abcabcd" and the result won't be the same as PHP's (or mine).

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥50 导入文件到网吧的电脑并且在重启之后不会被恢复
  • ¥15 (希望可以解决问题)ma和mb文件无法正常打开,打开后是空白,但是有正常内存占用,但可以在打开Maya应用程序后打开场景ma和mb格式。
  • ¥20 ML307A在使用AT命令连接EMQX平台的MQTT时被拒绝
  • ¥20 腾讯企业邮箱邮件可以恢复么
  • ¥15 有人知道怎么将自己的迁移策略布到edgecloudsim上使用吗?
  • ¥15 错误 LNK2001 无法解析的外部符号
  • ¥50 安装pyaudiokits失败
  • ¥15 计组这些题应该咋做呀
  • ¥60 更换迈创SOL6M4AE卡的时候,驱动要重新装才能使用,怎么解决?
  • ¥15 让node服务器有自动加载文件的功能