ds753947 2014-10-29 18:41

Raw MD5 base64-encoded strings differ between PHP and Clojure (Java) code for certain characters

I have a server that creates a hash using the following code:

base64_encode(md5("some value", true))

I need to produce the same hash value in Clojure (using Java interop), so I wrote the following Clojure functions:

(defn md5-raw [s]
  (let [algorithm (java.security.MessageDigest/getInstance "MD5")]
    (.digest algorithm (.getBytes s))))

(defn bytes-to-base64-string [bytes]
  (String. (b64/encode bytes) "UTF-8"))

Then I use the code like this:

(bytes-to-base64-string (md5-raw "some value"))

Now, everything works fine with normal strings. However, after processing many different examples, I found that the following character causes issues:

’

This is the Unicode character U+2019 (decimal 8217, the right single quotation mark).

If I run the following PHP code:

base64_encode(md5("’", true))

What is returned is:

yOy9/y97p/GfapveLVQAHA==

If I run the following Clojure code:

(bytes-to-base64-string (md5-raw "’"))

I get the following value:

aF1ZConzUtEGRN2YXaKpoQ==

Why is that? I suspect a character encoding issue, but as far as I can see everything is handled as UTF-8.


  • doqpm82240 2014-10-29 19:01

    Not everything in your example is guaranteed to be UTF-8. The following expression depends on the JVM's default charset:

    (.getBytes s)
    

    Depending on your use case, you should probably pass the charset explicitly:

    (.getBytes s "UTF-8")
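
To see why the charset matters, it can help to look at the bytes themselves before hashing. The sketch below is only an illustration: the helper name bytes-in is made up for this example, and it assumes the windows-1250 charset is installed on your JVM (it is on typical JDK builds). The string is written as "\u2019" so that the encoding of the source file cannot interfere.

```clojure
;; Hypothetical helper (not part of the original code): returns the bytes
;; that a string encodes to in the named charset, as a vector of signed bytes.
(defn bytes-in
  [^String s ^String charset]
  (vec (.getBytes s charset)))

(bytes-in "\u2019" "UTF-8")        ;; => [-30 -128 -103]  (0xE2 0x80 0x99)
(bytes-in "\u2019" "windows-1250") ;; => [-110]           (0x92)
(bytes-in "\u2019" "US-ASCII")     ;; => [63]             (unmappable, replaced by "?")

;; The charset that a bare (.getBytes s) uses on this JVM:
(str (java.nio.charset.Charset/defaultCharset))
```

MD5 only ever sees these bytes, so different byte sequences necessarily give different digests.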
    

    Demonstration:

    (defn md5-with-charset
      [s charset]
      (let [algorithm (java.security.MessageDigest/getInstance "MD5")]
        (.digest algorithm (.getBytes s charset))))
    
    (b64 (md5-with-charset "’" "UTF-8"))  ;; => "yOy9/y97p/GfapveLVQAHA=="
    (b64 (md5-with-charset "’" "ASCII"))  ;; => "0UV7csP7MjomcRJa7z6rXQ=="
    (b64 (md5-with-charset "’" "UTF-16")) ;; => "3CLVThylT2KkrocdUpxIpg=="
    (b64 (md5-with-charset "’" "UTF-32")) ;; => "iHBMMMzkWTbPU+n8GCHitQ=="
    

    (where b64 is a base64 encoding step)
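
For completeness, here is one way the b64 step could be written. This is a sketch that assumes Java 8 or later and uses java.util.Base64 rather than the b64/encode function from the question, so treat it as one possible choice. md5-with-charset is repeated so the snippet runs on its own.

```clojure
(import '(java.security MessageDigest)
        '(java.util Base64))

;; Same function as in the demonstration above, with type hints added.
(defn md5-with-charset
  [^String s ^String charset]
  (.digest (MessageDigest/getInstance "MD5") (.getBytes s charset)))

;; Assumed definition of the b64 step: standard base64 with padding,
;; which is the same alphabet PHP's base64_encode uses.
(defn b64
  [^bytes bs]
  (.encodeToString (Base64/getEncoder) bs))

;; Should reproduce the PHP value quoted in the question:
(b64 (md5-with-charset "\u2019" "UTF-8")) ;; => "yOy9/y97p/GfapveLVQAHA=="
```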


    And I found the charset that reproduces your result:

    (b64 (md5-with-charset "’" "windows-1250")) ;; => "aF1ZConzUtEGRN2YXaKpoQ=="
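
If you need to track down a mismatch like this yourself, one possible approach is to try every charset installed on the JVM and keep those that reproduce the unexpected hash. The sketch below assumes Java 8 or later; the names md5-b64 and charsets-producing are invented for this example.

```clojure
(import '(java.security MessageDigest)
        '(java.nio.charset Charset)
        '(java.util Base64))

;; Hypothetical helper: base64 of the raw MD5 of s encoded with charset cs.
(defn md5-b64
  [^String s ^Charset cs]
  (.encodeToString (Base64/getEncoder)
                   (.digest (MessageDigest/getInstance "MD5")
                            (.getBytes s cs))))

;; Hypothetical helper: names of all installed charsets under which
;; s hashes to target. Decode-only charsets are skipped via .canEncode.
(defn charsets-producing
  [s target]
  (for [[cs-name ^Charset cs] (Charset/availableCharsets)
        :when (and (.canEncode cs)
                   (= target (md5-b64 s cs)))]
    cs-name))

;; The hash from the question; the result includes "windows-1250".
(charsets-producing "\u2019" "aF1ZConzUtEGRN2YXaKpoQ==")
```

Expect more than one match: several Windows code pages (windows-1252, for example) encode this character as the same single byte 0x92, so the search narrows the default charset down to a family rather than one name.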
    