duangengruan2144 2016-06-10 09:15
Views: 20
Accepted

String comparison in the core language

Take the simple comparison loopValue == "Firstname". Is the following statement true?

If the first character of one operand does not match the first character of the other, the comparison aborts early.

In their rawer form, loopValue and "Firstname" are both essentially []byte, and the comparison would walk the two arrays, roughly like a callback loop that tests each pair of characters:

someInspectionFunc(loopValue, "Firstname", func(charA, charB) {
    return charA == charB
})

... so that it keeps going until a pair compares false, and then checks whether the number of iterations equals both lengths. Also, does it check the lengths first?

if len(loopValue) != len("Firstname") {
    return false
}

I can't really find an explanation in the Go source code on GitHub, as it's a bit above my level.

The reason I'm asking is that I'm doing big data processing and am benchmarking and profiling CPU, memory and allocations with pprof to squeeze some more performance out of the process. That made me wonder how Go (but also C in general) does this under the hood. Is it done entirely at the assembly level, or does the comparison already happen in native Go code (kind of like the snippets sketched above)?

Please let me know if I'm being too vague or if I missed something. Thank you

Update

When I added a first-character match on big JSON strings before doing the real comparison, I got about a 3.7% benchmark gain on 100k heavy entries:

<some irrelevant inspection code>.. v[0] == firstChar && v == lookFor {
    // Match found when it reaches here
}

The code above (especially on long strings) is faster than just using v == lookFor.
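
One way to check a claim like this on your own data is a small benchmark. The following is only a sketch under stated assumptions: the helper names plainMatch and firstByteMatch, the sink variable, and the synthetic input are made up for illustration and are not the asker's code. The len(v) > 0 guard is also an addition, since v[0] panics on an empty string. No result is claimed here; the numbers depend on the Go version, the CPU and the shape of the data.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the compiler from discarding the benchmarked calls.
var sink int

// plainMatch counts the entries equal to lookFor using == alone.
func plainMatch(values []string, lookFor string) int {
	n := 0
	for _, v := range values {
		if v == lookFor {
			n++
		}
	}
	return n
}

// firstByteMatch compares the first byte before doing the full comparison.
// The length guards are an assumption added here to avoid indexing an empty string.
func firstByteMatch(values []string, lookFor string) int {
	if len(lookFor) == 0 {
		return plainMatch(values, lookFor)
	}
	firstChar := lookFor[0]
	n := 0
	for _, v := range values {
		if len(v) > 0 && v[0] == firstChar && v == lookFor {
			n++
		}
	}
	return n
}

func main() {
	// Synthetic data (hypothetical): long strings of equal length,
	// half of them equal to lookFor, half differing in the first byte.
	lookFor := strings.Repeat("x", 1024) + "Firstname"
	values := make([]string, 0, 1000)
	for i := 0; i < 1000; i++ {
		if i%2 == 0 {
			// Copy the bytes so matching entries do not simply share lookFor's memory.
			values = append(values, string([]byte(lookFor)))
		} else {
			values = append(values, "y"+lookFor[1:])
		}
	}

	plain := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = plainMatch(values, lookFor)
		}
	})
	first := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = firstByteMatch(values, lookFor)
		}
	})

	fmt.Println("plain ==:        ", plain)
	fmt.Println("first-byte check:", first)
}
```

Both helpers return the same count for the same input, so any difference in the printed timings comes from the extra first-byte test alone.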


1 answer

  • drdu53813 2016-06-10 09:24

    The function is handled in assembly. The amd64 version is:

    TEXT runtime·eqstring(SB),NOSPLIT,$0-33
        MOVQ    s1str+0(FP), SI
        MOVQ    s2str+16(FP), DI
        CMPQ    SI, DI
        JEQ eq
        MOVQ    s1len+8(FP), BX
        LEAQ    v+32(FP), AX
        JMP runtime·memeqbody(SB)
    eq:
        MOVB    $1, v+32(FP)
        RET
    

    The routine first compares the two data pointers and returns true immediately if they are identical. It is the compiler's job to ensure that the strings are of equal length before this function is called. (The runtime·memeqbody function is where the optimized memory comparison actually happens, but there's probably no need to post its full text here.)

    The equivalent Go code would be:

    func eqstring_generic(s1, s2 string) bool {
        if len(s1) != len(s2) {
            return false
        }
        for i := 0; i < len(s1); i++ {
            if s1[i] != s2[i] {
                return false
            }
        }
        return true
    }
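
As a quick sanity check, the generic loop above can be compared against the built-in == on a handful of inputs. This is a minimal sketch, not part of the runtime: eqstringGeneric is just the function above renamed to Go naming style, and agreesWithBuiltin and the sample strings are hypothetical helpers added for illustration.

```go
package main

import "fmt"

// eqstringGeneric mirrors eqstring_generic from the answer:
// a length check first, then a byte-by-byte walk that stops at the first mismatch.
func eqstringGeneric(s1, s2 string) bool {
	if len(s1) != len(s2) {
		return false
	}
	for i := 0; i < len(s1); i++ {
		if s1[i] != s2[i] {
			return false
		}
	}
	return true
}

// agreesWithBuiltin reports whether the generic loop and == give
// the same result for every ordered pair of values.
func agreesWithBuiltin(values []string) bool {
	for _, a := range values {
		for _, b := range values {
			if eqstringGeneric(a, b) != (a == b) {
				return false
			}
		}
	}
	return true
}

func main() {
	samples := []string{"", "F", "Firstname", "Firstnamf", "Lastname", "Firstname2"}
	fmt.Println(agreesWithBuiltin(samples)) // prints true
}
```

This only shows that the two give the same answers; it says nothing about speed, which is where the assembly version differs.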
    
    This answer was selected as the best answer by the asker.