dongtaogu8510
2018-10-30 16:38

Check that a string contains only ASCII characters

Does Go have a built-in method, or is there a recommended way, to check whether a string contains only ASCII characters? What is the right way to do it?

From my research, one solution is to check whether any character is greater than 127.

func isASCII(s string) bool {
    for _, c := range s {
        if c > unicode.MaxASCII {
            return false
        }
    }

    return true
}

2 answers

  • dongye1912 2018-10-30 17:29
    Accepted

    In Go, we care about performance, so let's benchmark your code:

    func isASCII(s string) bool {
        for _, c := range s {
            if c > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    BenchmarkRange-4    20000000    82.0 ns/op
    

    A faster (and more idiomatic) version indexes the string byte by byte, which avoids decoding each rune unnecessarily:

    func isASCII(s string) bool {
        for i := 0; i < len(s); i++ {
            if s[i] > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    BenchmarkIndex-4    30000000    55.4 ns/op
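
    The byte-indexed loop gives the same answer as the rune loop because, in UTF-8, every byte of a multi-byte sequence has its high bit set, so any non-ASCII character shows up as at least one byte greater than 127. A minimal sketch to check this on a few inputs (the sample strings are illustrative and not from the original benchmark):

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCII reports whether every byte of s is in the range 0-127.
// It indexes bytes directly instead of decoding runes.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative inputs: empty, plain ASCII, accented Latin,
	// CJK text, and a string containing an invalid UTF-8 byte.
	samples := []string{"", "hello, world", "héllo", "日本語", "bad\xffbyte"}
	for _, s := range samples {
		fmt.Printf("%q bytes=%d ascii=%t\n", s, len(s), isASCII(s))
	}
}
```

    Invalid UTF-8 bytes such as `\xff` are also rejected, which matches the rune loop: `range` decodes them as U+FFFD, which is greater than `unicode.MaxASCII`.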
    

    ascii_test.go:

    package main
    
    import (
        "testing"
        "unicode"
    )
    
    func isASCIIRange(s string) bool {
        for _, c := range s {
            if c > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    func BenchmarkRange(b *testing.B) {
        str := ascii()
        b.ResetTimer()
        for N := 0; N < b.N; N++ {
            is := isASCIIRange(str)
            if !is {
                b.Fatal("notASCII")
            }
        }
    }
    
    func isASCIIIndex(s string) bool {
        for i := 0; i < len(s); i++ {
            if s[i] > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    func BenchmarkIndex(b *testing.B) {
        str := ascii()
        b.ResetTimer()
        for N := 0; N < b.N; N++ {
            is := isASCIIIndex(str)
            if !is {
                b.Fatal("notASCII")
            }
        }
    }
    
    func ascii() string {
        byt := make([]byte, unicode.MaxASCII+1)
        for i := range byt {
            byt[i] = byte(i)
        }
        return string(byt)
    }
    

    Output:

    $ go test ascii_test.go -bench=.
    BenchmarkRange-4    20000000    82.0 ns/op
    BenchmarkIndex-4    30000000    55.4 ns/op
    $
    
  • dtw52353 2018-10-30 17:21

    It looks like your way is best.

    ASCII is simply defined as:

    ASCII encodes 128 specified characters into seven-bit integers

    As such, characters have values 0-127 (0x00-0x7F).

    Go provides no way to check that every rune in a string (or byte in a slice) has numerical values in a specific range, so your code seems to be the best way to do it.
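
    Since ASCII is a seven-bit code, the check can also be phrased as "no byte has its eighth bit (0x80) set". A minimal sketch of that formulation, assuming nothing beyond the standard library (the names `hasHighBit` and `isASCIIBits` are hypothetical, and no claim is made here about how it performs relative to the loop in the question):

```go
package main

import "fmt"

// hasHighBit reports whether any byte in s has bit 7 (0x80) set,
// i.e. whether s contains a byte outside the 7-bit ASCII range.
func hasHighBit(s string) bool {
	var acc byte
	for i := 0; i < len(s); i++ {
		acc |= s[i] // accumulate all bits seen so far
	}
	return acc&0x80 != 0
}

// isASCIIBits reports whether s contains only 7-bit ASCII bytes.
func isASCIIBits(s string) bool {
	return !hasHighBit(s)
}

func main() {
	for _, s := range []string{"plain text", "na\u00efve", "\x7f", "\x80"} {
		fmt.Printf("%q ascii=%t\n", s, isASCIIBits(s))
	}
}
```

    Unlike the early-return loop, this version always scans the whole string, so it trades the short-circuit on the first non-ASCII byte for a branch-free loop body.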
