douyun6399
2014-07-19 02:00
Views: 571
Accepted

Case-insensitive string search in Golang

How do I search through a file for a word in a case insensitive manner?

For example, if I search for UpdaTe and the file contains update, the search should pick it up and count it as a match.




4 answers

  • donglian4464 2014-10-29 05:55
    Accepted

    strings.EqualFold() checks whether two strings are equal while ignoring case. It uses Unicode case folding, so it also works with non-ASCII text. See http://golang.org/pkg/strings/#EqualFold for more info.

    http://play.golang.org/p/KDdIi8c3Ar

    package main
    
    import (
        "fmt"
        "strings"
    )
    
    func main() {
        fmt.Println(strings.EqualFold("HELLO", "hello"))
        fmt.Println(strings.EqualFold("ÑOÑO", "ñoño"))
    }
    

    Both return true.
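
Since EqualFold compares whole strings rather than searching inside one, counting a word in some text needs a tokenizing step first. The following is a minimal sketch under stated assumptions: countWord is a hypothetical helper name (not from the answer), and words are taken to be whitespace-separated, so a token with attached punctuation such as "update," would not match.

```go
package main

import (
	"fmt"
	"strings"
)

// countWord is a hypothetical helper: it reports how many
// whitespace-separated words in text equal word under Unicode case
// folding. Punctuation attached to a word is not stripped.
func countWord(text, word string) int {
	n := 0
	for _, w := range strings.Fields(text) {
		if strings.EqualFold(w, word) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countWord("Please update the UPDATE log", "UpdaTe")) // prints 2
}
```

To apply this to a file, the same comparison could be run on tokens from a bufio.Scanner configured with bufio.ScanWords instead of on strings.Fields.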

  • dronthpi05943 2014-07-19 03:02

    Presumably the important part of your question is the search, not the part about reading from a file, so I'll just answer that part.

    Probably the simplest way to do this is to convert both strings (the one you're searching through and the one that you're searching for) to all upper case or all lower case, and then search. For example:

    func CaseInsensitiveContains(s, substr string) bool {
        s, substr = strings.ToUpper(s), strings.ToUpper(substr)
        return strings.Contains(s, substr)
    }
    

    You can see it in action here.
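
The question also asks for matches to be counted. One way the same normalize-then-search idea might be extended is with strings.Count, as sketched below. countFold is a hypothetical name, and the sketch assumes that substring matches (for example "update" inside "UPDATED") should count, and that an empty search term should count as zero.

```go
package main

import (
	"fmt"
	"strings"
)

// countFold is a hypothetical helper: it counts non-overlapping,
// case-insensitive occurrences of substr in s by lowercasing both.
// An empty substr is treated as zero matches (strings.Count would
// otherwise return the rune count plus one).
func countFold(s, substr string) int {
	if substr == "" {
		return 0
	}
	return strings.Count(strings.ToLower(s), strings.ToLower(substr))
}

func main() {
	fmt.Println(countFold("Update, update, UPDATED", "UpdaTe")) // prints 3
}
```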

  • duanlaiyin2356 2014-07-19 03:22

    If your file is large, you can use regexp and bufio:

    // The (?i) flag makes the pattern case-insensitive, so it matches "update", "UPDATE", "UpdaTe", and so on.
    reg := regexp.MustCompile("(?i)update")
    f, err := os.Open("test.txt")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    
    // Run the match.
    // MatchReader consumes the reader rune by rune while searching, rather than needing the whole file as a string.
    // Using bufio here avoids loading the entire file into memory.
    println(reg.MatchReader(bufio.NewReader(f)))
    

    About bufio

    The bufio package implements a buffered reader that may be useful both for its efficiency with many small reads and because of the additional reading methods it provides.

  • dousi6405 2017-02-23 15:45

    Do not use strings.Contains unless you need exact matching rather than language-correct string searches.

    None of the current answers are correct unless you are only searching ASCII characters, which covers a minority of languages (such as English) that do not use diaereses, umlauts, or other Unicode glyph modifiers (the more "correct" way to define it, as mentioned by @snap). The usual search phrase for this topic is "searching non-ASCII characters".

    For proper language-aware searching, use http://golang.org/x/text/search.

    // SearchForString returns the start and end byte offsets of the first
    // match of substr in str, or -1, -1 if there is no match.
    func SearchForString(str string, substr string) (int, int) {
        m := search.New(language.English, search.IgnoreCase)
        return m.IndexString(str, substr)
    }
    
    start, end := SearchForString("foobar", "bar")
    if start != -1 && end != -1 {
        fmt.Println("found at", start, end)
    }
    

    Or if you just want the starting index:

    func SearchForStringIndex(str string, substr string) (int, bool) {
        m := search.New(language.English, search.IgnoreCase)
        start, _ := m.IndexString(str, substr)
        if start == -1 {
            return 0, false
        }
        return start, true
    }
    
    index, found := SearchForStringIndex("foobar", "bar")
    if found {
        fmt.Println("match starts at", index)
    }
    

    Look through the language.Tag values to find the language you want to search with, or use language.Und if you are not sure.

    Update

    There seems to be some confusion, so the following example should help clarify things.

    package main
    
    import (
        "fmt"
        "strings"
    
        "golang.org/x/text/language"
        "golang.org/x/text/search"
    )
    
    var s = `Æ`
    var s2 = `Ä`
    
    func main() {
        m := search.New(language.Finnish, search.IgnoreDiacritics)
        fmt.Println(m.IndexString(s, s2))
        fmt.Println(CaseInsensitiveContains(s, s2))
    }
    
    // CaseInsensitiveContains in string
    func CaseInsensitiveContains(s, substr string) bool {
        s, substr = strings.ToUpper(s), strings.ToUpper(substr)
        return strings.Contains(s, substr)
    }
    
