duanfu6160 2018-03-18 18:45
Accepted

Golang regular expression always returns false?

I take user input (a regular expression) and check whether each line of a file matches it. If a line matches, I return that line's ID, and that's about it. However, my match always appears to return false. Interestingly, if I enter the wildcard .*, the program takes significantly longer to execute than with a more specific regular expression, so something must be going on. Why does it always return false?

Sample code:

func main() {

    // User input from command line
    reader := bufio.NewReader(os.Stdin)
    fmt.Print("Enter regexp: ")
    userRegexp, _ := reader.ReadString('\n')

    // List all .html files in static dir
    files, err := filepath.Glob("static/*.html")
    if err != nil {
        log.Fatal(err)
    }

    // Empty array of int64's to be returned with matching results
    var lineIdArr []int64

    for _, file := range files {
        htmlFile, _ := os.Open(file)
        fscanner := bufio.NewScanner(htmlFile)

        // Loop over each line
        for fscanner.Scan() {

            line := fscanner.Text()

            match := matchLineByValue(userRegexp, line) // This is always false?

            // ID is always the first item. Separate by ":" and parse it as int64.
            lineIdStr := line[:strings.IndexByte(line, ':')]
            lineIdInt, err := strconv.ParseInt(lineIdStr, 10, 64)

            if err != nil {
                panic(err)
            }

            // If matched, append ID to lineIdArr
            if match {
                lineIdArr = append(lineIdArr, lineIdInt)
            }
        }
    }
    fmt.Println("Return array: ", lineIdArr)
    fmt.Println("Using regular expression: ", userRegexp)
}

func matchLineByValue(re string, s string) bool {
    return regexp.MustCompile(re).MatchString(s)
}

Is regexp.MustCompile(re).MatchString(s) not the right way to construct a regular expression from user input and match it against a whole line?

The string it matches is fairly long (it's basically a whole HTML file). Would that present an issue?

1 answer

  • dongningwen1146 2018-03-18 18:56

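    The call userRegexp, _ := reader.ReadString('\n') returns a string that includes the trailing newline, so the compiled pattern requires a newline that the lines produced by the scanner never contain. Trim the newline: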

    userRegexp, err := reader.ReadString('\n')
    if err != nil {
        // handle error
    }
    userRegexp = userRegexp[:len(userRegexp)-1]
    

    Here's the code with some other improvements (compile regexp once, use scanner Bytes):

    package main

    import (
        "bufio"
        "bytes"
        "fmt"
        "log"
        "os"
        "path/filepath"
        "regexp"
        "strconv"
    )

    func main() {
        // User input from command line
        reader := bufio.NewReader(os.Stdin)
        fmt.Print("Enter regexp: ")
        userRegexp, err := reader.ReadString('\n')
        if err != nil {
            log.Fatal(err)
        }
        // Drop the trailing newline that ReadString keeps
        userRegexp = userRegexp[:len(userRegexp)-1]
        // Compile once, outside the loops
        re, err := regexp.Compile(userRegexp)
        if err != nil {
            log.Fatal(err)
        }

        // List all .html files in static dir
        files, err := filepath.Glob("static/*.html")
        if err != nil {
            log.Fatal(err)
        }

        // Slice of int64 IDs for the matching lines
        var lineIdArr []int64

        for _, file := range files {
            htmlFile, err := os.Open(file)
            if err != nil {
                log.Fatal(err)
            }
            fscanner := bufio.NewScanner(htmlFile)
            // Loop over each line
            for fscanner.Scan() {
                line := fscanner.Bytes()
                if !re.Match(line) {
                    continue
                }
                // ID is always the first item, terminated by ':'
                i := bytes.IndexByte(line, ':')
                if i < 0 {
                    continue // no ID on this line, so nothing to record
                }
                lineIdInt, err := strconv.ParseInt(string(line[:i]), 10, 64)
                if err != nil {
                    log.Fatal(err)
                }
                lineIdArr = append(lineIdArr, lineIdInt)
            }
            // Scan also returns false on a read error or an overlong line
            if err := fscanner.Err(); err != nil {
                log.Fatal(err)
            }
            htmlFile.Close()
        }
        fmt.Println("Return array: ", lineIdArr)
        fmt.Println("Using regular expression: ", userRegexp)
    }
    
    This answer was accepted by the asker.