dongnai6973 2014-11-30 18:09

Best way to parse date and time in Golang

I have a lot of datetime values coming into my Go program as strings. The format is fixed, with the same number of digits in every field:

2006/01/02 15:04:05

I started by parsing these dates with the time.Parse function:

const dtFormat = "2006/01/02 15:04:05"

func ParseDate1(strdate string) (time.Time, error) {
    return time.Parse(dtFormat, strdate)
}

However, I had performance issues with my program, so I tried to tune it by writing my own parsing function, taking advantage of the fact that my format is fixed:

func ParseDate2(strdate string) (time.Time, error) {
    year, _ := strconv.Atoi(strdate[:4])
    month, _ := strconv.Atoi(strdate[5:7])
    day, _ := strconv.Atoi(strdate[8:10])
    hour, _ := strconv.Atoi(strdate[11:13])
    minute, _ := strconv.Atoi(strdate[14:16])
    second, _ := strconv.Atoi(strdate[17:19])

    return time.Date(year, time.Month(month), day, hour, minute, second, 0, time.UTC), nil
}

Finally, I benchmarked these two functions and got the following results:

 BenchmarkParseDate1      5000000               343 ns/op
 BenchmarkParseDate2     10000000               248 ns/op

That is about 28% less time per operation (343 ns down to 248 ns). Is there a better way, in terms of performance, to speed up this kind of datetime parsing?
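
The benchmark code itself is not shown in the question. As a rough sketch of how such numbers could be reproduced outside of `go test`, the following uses `testing.Benchmark` from the standard library. The `sample` input is a made-up value in the question's layout, and the timings will of course differ by machine and Go version.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
	"time"
)

const dtFormat = "2006/01/02 15:04:05"

// sample is a made-up input in the question's layout (an assumption).
const sample = "2014/11/30 18:09:00"

// ParseDate1 is the time.Parse version from the question.
func ParseDate1(strdate string) (time.Time, error) {
	return time.Parse(dtFormat, strdate)
}

// ParseDate2 is the fixed-offset strconv.Atoi version from the question.
// Conversion errors are ignored here, exactly as in the original.
func ParseDate2(strdate string) (time.Time, error) {
	year, _ := strconv.Atoi(strdate[:4])
	month, _ := strconv.Atoi(strdate[5:7])
	day, _ := strconv.Atoi(strdate[8:10])
	hour, _ := strconv.Atoi(strdate[11:13])
	minute, _ := strconv.Atoi(strdate[14:16])
	second, _ := strconv.Atoi(strdate[17:19])
	return time.Date(year, time.Month(month), day, hour, minute, second, 0, time.UTC), nil
}

func main() {
	cases := []struct {
		name string
		fn   func(string) (time.Time, error)
	}{
		{"ParseDate1", ParseDate1},
		{"ParseDate2", ParseDate2},
	}
	for _, c := range cases {
		fn := c.fn
		// testing.Benchmark picks b.N automatically and reports ns/op.
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				if _, err := fn(sample); err != nil {
					b.Fatal(err)
				}
			}
		})
		fmt.Printf("%-12s %s\n", c.name, res)
	}
}
```

In a normal project the same loops would usually live in `BenchmarkXxx(b *testing.B)` functions in a `_test.go` file and be run with `go test -bench=.`.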


2 answers

  • dongya3627 2014-11-30 19:30

    I would expect that you could make your entire program much faster. For example, ParseDate3 takes the date as a []byte and converts the digits inline:

    func ParseDate3(date []byte) (time.Time, error) {
        year := (((int(date[0])-'0')*10+int(date[1])-'0')*10+int(date[2])-'0')*10 + int(date[3]) - '0'
        month := time.Month((int(date[5])-'0')*10 + int(date[6]) - '0')
        day := (int(date[8])-'0')*10 + int(date[9]) - '0'
        hour := (int(date[11])-'0')*10 + int(date[12]) - '0'
        minute := (int(date[14])-'0')*10 + int(date[15]) - '0'
        second := (int(date[17])-'0')*10 + int(date[18]) - '0'
        return time.Date(year, month, day, hour, minute, second, 0, time.UTC), nil
    }
    

    Benchmarks:

    $ go test -bench=.
    testing: warning: no tests to run
    PASS
    BenchmarkParseDate1  5000000           308 ns/op
    BenchmarkParseDate2 10000000           225 ns/op
    BenchmarkParseDate3 30000000            44.9 ns/op
    ok      so/test 5.741s
    $ go test -bench=.
    testing: warning: no tests to run
    PASS
    BenchmarkParseDate1  5000000           308 ns/op
    BenchmarkParseDate2 10000000           226 ns/op
    BenchmarkParseDate3 30000000            45.4 ns/op
    ok      so/test 5.757s
    $ go test -bench=.
    testing: warning: no tests to run
    PASS
    BenchmarkParseDate1  5000000           312 ns/op
    BenchmarkParseDate2 10000000           225 ns/op
    BenchmarkParseDate3 30000000            45.0 ns/op
    ok      so/test 5.761s
    $ 
    

    Reference:

    Profiling Go Programs


    If you insist on using a date string, use ParseDate4:

    func ParseDate4(date string) (time.Time, error) {
        year := (((int(date[0])-'0')*10+int(date[1])-'0')*10+int(date[2])-'0')*10 + int(date[3]) - '0'
        month := time.Month((int(date[5])-'0')*10 + int(date[6]) - '0')
        day := (int(date[8])-'0')*10 + int(date[9]) - '0'
        hour := (int(date[11])-'0')*10 + int(date[12]) - '0'
        minute := (int(date[14])-'0')*10 + int(date[15]) - '0'
        second := (int(date[17])-'0')*10 + int(date[18]) - '0'
        return time.Date(year, month, day, hour, minute, second, 0, time.UTC), nil
    }
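
Note that ParseDate3 and ParseDate4 index up to `date[18]` and never check that the bytes are digits, so a short input panics and a malformed one silently yields a wrong date. A minimal sketch of a checked variant, assuming the layout is exactly `2006/01/02 15:04:05`; the names `ParseDateChecked` and `errLayout` are hypothetical and not part of the original answer:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errLayout is a hypothetical error value; the original answer defines none.
var errLayout = errors.New("date does not match 2006/01/02 15:04:05")

// ParseDateChecked (hypothetical name) verifies length, separators and digits
// before applying the same inlined digit arithmetic as ParseDate4.
func ParseDateChecked(date string) (time.Time, error) {
	const layout = "2006/01/02 15:04:05"
	if len(date) != len(layout) {
		return time.Time{}, errLayout
	}
	for i := 0; i < len(layout); i++ {
		c := date[i]
		switch layout[i] {
		case '/', ' ', ':':
			// Separator positions must match the layout exactly.
			if c != layout[i] {
				return time.Time{}, errLayout
			}
		default:
			// Every other position must hold an ASCII digit.
			if c < '0' || c > '9' {
				return time.Time{}, errLayout
			}
		}
	}
	year := (((int(date[0])-'0')*10+int(date[1])-'0')*10+int(date[2])-'0')*10 + int(date[3]) - '0'
	month := time.Month((int(date[5])-'0')*10 + int(date[6]) - '0')
	day := (int(date[8])-'0')*10 + int(date[9]) - '0'
	hour := (int(date[11])-'0')*10 + int(date[12]) - '0'
	minute := (int(date[14])-'0')*10 + int(date[15]) - '0'
	second := (int(date[17])-'0')*10 + int(date[18]) - '0'
	return time.Date(year, month, day, hour, minute, second, 0, time.UTC), nil
}

func main() {
	for _, in := range []string{"2014/11/30 18:09:05", "2014/11/30"} {
		t, err := ParseDateChecked(in)
		fmt.Println(t, err)
	}
}
```

The extra loop costs some time, so it is worth benchmarking against the unchecked version. Out-of-range fields such as month 13 are still normalized by `time.Date` rather than rejected.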
    
    This answer was accepted by the asker as the best answer.
  • drtj40036 2014-11-30 19:30

    From what you have already shown, using strconv.Atoi directly improved your performance. You can push it further and roll your own atoi for your particular use case.

    You expect each item to be a positive base-10 number. You also know it can't overflow, because the longest string passed has 4 characters. The only possible error is then a non-digit character in the string. Knowing this, we can simply do the following:

    var atoiError = errors.New("invalid number")
    func atoi(s string) (x int, err error) {
        i := 0
        for ; i < len(s); i++ {
            c := s[i]
            if c < '0' || c > '9' {
                err = atoiError
                return
            }
            x = x*10 + int(c) - '0'
        }
        return
    }
    

    Wrapping this into a ParseDate3 function, I get the following results:

    BenchmarkParseDate1  5000000           355 ns/op
    BenchmarkParseDate2 10000000           278 ns/op
    BenchmarkParseDate3 20000000            88 ns/op
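
The wrapped function is not shown in the answer. One possible shape, as a sketch: the name `ParseDateAtoi`, the length check, and the offset table are assumptions, while `atoi` follows the answer's code (returning 0 instead of a partial value on error).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var atoiError = errors.New("invalid number")

// atoi converts a short, non-negative base-10 string. The only error it
// reports is a non-digit character; overflow is not checked because the
// inputs here are at most 4 characters long.
func atoi(s string) (int, error) {
	x := 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c < '0' || c > '9' {
			return 0, atoiError
		}
		x = x*10 + int(c) - '0'
	}
	return x, nil
}

// ParseDateAtoi (hypothetical name) slices the six numeric fields out of
// "2006/01/02 15:04:05" and converts each with atoi. Separators are not
// inspected, only the digit fields.
func ParseDateAtoi(strdate string) (time.Time, error) {
	if len(strdate) < 19 {
		return time.Time{}, atoiError
	}
	// Start and end offsets of year, month, day, hour, minute, second.
	bounds := [6][2]int{{0, 4}, {5, 7}, {8, 10}, {11, 13}, {14, 16}, {17, 19}}
	var f [6]int
	for i, b := range bounds {
		v, err := atoi(strdate[b[0]:b[1]])
		if err != nil {
			return time.Time{}, err
		}
		f[i] = v
	}
	return time.Date(f[0], time.Month(f[1]), f[2], f[3], f[4], f[5], 0, time.UTC), nil
}

func main() {
	t, err := ParseDateAtoi("2014/11/30 19:30:00")
	fmt.Println(t, err)
}
```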
    

    You could make it faster by not returning an error in atoi, but I encourage you to test the input anyway (unless it's validated somewhere else in your code).

    Alternative atoi approach after seeing the inlined solution:

    Pushing this even further, you could take advantage of the fact that all but one of the passed strings are 2 digits long (the year is 4 digits, but that is a multiple of two). An atoi that takes a 2-digit string eliminates the for loop. Example:

    // Converts string of 2 characters into a positive integer, returns -1 on error
    func atoi2(s string) int {
        x := uint(s[0]) - uint('0')
        y := uint(s[1]) - uint('0')
        if x > 9 || y > 9 {
            return -1 // error
        }
        return int(x*10 + y)
    }
    

    Converting the year into a number then needs a 2-step approach:

    year := atoi2(strdate[0:2])*100 + atoi2(strdate[2:4])
    

    This gives additional improvement:

    BenchmarkParseDate4 50000000            61 ns/op
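
One caveat when combining `atoi2` results: the -1 has to be checked per field before any arithmetic, because `atoi2("20")*100 + atoi2("1x")` evaluates to 1999, which looks like a valid year. A minimal sketch of a full parser built on `atoi2`, assuming the fixed layout from the question; `ParseDateAtoi2` and `errInvalid` are hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errInvalid is a hypothetical error value for this sketch.
var errInvalid = errors.New("invalid date")

// atoi2 converts a string of 2 characters into a non-negative integer and
// returns -1 on error. A byte below '0' wraps around to a large uint, so a
// single "> 9" test covers both sides of the digit range.
func atoi2(s string) int {
	x := uint(s[0]) - uint('0')
	y := uint(s[1]) - uint('0')
	if x > 9 || y > 9 {
		return -1
	}
	return int(x*10 + y)
}

// ParseDateAtoi2 (hypothetical name) parses "2006/01/02 15:04:05" using
// atoi2 for every 2-character group. Separators are not inspected.
func ParseDateAtoi2(strdate string) (time.Time, error) {
	if len(strdate) < 19 {
		return time.Time{}, errInvalid
	}
	yh := atoi2(strdate[0:2])
	yl := atoi2(strdate[2:4])
	month := atoi2(strdate[5:7])
	day := atoi2(strdate[8:10])
	hour := atoi2(strdate[11:13])
	minute := atoi2(strdate[14:16])
	second := atoi2(strdate[17:19])
	// Each value is either -1 or in 0..99, so the bitwise OR is negative
	// exactly when at least one group failed to convert.
	if yh|yl|month|day|hour|minute|second < 0 {
		return time.Time{}, errInvalid
	}
	return time.Date(yh*100+yl, time.Month(month), day, hour, minute, second, 0, time.UTC), nil
}

func main() {
	fmt.Println(ParseDateAtoi2("2014/11/30 19:30:00"))
	fmt.Println(ParseDateAtoi2("201x/11/30 19:30:00"))
}
```

Folding all results into a single OR keeps the error check to one branch, which stays close to the spirit of the loop-free `atoi2`.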
    

    Note that the inlined version proposed by @peterSO is only slightly faster (54 ns/op in my case), but the solution above lets you check for errors, while the inlined version blindly converts whatever characters it is given into a date.
