dqf67993 2013-11-21 22:53

Parsing user agents in Go (Golang) - tobie/ua-parser

I am trying to stream a large number of user agents through a Go (Golang) program to extract information about them, such as device type and OS.

The Go code in Tobie Langel's ua-parser repo looks very promising:

https://github.com/tobie/ua-parser/tree/master/go/uaparser

I created a simple program that basically adds streaming functionality to the example on the README page. To compare performance, I wrote the same kind of simple program with a Ruby gem that uses a similar approach and the same regexes.yaml file:

https://github.com/toolmantim/user_agent_parser

After compiling the Go program and testing both, the Ruby version runs 2-3 times faster than the Go version.

As far as I can see, both programs load and process the user agents in a similar manner.

I am new to Go and am wondering whether anyone sees any major optimizations or fixes that could make programs using the Go portion of this repo run faster.

I would also like to know of any other Go libraries that work well for parsing user agents.

---TESTING SIMPLE PROGRAMS TO COMPARE THE REGEXP AND PCRE LIBRARIES (as suggested in the comments below)

I have created the two programs below, one using PCRE and one using the standard regexp package. However, I don't seem to be getting a performance boost from PCRE. In fact, the PCRE version seems to be a little slower. Am I approaching this the wrong way?

--With standard regex library

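package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strings"
)

func main() {
	// Compile the pattern once, outside the read loop.
	regex := regexp.MustCompile(`Mac`)
	scanner := bufio.NewScanner(os.Stdin)

	for scanner.Scan() {
		// The user agent is the first tab-separated field of each line.
		fields := strings.Split(scanner.Text(), "\t")
		fmt.Println(regex.FindIndex([]byte(fields[0])))
	}
	// Report read errors instead of stopping silently.
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
	}
}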

--With PCRE library

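package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"

	pcre "github.com/glenn-brown/golang-pkg-pcre/src/pkg/pcre"
)

func main() {
	// Compile the pattern once, outside the read loop.
	regex := pcre.MustCompile(`Mac`, 0)
	scanner := bufio.NewScanner(os.Stdin)

	for scanner.Scan() {
		// The user agent is the first tab-separated field of each line.
		fields := strings.Split(scanner.Text(), "\t")
		fmt.Println(regex.FindIndex([]byte(fields[0]), 0))
	}
	// Report read errors instead of stopping silently.
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
	}
}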

1 answer

  • duanli9930 2013-12-22 04:04

    I would consider the rubex library. I changed ua-parser to use rubex instead and saw a 7x speed improvement. The library claims a 10x improvement, so it is worth trying with your particular application.
