I need to process large volumes of text, and one of the steps is to remove all non-alphanumeric characters. I'm trying to find an efficient way to do this.
So far I have two functions:
    func stripMap(str, chr string) string {
        return strings.Map(func(r rune) rune {
            if strings.IndexRune(chr, r) < 0 {
                return r
            }
            return -1
        }, str)
    }
The drawback here is that I have to pass in a string listing every non-alphanumeric character I want removed.
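A variation that would avoid passing the character list is to test each rune with the `unicode` package instead. This is a minimal sketch, assuming "alphanumeric" means any Unicode letter or digit; the name `stripNonAlnum` is hypothetical and I have not benchmarked it:

```go
package main

import (
	"strings"
	"unicode"
)

// stripNonAlnum is a hypothetical helper: it keeps only runes that are
// Unicode letters or digits, so no list of characters has to be supplied.
func stripNonAlnum(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}
```

Note that this version also drops spaces and keeps non-ASCII letters, so it is not a drop-in equivalent of a pattern like `[^a-zA-Z0-9 ]+`; the predicate would need adjusting if spaces must survive or only ASCII is wanted.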
And a plain old regex version:
    func stripRegex(in string) string {
        reg, _ := regexp.Compile("[^a-zA-Z0-9 ]+")
        return reg.ReplaceAllString(in, "")
    }
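Since `stripRegex` compiles the pattern on every call, one obvious thing to try is compiling it once at package level. The following is only a sketch of that idea; the names `nonAlnum` and `stripRegexPrecompiled` are hypothetical, and I have not measured how much of the gap it closes:

```go
package main

import "regexp"

// nonAlnum is compiled once at package initialization. MustCompile panics
// on an invalid pattern instead of returning an error that could be ignored.
var nonAlnum = regexp.MustCompile("[^a-zA-Z0-9 ]+")

// stripRegexPrecompiled removes every run of characters that are not
// ASCII letters, ASCII digits, or spaces, reusing the compiled pattern.
func stripRegexPrecompiled(in string) string {
	return nonAlnum.ReplaceAllString(in, "")
}
```

A compiled `*regexp.Regexp` is safe for concurrent use by multiple goroutines, so sharing one package-level value should be fine.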
The regex version is about 3.5 times slower in my benchmark and allocates far more:
    BenchmarkStripMap-8      30000     37907 ns/op     8192 B/op     2 allocs/op
    BenchmarkStripRegex-8    10000    131449 ns/op    57552 B/op    35 allocs/op
I'm looking for suggestions. Is there a better way to do this, or a way to improve the functions above?