Map access bottleneck in Golang

I am using Go to implement naive Bayes classification for a dataset with over 30,000 possible tags. I have built the model and am now in the classification phase. Classifying 1,000 records takes up to 5 minutes. I profiled the code with pprof; the top 10 entries are shown below:

Total: 28896 samples
   16408  56.8%  56.8%    24129  83.5% runtime.mapaccess1_faststr
    4977  17.2%  74.0%     4977  17.2% runtime.aeshashbody
    2552   8.8%  82.8%     2552   8.8% runtime.memeqbody
    1468   5.1%  87.9%    28112  97.3% main.(*Classifier).calcProbs
     861   3.0%  90.9%      861   3.0% math.Log
     435   1.5%  92.4%      435   1.5% runtime.markspan
     267   0.9%  93.3%      302   1.0% MHeap_AllocLocked
     187   0.6%  94.0%      187   0.6% runtime.aeshashstr
     183   0.6%  94.6%     1137   3.9% runtime.mallocgc
     127   0.4%  95.0%      988   3.4% math.log10

Surprisingly, map access seems to be the bottleneck. Has anyone experienced this? What other key-value data structure could I use to avoid it? All of the map access happens in the following piece of code:

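// Requires imports: container/heap, math, strings.

// calcProbs scores data against every class and returns the scored
// classes in a bounded priority queue.
func (nb *Classifier) calcProbs(data string) *BoundedPriorityQueue {
    probs := &BoundedPriorityQueue{}
    heap.Init(probs)

    terms := strings.Split(data, " ")
    for class, prob := range nb.classProb {
        condProb := prob
        clsProbs := nb.model[class] // term probabilities for this class
        for _, term := range terms {
            // One string-keyed map lookup per (class, term) pair.
            if termProb := clsProbs[term]; termProb != 0 {
                condProb += math.Log10(termProb)
            } else {
                condProb -= 6 // math.Log10(0.000001)
            }
        }
        heap.Push(probs, &Item{value: class, priority: condProb})
    }
    return probs
}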

The maps are nb.classProb, which is a map[string]float64, and nb.model, which is a nested map of type

map[string]map[string]float64

2 answers

doumu9799 2014-01-11 00:46
In addition to what @tomwilde said, another approach that may speed up your algorithm is string interning. Namely, if you know the domain of keys ahead of time, you can avoid using a map in the hot loop entirely: map each key to an integer ID once, then index a slice instead. I wrote a small package that will do string interning for you.
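
To make the idea concrete, here is a minimal sketch of interning applied to the loop in the question. This is not the package mentioned above; Interner, score and unseenLogProb are hypothetical names made up for this example. It assumes the vocabulary is known when the model is built and that each class can hold one slice of precomputed log10 probabilities indexed by term ID, with unseen terms prefilled to the same -6 fallback the question uses. Each record's terms are looked up in a map once, and the per-class loop then uses only slice indexing.

```go
package main

import (
	"fmt"
	"strings"
)

// Interner assigns a dense integer ID to each distinct string.
// Hypothetical helper for illustration.
type Interner struct {
	ids map[string]int
}

func NewInterner() *Interner {
	return &Interner{ids: make(map[string]int)}
}

// Intern returns the ID for s, assigning the next free ID if s is new.
func (in *Interner) Intern(s string) int {
	if id, ok := in.ids[s]; ok {
		return id
	}
	id := len(in.ids)
	in.ids[s] = id
	return id
}

// Lookup returns the ID for s and whether s has been interned.
func (in *Interner) Lookup(s string) (int, bool) {
	id, ok := in.ids[s]
	return id, ok
}

// Same fallback as the question: log10(0.000001).
const unseenLogProb = -6

// score adds up log-probabilities for one class using slice indexing only.
// logProbs[id] is the precomputed log10 P(term|class); an ID of -1 marks a
// term that is not in the vocabulary at all.
func score(prior float64, logProbs []float64, termIDs []int) float64 {
	total := prior
	for _, id := range termIDs {
		if id >= 0 && id < len(logProbs) {
			total += logProbs[id]
		} else {
			total += unseenLogProb
		}
	}
	return total
}

func main() {
	in := NewInterner()
	vocab := []string{"go", "map", "slow"}
	for _, t := range vocab {
		in.Intern(t)
	}

	// One row per class; a single class is shown here.
	row := make([]float64, len(vocab))
	for i := range row {
		row[i] = unseenLogProb
	}
	row[in.Intern("go")] = -1  // log10(0.1)
	row[in.Intern("map")] = -2 // log10(0.01)

	// Intern the record's terms once, outside the per-class loop.
	terms := strings.Split("go map slow unknown", " ")
	ids := make([]int, len(terms))
	for i, t := range terms {
		if id, ok := in.Lookup(t); ok {
			ids[i] = id
		} else {
			ids[i] = -1
		}
	}

	fmt.Println(score(0, row, ids)) // prints -15
}
```

A dense row per class costs 8 bytes per vocabulary term, so with over 30,000 classes this layout may not fit in memory for a large vocabulary; whether it pays off should be checked with a profile or benchmark on the real data.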
