In my attempt to create a CSS3 parser (https://github.com/tdewolff/css) for my minification library, I'm in the process of optimizing the tokenizer, parser, and minifier. The tokenizer is pretty fast, but I believe the parser could be optimized.
The parser processes the tokens and generates a tree of nodes representing the syntax. All nodes satisfy this interface:
type Node interface {
	Type() NodeType
	String() string
}
Example nodes (see https://github.com/tdewolff/css/blob/master/node.go):
// root
type NodeStylesheet struct {
	NodeType
	Nodes []Node
}

// example node
type NodeRuleset struct {
	NodeType
	SelGroups []*NodeSelectorGroup
	Decls     []*NodeDeclaration
}
// leaf (the only possible leaf node)
type NodeToken struct {
	NodeType
	TokenType
	Data string
}
Parsing stylesheet (see https://github.com/tdewolff/css/blob/master/parse.go):
func (p *parser) parseStylesheet() *NodeStylesheet {
	n := NewStylesheet()
	for {
		p.skipWhitespace()
		if p.at(ErrorToken) {
			return n
		}
		if p.at(CDOToken) || p.at(CDCToken) {
			n.Nodes = append(n.Nodes, p.shift())
		} else if cn := p.parseAtRule(); cn != nil {
			n.Nodes = append(n.Nodes, cn)
		} else if cn := p.parseRuleset(); cn != nil {
			n.Nodes = append(n.Nodes, cn)
		} else if cn := p.parseDeclaration(); cn != nil {
			n.Nodes = append(n.Nodes, cn)
		} else if !p.at(ErrorToken) {
			n.Nodes = append(n.Nodes, p.shift())
		}
	}
}
Each node is allocated on the heap, and a significant amount of time is spent on GC and related tasks. Can I reduce that?
Could I put all elements in a flat array, for instance? The elements of the tree are filled sequentially (i.e. it can be flattened). What techniques can I use to reduce the many small heap allocations?
Update
The minifier is actually not really slow: Bootstrap (134kB) is minified in 28ms (Node.js implementations take at least 45ms and produce larger files, see http://goalsmashers.github.io/css-minification-benchmark/). But it would be great if I could squeeze out even more!
I know that some time is spent on `[]byte` -> `string` conversion, but the `[]byte` from the tokenizer needs to be copied anyway because its memory can be reused at any `tokenizer.Next()` call. Since it needs to be copied anyway, I figured converting to `string` was better because it made much of the code easier (checking for equality).
I can make a tokenizer variant that keeps the whole file in memory, which is fine for the parser because the parser doesn't stream anyway.
Update 2
I loaded the whole file into memory for the parser and replaced all `string`s with `[]byte`, and now it's 10% faster! Bootstrap.css now takes 23ms.
I don't think it's worth the hassle to flatten the tree, and it would remove some flexibility (the user would no longer be able to modify the tree).