In my attempt to create a CSS3 parser (https://github.com/tdewolff/css) for my minification library, I'm in the process of optimizing the tokenizer, parser, and minifier. The tokenizer is pretty fast, but I believe the parser could be optimized.
The parser processes the tokens and generates a tree of nodes representing the syntax. All nodes satisfy this interface:
type Node interface {
	Type() NodeType
	String() string
}
Example nodes (see https://github.com/tdewolff/css/blob/master/node.go):
// root
type NodeStylesheet struct {
	NodeType
	Nodes []Node
}

// example node
type NodeRuleset struct {
	NodeType
	SelGroups []*NodeSelectorGroup
	Decls     []*NodeDeclaration
}
// leaf (the only possible leaf node)
type NodeToken struct {
	NodeType
	TokenType
	Data string
}
Parsing stylesheet (see https://github.com/tdewolff/css/blob/master/parse.go):
func (p *parser) parseStylesheet() *NodeStylesheet {
	n := NewStylesheet()
	for {
		p.skipWhitespace()
		if p.at(ErrorToken) {
			return n
		}
		if p.at(CDOToken) || p.at(CDCToken) {
			n.Nodes = append(n.Nodes, p.shift())
		} else if cn := p.parseAtRule(); cn != nil {
			n.Nodes = append(n.Nodes, cn)
		} else if cn := p.parseRuleset(); cn != nil {
			n.Nodes = append(n.Nodes, cn)
		} else if cn := p.parseDeclaration(); cn != nil {
			n.Nodes = append(n.Nodes, cn)
		} else if !p.at(ErrorToken) {
			n.Nodes = append(n.Nodes, p.shift())
		}
	}
}
Each node is allocated on the heap, and a significant amount of time is spent on GC and related tasks. Can I reduce that?
Could I put all elements in a flat array, for instance? The elements of the tree are filled sequentially (i.e. it can be flattened). What techniques can I use to reduce the many small heap allocations?
Update
The minifier is actually not really slow: Bootstrap (134kB) is minified in 28ms (Node.js implementations take at least 45ms and produce larger files, see http://goalsmashers.github.io/css-minification-benchmark/). But it would be great if I could squeeze out even more!
I know that some time is spent on `[]byte` -> `string` conversion, but the `[]byte` from the tokenizer needs to be copied anyway because its memory can be reused at any `tokenizer.Next()` call. Since it needs to be copied anyway, I figured converting to `string` was better because it made much of the code easier (checking for equality).
I can make a tokenizer variant that keeps the whole file in memory, which is fine for the parser because the parser doesn't stream anyway.
Update 2
I loaded the whole file into memory for the parser and replaced all `string`s with `[]byte`, and now it's 10% faster! Bootstrap.css now takes 23ms.
I don't think it's worth the hassle to flatten the tree, and it would remove some flexibility (the user would no longer be able to modify the tree).