doucai5315 2017-07-07 02:37

ANTLR4: no viable alternative at input

I am trying to create a simple HOCON parser, starting from the existing JSON grammar.

The grammar is defined as:

/** Taken from "The Definitive ANTLR 4 Reference" by Terence Parr */

// Derived from http://json.org
grammar HOCON;

hocon
   : value
   | pair
   ;

obj
   : object_begin pair (','? pair)* object_end
   | object_begin object_end
   ;

pair
   : STRING KV? value {fmt.Println("pairstr",$STRING.GetText())}
   | KEY KV? value {fmt.Println("pairkey",$KEY.GetText())}
   ;

array
   : array_begin value (',' value)* array_end
   | array_begin array_end
   ;

value
   : STRING {fmt.Println($STRING.GetText())}
   | REFERENCE {fmt.Println($REFERENCE.GetText())}
   | RAWSTRING {fmt.Println($RAWSTRING.GetText())}
   | NUMBER {fmt.Println($NUMBER.GetText())}
   | obj
   | array
   | 'true'
   | 'false'
   | 'null'
   ;

COMMENT
   : '#' ~( '\r' | '\n' )* -> skip
   ;

STRING
   : '"' (ESC | ~ ["\\])* '"'
   | '\'' (ESC | ~ ['\\])* '\''
   ;

RAWSTRING
   : (ESC | ALPHANUM)+
   ;

KEY
   : ( '.' | ALPHANUM | '-')+
   ;

REFERENCE
   : '${' (ALPHANUM|'.')+ '}'
   ;

fragment ESC
   : '\\' (["\\/bfnrt] | UNICODE)
   ;


fragment UNICODE
   : 'u' HEX HEX HEX HEX
   ;

fragment ALPHANUM
   : [0-9a-zA-Z]
   ;

fragment HEX
   : [0-9a-fA-F]
   ;

KV
   : [=:]
   ;

array_begin
   : '[' { fmt.Println("BEGIN [") }
   ;

array_end
   : ']' { fmt.Println("] END") }
   ;

object_begin
   : '{' { fmt.Println("OBJ {") }
   ;

object_end
   : '}' { fmt.Println("} OBJ") }
   ;

NUMBER
   : '-'? INT '.' [0-9] + EXP? | '-'? INT EXP | '-'? INT
   ;

fragment INT
   : '0' | [1-9] [0-9]*
   ;

// no leading zeros

fragment EXP
   : [Ee] [+\-]? INT
   ;

// \- since - means "range" inside [...]

WS
   : [ \t\n\r] + -> skip
   ;

The error is:

line 2:2 no viable alternative at input '{journal'
pairkey akka.persistence

The sample input that produces the error is:

akka.persistence {
  journal {
    # Absolute path to the journal plugin configuration entry used by
    # persistent actor or view by default.
    # Persistent actor or view can override `journalPluginId` method
    # in order to rely on a different journal plugin.
    plugin = ""
  }
}

However, if I change it to use quoted strings as keys:

akka.persistence {
  'journal' {
    # Absolute path to the journal plugin configuration entry used by
    # persistent actor or view by default.
    # Persistent actor or view can override `journalPluginId` method
    # in order to rely on a different journal plugin.
    'plugin' = ""
  }
}

everything works as expected.

It looks like I am missing something in the KEY definition, but I can't figure out what exactly.

The Go code to test it out is:

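package main

import (
    "log"

    "github.com/antlr/antlr4/runtime/Go/antlr"
    "go-hocon/parser"
)

func main() {
    // Read the sample config and stop early if the file cannot be opened.
    is, err := antlr.NewFileStream("test/simple1.conf")
    if err != nil {
        log.Fatal(err)
    }

    lex := parser.NewHOCONLexer(is)
    // 0 is the default token channel.
    p := parser.NewHOCONParser(antlr.NewCommonTokenStream(lex, 0))
    p.BuildParseTrees = true
    p.Hocon() // start rule; the grammar actions print as it parses
}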

1 answer

  • duanqiang2617 2017-07-07 11:16

    Your first input makes journal lex as a RAWSTRING.

    [@0,0:15='akka.persistence',<KEY>,1:0]
    [@1,17:17='{',<'{'>,1:17]
    [@2,22:28='journal',<RAWSTRING>,2:2]
    [@3,30:30='{',<'{'>,2:10]
    [@4,277:282='plugin',<RAWSTRING>,7:4]
    [@5,284:284='=',<KV>,7:11]
    [@6,286:287='""',<STRING>,7:13]
    [@7,292:292='}',<'}'>,8:2]
    [@8,295:295='}',<'}'>,9:0]
    [@9,298:297='<EOF>',<EOF>,10:0]
    line 2:2 no viable alternative at input '{journal'
    

    On the other hand, 'journal' lexes as a string, but has those single quotes which you clearly don't want:

    [@0,0:15='akka.persistence',<KEY>,1:0]
    [@1,17:17='{',<'{'>,1:17]
    [@2,22:30=''journal'',<STRING>,2:2]  <-- now it's a STRING token
    [@3,32:32='{',<'{'>,2:12]
    [@4,279:284='plugin',<RAWSTRING>,7:4]
    [@5,286:286='=',<KV>,7:11]
    [@6,288:289='""',<STRING>,7:13]
    [@7,294:294='}',<'}'>,8:2]
    [@8,297:297='}',<'}'>,9:0]
    [@9,300:299='<EOF>',<EOF>,10:0]
    line 7:4 no viable alternative at input '{plugin'
    line 8:2 mismatched input '}' expecting {'true', 'false', 'null', '[', '{', STRING, RAWSTRING, REFERENCE, KV, NUMBER}
    

    Why? Because the lexer chooses a rule in the following way:

    1. The longest match wins.
    2. On a tie, implicit tokens (literals written directly in parser rules, like 'true') take precedence.
    3. Otherwise, the lexer rule that is defined first in the grammar wins.

    In your case, quoting 'journal' makes it match the STRING rule, so that key seems to work, but only because of the single quotes. Without the quotes, journal and plugin are matched equally well by RAWSTRING and KEY, and RAWSTRING is defined first, so both come out as RAWSTRING. That fits neither alternative of pair, which expects a STRING or a KEY. Here is the first alternative:

    pair
       : STRING KV? value //{fmt.Println("pairstr",$STRING.GetText())}
    

    Hence the error.
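
As a rough illustration of the longest-match and rule-order behaviour described above, here is a small Go sketch that imitates just two of the question's lexer rules with regular expressions. It is not how ANTLR is implemented; the `rule` type and the `classify` function are hypothetical names made up for this example, and the ESC part of RAWSTRING is left out to keep it short.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a simplified stand-in for one lexer rule (hypothetical type).
type rule struct {
	name string
	re   *regexp.Regexp
}

// Same order as in the question's grammar: RAWSTRING comes before KEY.
// Assumption: ESC is omitted from RAWSTRING for brevity.
var rules = []rule{
	{"RAWSTRING", regexp.MustCompile(`^[0-9a-zA-Z]+`)},
	{"KEY", regexp.MustCompile(`^[.0-9a-zA-Z-]+`)},
}

// classify returns the winning rule name and the text it matched at the
// start of input: the longest match wins, ties go to the rule listed first.
func classify(input string) (string, string) {
	bestName, bestText := "", ""
	for _, r := range rules {
		m := r.re.FindString(input)
		if len(m) > len(bestText) { // strict ">" keeps the earlier rule on a tie
			bestName, bestText = r.name, m
		}
	}
	return bestName, bestText
}

func main() {
	for _, in := range []string{"akka.persistence", "journal", "plugin"} {
		name, text := classify(in)
		fmt.Printf("%s -> %s (%q)\n", in, name, text)
	}
}
```

With this ordering, `akka.persistence` comes out as KEY because only KEY can consume the dot and so produces the longer match, while `journal` and `plugin` tie in length and go to RAWSTRING, which matches the token dumps shown above.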

    How to fix? Well, I reversed the lexer rules:

    RAWSTRING
       : (ESC | ALPHANUM)+
       ;
    
    STRING
       : '"' (ESC | ~ ["\\])* '"'
       | '\'' (ESC | ~ ['\\])* '\''
       ;
    

    And changed pair:

    pair
       : RAWSTRING KV? value //{fmt.Println("pairstr",$STRING.GetText())}
    

    Now it parses fine:

    [@0,0:15='akka.persistence',<KEY>,1:0]
    [@1,17:17='{',<'{'>,1:17]
    [@2,22:28='journal',<RAWSTRING>,2:2]
    [@3,30:30='{',<'{'>,2:10]
    [@4,277:282='plugin',<RAWSTRING>,7:4]
    [@5,284:284='=',<KV>,7:11]
    [@6,286:287='""',<STRING>,7:13]
    [@7,292:292='}',<'}'>,8:2]
    [@8,295:295='}',<'}'>,9:0]
    [@9,298:297='<EOF>',<EOF>,10:0]
    