dongxiang3205 2012-05-31 02:32

Semicolon insertion à la Google Go with flex

I'm interested in adding semicolon insertion à la Google Go to my flex file.

From the Go documentation:

Semicolons

Like C, Go's formal grammar uses semicolons to terminate statements; unlike C, those semicolons do not appear in the source. Instead the lexer uses a simple rule to insert semicolons automatically as it scans, so the input text is mostly free of them.

The rule is this. If the last token before a newline is an identifier (which includes words like int and float64), a basic literal such as a number or string constant, or one of the tokens

break continue fallthrough return ++ -- ) }

the lexer always inserts a semicolon after the token. This could be summarized as, “if the newline comes after a token that could end a statement, insert a semicolon”.

A semicolon can also be omitted immediately before a closing brace, so a statement such as

go func() { for { dst <- <-src } }()

needs no semicolons. Idiomatic Go programs have semicolons only in places such as for loop clauses, to separate the initializer, condition, and continuation elements. They are also necessary to separate multiple statements on a line, should you write code that way.

One caveat. You should never put the opening brace of a control structure (if, for, switch, or select) on the next line. If you do, a semicolon will be inserted before the brace, which could cause unwanted effects. Write them like this

if i < f() {
    g()
}

not like this

if i < f()  // wrong! 
{           // wrong!
    g()     // wrong!
}           // wrong!

How would I go about doing this? Specifically, how can I insert tokens into the stream, and how can I inspect the last matched token to decide whether an insertion is appropriate?

I am using bison too, but Go seems to do semicolon insertion entirely in its lexer.


  • ds2010630 2012-06-04 21:47
    You could pass the lexer's result tokens through a function that inserts semicolons where necessary. When it detects that an insertion is needed, it can put the next token back into the input stream, so that the token is lexed again on the next call.

    Below is an example that inserts a SEMICOLON before a newline, when it follows a WORD. The bison file "insert.y" is this:

    %{
    #include <stdio.h>
    #include <stdlib.h>   /* free() */

    int yylex(void);      /* provided by the flex-generated scanner */
    int yyparse(void);

    void yyerror(const char *str) {
      printf("ERROR: %s\n", str);
    }

    int main() {
      yyparse();
      return 0;
    }
    %}
    %union {
      char *string;
    }
    %token <string> WORD
    %token SEMICOLON NEWLINE
    %%
    input:
         | input WORD          {printf("WORD: %s\n", $2); free($2);}
         | input SEMICOLON     {printf("SEMICOLON\n");}
         ;
    %%
    

    and the lexer is generated by flex from this:

    %{
    #include <string.h>
    #include "insert.tab.h"
    int f(int token);
    %}
    %option noyywrap
    %%
    [ \t]          ;
    [^ \t\n;]+     {yylval.string = strdup(yytext); return f(WORD);}
    ;              {return f(SEMICOLON);}
    \n             {int token = f(NEWLINE); if (token != NEWLINE) return token;}
    %%
    int insert = 0;   /* set when the previous token was a WORD */

    /* Filter every token: a NEWLINE that follows a WORD becomes a SEMICOLON. */
    int f(int token) {
      if (insert && token == NEWLINE) {
        unput('\n');   /* put the newline back so it is lexed again next time */
        insert = 0;
        return SEMICOLON;
      } else {
        insert = token == WORD;
        return token;
      }
    }
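
The `insert = token == WORD` test in `f` is where a fuller rule such as Go's would go. As a rough sketch only, the predicate below encodes the token list quoted in the question. The `TOK_*` names and `ends_statement` are hypothetical and not part of the files above; in a real build the token codes would come from the bison-generated header instead.

```c
#include <stdbool.h>

/* Hypothetical token kinds, for illustration only. */
enum token_kind {
    TOK_IDENT, TOK_LITERAL,
    TOK_BREAK, TOK_CONTINUE, TOK_FALLTHROUGH, TOK_RETURN,
    TOK_INC, TOK_DEC, TOK_RPAREN, TOK_RBRACE,
    /* a few kinds that do NOT trigger insertion */
    TOK_LBRACE, TOK_PLUS, TOK_COMMA, TOK_SEMICOLON
};

/* True if a newline directly after `last` should be turned into a
   semicolon, following the token list quoted in the question. */
static bool ends_statement(enum token_kind last)
{
    switch (last) {
    case TOK_IDENT:
    case TOK_LITERAL:
    case TOK_BREAK:
    case TOK_CONTINUE:
    case TOK_FALLTHROUGH:
    case TOK_RETURN:
    case TOK_INC:
    case TOK_DEC:
    case TOK_RPAREN:
    case TOK_RBRACE:
        return true;
    default:
        return false;
    }
}
```

With something like this in place, `f` would set `insert = ends_statement(token)` rather than comparing against `WORD` alone.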
    

    For input

    abc def
    ghi
    jkl;
    

    it prints

    WORD: abc
    WORD: def
    SEMICOLON
    WORD: ghi
    SEMICOLON
    WORD: jkl
    SEMICOLON
    

    Unputting a non-constant token (one whose text or value varies, such as a WORD) requires a little extra work. I have tried to keep the example simple, just to give the idea.
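
One possible way to handle that extra work, offered as a sketch rather than a drop-in replacement: instead of `unput`, keep a one-slot buffer that holds a whole token (code plus value) and drain it before asking the scanner again. Everything here (`struct token`, `raw_next`, `push_back`, `next_token`, the `T_*` codes) is hypothetical, and `raw_next` merely replays an array in place of the flex-generated scanner.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; a real build would use those from insert.tab.h. */
enum { T_EOF = 0, T_WORD, T_SEMICOLON, T_NEWLINE };

/* A token together with its semantic value, so it can be replayed intact. */
struct token {
    int kind;
    const char *text;   /* only meaningful for T_WORD */
};

/* Stand-in for the scanner: replays an array that ends with T_EOF. */
static const struct token *raw_input;

static struct token raw_next(void)
{
    struct token t = *raw_input;
    if (t.kind != T_EOF)
        raw_input++;
    return t;
}

/* One-slot pushback buffer. */
static struct token pending;
static int has_pending = 0;

static void push_back(struct token t)
{
    assert(!has_pending);   /* only one token of pushback is supported */
    pending = t;
    has_pending = 1;
}

static int insert = 0;      /* same role as the flag in the flex example */

/* What the parser would call instead of the raw scanner. */
static struct token next_token(void)
{
    for (;;) {
        struct token t;
        if (has_pending) {
            t = pending;
            has_pending = 0;
        } else {
            t = raw_next();
        }
        if (insert && t.kind == T_NEWLINE) {
            struct token semi = { T_SEMICOLON, NULL };
            push_back(t);   /* replay the newline on the next call */
            insert = 0;
            return semi;
        }
        insert = (t.kind == T_WORD);
        if (t.kind != T_NEWLINE)
            return t;       /* newlines themselves never reach the parser */
    }
}
```

Because the slot stores the token's value along with its code, a pushed-back WORD comes out again with its text, with no need to write the characters back into the input buffer.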

    This answer was selected by the asker as the best answer.