From: duerst
Date: 2022-09-22T03:04:43+00:00
Subject: [ruby-core:109984] [Ruby master Feature#19013] Error Tolerant Parser

Issue #19013 has been updated by duerst (Martin Dürst).

The topic of parsing incomplete syntax also came up in Kevin Newton's talk (see https://rubykaigi.org/2022/presentations/kddnewton.html) at RubyKaigi 2022. In the talk, he said he is working on a new parser. Maybe these efforts could be combined?

----------------------------------------
Feature #19013: Error Tolerant Parser
https://bugs.ruby-lang.org/issues/19013#change-99233

* Author: yui-knk (Kaneko Yuichiro)
* Status: Open
* Priority: Normal
----------------------------------------
# Background

Implementations of the Language Server Protocol (LSP) sometimes need to parse an incomplete Ruby script. For example, a user may want to complete an expression in the middle of a statement, as below:

```ruby
class A
  def m
    a = 10
    if # here users want to run completion
  end
end
```

In such a case, the LSP implementation wants to get a partial AST instead of a syntax error.

# Proposal

At the moment I want to propose 3 types of tolerance.

## 1. Complement `end` when the lexer hits end-of-input but there are not enough `end`s

Here is such a case. The lexer will generate one `end` before it generates end-of-input.

```ruby
describe "1" do
  describe "2" do
    describe "3" do
      it "here" do
    end
  end
end
```

## 2. Extract "end" as a keyword, not an identifier, based on indentation

Here is such a case. The normal parser recognizes "end" on line 4 as a "local variable or method". This not only causes a syntax error, but the `bar` method definition is also treated as `Z::Foo#bar`. Another approach is to suppress the `!IS_lex_state(EXPR_DOT)` check for "end".

```ruby
module Z
  class Foo
    foo.
  end

  def bar
  end
end
```

## 3. Change locations of `error`

Currently `error` is put into `top_stmts` and `stmts`, like `top_stmts: error top_stmt` and `stmts: error stmt`. However, these are too strict to catch syntax errors, so I want to move it to `stmt: error` and `expr_value: error`.

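As a rough illustration of tolerance type 1, the effect of complementing `end` can be approximated from outside the parser by appending `end` lines until the source parses. This is only a sketch and not how the proposal works: the proposal makes the lexer emit the missing `end` tokens itself. The helper name `complement_ends` and its `max_ends` parameter are hypothetical. The sketch relies on `Ripper.sexp` returning `nil` for source with a syntax error.

```ruby
require "ripper"

# Hypothetical helper (not part of the proposal): append `end` lines until the
# source parses, giving up after `max_ends` additions.
# Returns [patched_source, number_of_ends_added], or nil if it never parses.
def complement_ends(source, max_ends: 10)
  patched = source.end_with?("\n") ? source.dup : "#{source}\n"
  (0..max_ends).each do |added|
    # Ripper.sexp returns nil when the source contains a syntax error.
    return [patched, added] if Ripper.sexp(patched)
    patched += "end\n"
  end
  nil
end

# The example from section 1: four `do` keywords but only three `end`s.
SOURCE = <<~RUBY
  describe "1" do
    describe "2" do
      describe "3" do
        it "here" do
      end
    end
  end
RUBY

patched, added = complement_ends(SOURCE)
puts "added #{added} end(s)"
```

This brute-force retry only covers input that is missing trailing `end`s. It does nothing for the indentation-based case in section 2, where the misplaced `end` sits in the middle of the file.
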
# Interface

* Adding an `error_tolerant` option to `RubyVM::AbstractSyntaxTree.parse`
* Adding an `--error-tolerant-parser` option to the ruby command for debugging
  * This option is valid only when `--dump=yydebug`, `--dump=parsetree` or `--dump=parsetree_with_comment` is passed

# Compatibility

Changing the location of `error` can lead to incompatibility. I observed at least 2 test cases in ruby/ruby that are broken by this change. I think both of them depend on how ripper behaves after it raises a syntax error.

* RDoc: https://github.com/yui-knk/ruby/commit/1dabbe508f0cc3dd4f83aa72502bbf347029dd8c
  * However, the ruby script in the heredoc is invalid...
* irb: https://github.com/yui-knk/ruby/commit/e18be19ecd044eb26a56f6f9ba4f19d40c01a9c7
  * The range of error coloring is changed

All other changes are related not to the parser but to the lexer, and they are controlled by the `error_tolerant` option. Therefore no behavior change is expected for the ruby parser and ripper.

# Implementation

https://github.com/yui-knk/ruby/tree/error_recovery_indent_aware

--
https://bugs.ruby-lang.org/