[ruby-core:109977] [Ruby master Feature#19013] Error Tolerant Parser
From:
"yui-knk (Kaneko Yuichiro)" <noreply@...>
Date:
2022-09-21 12:28:28 UTC
List:
ruby-core #109977
Issue #19013 has been reported by yui-knk (Kaneko Yuichiro).
----------------------------------------
Feature #19013: Error Tolerant Parser
https://bugs.ruby-lang.org/issues/19013
* Author: yui-knk (Kaneko Yuichiro)
* Status: Open
* Priority: Normal
----------------------------------------
# Background
Implementations of the Language Server Protocol (LSP) sometimes need to parse an incomplete Ruby script. For example, a user may want completion in the middle of a statement, as below:
```ruby
class A
  def m
    a = 10
    if # here users want to run completion
  end
end
```
In such a case, the LSP implementation wants to get a partial AST instead of a syntax error.
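To make the problem concrete, the sketch below feeds the incomplete script above to `RubyVM::AbstractSyntaxTree.parse` without any tolerance. It assumes CRuby, where `RubyVM::AbstractSyntaxTree` is available; the helper name `try_parse` is made up for illustration and is not part of the proposal.

```ruby
# Hypothetical helper: returns [ast, error]; exactly one of the two is nil.
def try_parse(src)
  [RubyVM::AbstractSyntaxTree.parse(src), nil]
rescue SyntaxError => e
  [nil, e]
end

# The incomplete script from the example: `if` has no condition or body yet.
src = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

ast, error = try_parse(src)
puts "AST: #{ast.inspect}"
puts "error class: #{error.class}"
```

The caller only gets an exception here, so a completion engine has no tree to inspect for `class A`, `def m`, or the local variable `a`.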
# Proposal
At the moment I want to propose 3 types of tolerance.
## 1. Complement `end` when the lexer hits end-of-input but there are not enough `end`s
Here is an example. The lexer will generate one `end` before it generates end-of-input.
```ruby
describe "1" do
  describe "2" do
    describe "3" do
      it "here" do
      end
    end
  end
```
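The effect of this tolerance can be imitated from the outside. The sketch below is not the lexer-level mechanism of the proposal: it simply appends `end` lines until the source parses, which is enough to show how many `end` tokens the lexer would need to supply. The helper name `complement_ends` and its `limit` parameter are hypothetical, and CRuby is assumed.

```ruby
# Hypothetical helper, NOT the lexer change from the proposal:
# append `end` lines until the source parses, trying at most `limit` of them.
def complement_ends(src, limit: 10)
  0.upto(limit) do |missing|
    candidate = src + "end\n" * missing
    begin
      RubyVM::AbstractSyntaxTree.parse(candidate)
      return [candidate, missing]
    rescue SyntaxError
      next # still unbalanced, try one more `end`
    end
  end
  nil # appending `end` alone could not repair the source
end

src = <<~RUBY
  describe "1" do
    describe "2" do
      describe "3" do
        it "here" do
        end
      end
    end
RUBY

repaired, missing = complement_ends(src)
puts "missing `end` count: #{missing}"
```

Doing this inside the lexer, as proposed, avoids reparsing and keeps the source locations of the real tokens unchanged.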
## 2. Extract "end" as a keyword, not an identifier, based on indentation
Here is an example. The normal parser recognizes "end" on line 4 as a "local variable or method".
This causes not only a syntax error but also the `bar` method definition being treated as `Z::Foo#bar`.
Another approach is to suppress the `!IS_lex_state(EXPR_DOT)` check for "end".
```ruby
module Z
  class Foo
    foo.
  end
  def bar
  end
end
```
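The current behavior can be observed with `RubyVM::AbstractSyntaxTree` on CRuby. The sketch below first parses just `foo.` followed by `end` to show that `end` is consumed as a method name, and then parses the whole example to show that it ends in a syntax error. The variable names are illustrative only.

```ruby
# `end` directly after a trailing dot is taken as a method name, not a keyword.
ast = RubyVM::AbstractSyntaxTree.parse("foo.\nend\n")
call = ast.children.last            # body of the top-level SCOPE node
puts call.type                      # node type of the expression
puts call.children[1].inspect       # method name built from the `end` token

# The full example: `class Foo` is closed one `end` too late,
# so `module Z` is never closed.
src = <<~RUBY
  module Z
    class Foo
      foo.
    end
    def bar
    end
  end
RUBY

error = begin
  RubyVM::AbstractSyntaxTree.parse(src)
  nil
rescue SyntaxError => e
  e
end
puts "error class: #{error.class}"
```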
## 3. Change locations of `error`
Currently `error` is placed in `top_stmts` and `stmts`, as in `top_stmts: error top_stmt` and `stmts: error stmt`.
However, these rules are too strict to catch syntax errors, so I want to move `error` to `stmt: error` and `expr_value: error`.
# Interface
* Adding `error_tolerant` option to `RubyVM::AbstractSyntaxTree.parse`
* Adding `--error-tolerant-parser` option to ruby command for debugging
  * This option is valid only when `--dump=yydebug`, `--dump=parsetree` or `--dump=parsetree_with_comment` is passed
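A call using the proposed option might look like the sketch below. The `error_tolerant` keyword is taken from this proposal, so it only works on a Ruby build that includes it; the helper `tolerant_parse` is hypothetical and returns `nil` when the option is not available.

```ruby
# Hypothetical helper: parse with the proposed `error_tolerant` option when the
# running Ruby supports it, and return nil otherwise.
def tolerant_parse(src)
  RubyVM::AbstractSyntaxTree.parse(src, error_tolerant: true)
rescue ArgumentError, SyntaxError
  nil # option unknown on this Ruby, or the parser still gave up
end

src = <<~RUBY
  class A
    def m
      a = 10
      if # here users want to run completion
    end
  end
RUBY

ast = tolerant_parse(src)
if ast
  puts "got a partial AST rooted at #{ast.type}"
else
  puts "error_tolerant is not available on this Ruby"
end
```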
# Compatibility
Changing the location of `error` can lead to incompatibility. I observed that at least 2 test cases in ruby/ruby are broken by this change.
I think both of them depend on how ripper behaves after it raises a syntax error.
* RDoc: https://github.com/yui-knk/ruby/commit/1dabbe508f0cc3dd4f83aa72502bbf347029dd8c
  * However, the Ruby script in the heredoc is invalid...
* irb: https://github.com/yui-knk/ruby/commit/e18be19ecd044eb26a56f6f9ba4f19d40c01a9c7
  * The range of error coloring is changed
All other changes are related to the lexer, not the parser, and they are controlled by the `error_tolerant` option. Therefore no behavior change is expected for the Ruby parser or ripper.
# Implementation
https://github.com/yui-knk/ruby/tree/error_recovery_indent_aware