From: "yui-knk (Kaneko Yuichiro) via ruby-core" Date: 2025-09-11T03:32:12+00:00 Subject: [ruby-core:123213] [Ruby Bug#21097] `x = a rescue b in c` and `def f = a rescue b in c` parsed differently between parse.y and prism Issue #21097 has been updated by yui-knk (Kaneko Yuichiro). I am working on this ticket and have some questions about grammar where I'd like to ask for Matz's opinions. # Precedence of not, in, rescue The precedence of not, in, rescue is referred on https://bugs.ruby-lang.org/issues/21097#note-1. Q1. Is it okay to combine them as follows? ```ruby # def f = not (1 in 2) def f = not 1 in 2 # def f = (not a) rescue true def f = not a rescue true # def f = (not (a in 1)) rescue true def f = not a in 1 rescue true # def f = (not a) rescue (1 in 1) def f = not a rescue 1 in 1 ``` This is consistent with how it's interpreted when written in the body of a method definition that has an `end`. ```ruby def f1 # not (1 in 2) not 1 in 2 end def f2 # (not a) rescue true not a rescue true end def f3 # (not (a in 1)) rescue true not a in 1 rescue true end def f4 # (not a) rescue (1 in 1) not a rescue 1 in 1 end ``` By the way, the operator precedence derived from this interpretation is `modifier_rescue < not < in`, which differs from the `in < not < modifier_rescue` defined in [the precedence table](https://github.com/ruby/ruby/blob/6a4222f47519fe79ab940c9f653b0fad0771176f/parse.y#L2878-L2883). # Precedence of =, in, rescue I think this ticket is about making the precedence of `=`, `in`, and `rescue` consistent. From the desire for `x = a rescue b in c` to be interpreted as `x = (a rescue (b in c))`, it's thought that the precedence of these operators should be in the order of `=` < `modifier_rescue` < `in`. Q2. The following code does not follow this precedence in either `parse.y` or `prism`, so is it correct to say that it's supposed to be interpreted as follows? ```ruby x = b in c # x = (b in c) def f = b in c # def f = (b in c ) ``` # What's happening in `parse.y` In the `parse.y` definition, single-line pattern matching can't appear on the right-hand side of an assignment. Furthermore, the left-hand side of single-line pattern matching (the left side of `in` or `=>`) is an `arg` rule, and an `arg` can contain an assignment. Because of this, `x = a rescue b in c` is interpreted as `(x = a rescue b) in c`. # Try to make single-line pattern matching to be `arg` In Ruby's grammar, the right-hand side of an assignment is often the `arg` rule. For example, the right-hand side of `x = 1`, `x = 1 + 2`, and `x = obj.m` are all `arg`. These assignments have the unique characteristic that they can be arguments to a method. This means that `m(x = 2, x)` is valid code. Currently, single-line pattern matching is an `expr` rule, which is not part of `arg`. The `arg` grammar element, as its name suggests, can be a method argument. If we were to make single-line pattern matching an `arg`, it would cause conflicts with several tokens. ## Conflict on ���=>��� Let's consider the code `m(v => [1])`. This can be interpreted as either a pattern matching `=>` or a hash `=>`. ## Conflict on ���,��� Let's consider the code `m(v in 1, 2, 3)`. Because the brackets `[]` can be omitted in an array pattern, this code can be interpreted as either an array pattern `v in 1, 2, 3` or as a method call to `m` with multiple arguments like `m((v in 1), 2, 3)`. ## Conflict on ���|��� Let's consider the code `m(v in :a | :b)`. The `|` symbol is used in pattern matching to separate patterns. However, it's also used as a binary operator. Therefore, this code can be interpreted as either a method call to `m` with a pattern matching argument of `v in :a | :b`, or as a method call to `m` with an argument that connects `v in :a` and `:b` with the binary operator `|`. ## Conflict on ���^��� Let's consider the code `m(v in 1, ^a)`. It may be a bit surprising that an ambiguity arises with the `^` used for variable pinning. However, recall that the brackets `[]` can be omitted in an array pattern, and that a trailing comma can be used in an array pattern. This code can therefore be interpreted in two ways: either the `v in 1, ^a` part is treated as a pattern, or it's a pattern matching `v in 1,` (with a trailing comma) connected to `a` by the binary operator `^`. ## Should the ambiguity be resolved? These ambiguities aren't a parser problem; they're a grammar problem. It's possible to resolve these ambiguities by deciding on a specific interpretation for each case. For example, we could resolve the conflict with `=>` by only making `in` part of the `arg` rule for single-line pattern matching, or we could fix the conflict with `,` by prohibiting the omission of brackets `[]` in array patterns. However, let's consider a different approach here. # An alternative approach: Allow it only in assignments that aren't `arg` rule Many assignments in Ruby are `arg`, but there are some exceptions. For example, a method call where the parentheses are omitted (called a command) can be written on the right-hand side of an assignment, but that assignment cannot be used as an argument. ```ruby x = cmd 1, 2 # OK m(x = cmd 1, 2) # Syntax Error ``` Let's consider an approach that allows single-line pattern matching on the right-hand side only for these kinds of assignments, which I'll temporarily call "top-level assignments." For example, the implementation would be as [here](https://github.com/yui-knk/ruby/compare/f9bffff3d4b45d4c6fed42f13d556c520f3cfd67...ecde4c8227ac06ee56ff75cfcbf75102083685a6). Q3. What do you think about limiting assignments with pattern matching to the top level? # What can be written in the body of an endless method definition? Now, various things can be written in the body of an endless method definition. ```ruby def m = 1 + 2 def m = cmd 1, 2 def m = a in b def f = a rescue b def f = a rescue b in c ``` Let's go over a few examples of things that cannot be written in the body of an endless method definition. Since an `arg` can be written in the body, we will specifically look at the production rules included in `stmt` and `expr`. ```ruby # expr def m = cmd 1, 2 do end # SyntaxError def m = !cmd 1, 2 # SyntaxError !cmd 1, 2 # ok x = !cmd 1, 2 # SyntaxError # stmt def m = x = cmd 1, 2 # SyntaxError x = cmd 1, 2 # ok def m = x += cmd 1, 2 # SyntaxError x += cmd 1, 2 # ok def m = def m2 = obj.m 1 # SyntaxError def m2 = obj.m 1 # ok ``` I will provide a simple explanation for each. In the first code example, when we write `private def m = cmd 1, 2 do end`, an ambiguity arises over whether the `do end` block is attached to `private` or to `cmd`. The second code example is simply a matter of whether or not to allow it. Since it's allowed when it's not an assignment but prohibited on the right-hand side of an assignment, the question is which behavior to make consistent. For the three cases starting with `stmt`, a conflict occurs with `and`, `or`, and `do`. This is because the grammar has two competing rules: `command_rhs: command_call_value` and `command_rhs: command_call_value modifier_rescue after_rescue stmt`. ```ruby # def m = x = cmd 1, 2 rescue (a and b) # (def m = x = cmd 1, 2 rescue a) and b def m = x = cmd 1, 2 rescue a and b # private def m = x = (cmd 1, 2 do exp end) # private (def m = x = cmd 1, 2) do exp end private def m = x = cmd 1, 2 do exp end ``` For example, we can resolve this by changing `command_call_value` to `command` and the `stmt` after `rescue` to an `arg` (simply changing `stmt` to `expr` would still leave a conflict with `do...end` because `expr: command_call` exists). ```ruby command_rhs : command %prec tOP_ASGN | command modifier_rescue after_rescue arg ``` Q4. What can be written in the body of an endless method definition? # Associativity of modifier rescue When used as a postfix operator, `rescue` is a left-associative operator, but its behavior on the right-hand side of an assignment is slightly different. On the right-hand side of an assignment, only one postfix `rescue` is associated to an expression. ```ruby # (((a rescue b) rescue c) rescue d) a rescue b rescue c rescue d # (x = a rescue b) rescue c rescue d x = a rescue b rescue c rescue d ``` Q5. In the case of an endless method definition, it uses the same binding method as a normal (non-assignment) case. Should this be left as is? ```ruby # def m = (((a rescue b) rescue c) rescue d) def m = a rescue b rescue c rescue d ``` When we previously discussed the combination of `and` or `or` with an endless method definition, a comment in a bug report ( [https://bugs.ruby-lang.org/issues/19392#note-9](https://bugs.ruby-lang.org/issues/19392#note-9)) mentioned the analogy between an endless method definition and an assignment. Based on this, I thought that if we follow the behavior of an assignment, it might also be possible to interpret `(def m = a rescue b) rescue c rescue d`. # Certain endless method definitions are `arg` Because some endless method definitions are currently defined as `arg`, the following code is valid. ```ruby private :m, def m = 1 rescue 2 private x = def m = 1 rescue 2 ``` However, the endless method definition in the following codes is a `stmt`, which causes a syntax error. ```ruby private :m, def m = 1 rescue 2 in 3 private x = def m = 1 rescue 2 in 3 ``` This difference depends on whether pattern matching is written inside the postfix `rescue`, so in some cases, you can't tell the difference without reading quite far ahead. Adding the ability to use pattern matching on the right-hand side of an assignment, defined as a `stmt`, can be seen as a move towards increasing the gap between assignment that can be used as `arg` and those that cannot. At the same time, when the right-hand side is a command, it behaves similarly, and the command syntax has been in Ruby for a long time. So, one could say it's a characteristic of Ruby's grammar to want to include as many writable things as possible in the `arg` rule. It seems that the technique of writing an assignment as an argument has become widespread among people who participate in code golfing, a sport of writing code as short as possible. ```ruby x = cmd 1, 2, 3 # ok m(x = cmd 1, 2, 3) # SyntaxError ``` Q6. Should this distinction be maintained? Also, in cases where there is only one argument, like `private def m = 1 rescue 2 in 3`, the body can be passed as an argument even if it's not an `arg`. # The list of questions * Q1. Is it okay to combine them as follows? * Q2. The following code does not follow this precedence in either `parse.y` or `prism`, so is it correct to say that it's supposed to be interpreted as follows? * Q3. What do you think about limiting assignments with pattern matching to the top level? * Q4. What can be written in the body of an endless method definition? * Q5. In the case of an endless method definition, it uses the same binding method as a normal (non-assignment) case. Should this be left as is? * Q6. Should this distinction be maintained? Also, in cases where there is only one argument, like `private def m = 1 rescue 2 in 3`, the body can be passed as an argument even if it's not an `arg`. ---------------------------------------- Bug #21097: `x = a rescue b in c` and `def f = a rescue b in c` parsed differently between parse.y and prism https://bugs.ruby-lang.org/issues/21097#change-114543 * Author: tompng (tomoya ishida) * Status: Assigned * Assignee: nobu (Nobuyoshi Nakada) * ruby -v: ruby 3.5.0dev (2025-01-27T08:19:32Z master c3c7300b89) +YJIT +MN +PRISM [arm64-darwin22] * Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN ---------------------------------------- ~~~ruby x = a rescue b in c (x = (a rescue b)) in c # parse.y, prism(ruby 3.4) x = (a rescue (b in c)) # prism(ruby 3.5) ~~~ ~~~ruby def f = a rescue b in c #=> true(parse.y), :f(prism) (def f = (a rescue b)) in c # parse.y def f = (a rescue (b in c)) # prism ~~~ There is no difference between prism and parse.y parsing these codes ~~~ruby a rescue b in c # a rescue (b in c) x = a rescue b # x = (a rescue b) x = b in c # (x = b) in c def f = a rescue b # def f = (a rescue b) def f = b in c # (def f = a) in b ~~~ -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/