From: email@...
Date: 2019-06-08T11:56:42+00:00
Subject: [ruby-core:93022] [Ruby trunk Feature#14912] Introduce pattern matching syntax

Issue #14912 has been updated by pitr.ch (Petr Chalupa).

Hi,

I am really looking forward to this feature. Looks great!

However, I'd like to make a few suggestions which I believe should be part of the first experimental pattern matching release. I'll include use cases and try to explain why it would be better to do so.

### (1) Pattern matching as first-class citizen

Everything in Ruby is dynamically accessible (methods, classes, blocks, etc.), so it would be a pity if patterns were an exception. There should be an object which represents the pattern and which can be lifted from the pattern literal.

It may seem that just wrapping the pattern in a lambda as follows is enough to get an object which represents the pattern.

```ruby
-> value do
  case value
  in (1..10) => i
    do_something_with i
  end
end
```

In some cases that is sufficient; however, let's explore some interesting use cases which cannot be implemented without first-class pattern matching.

**First use case** to consider is searching for a value in a data structure. Let's assume we have a data structure (e.g. some in-memory database) and we want to provide an API to search for an element with pattern matching, e.g. `#search`. The structure stores log messages as follows: `["severity", "message"]`. Then something like the following would be desirable.

```ruby
def drain_errors(data)
  # pops all messages matching at least one pattern
  # and evaluates the appropriate branch with the destructured log message
  # for each matched message
  data.pop_all case
               in ["fatal", message]
                 deal_with_fatal message
               in ["error", message]
                 deal_with_error message
               end
end
```

There are a few things to consider. Compared to the already working implementation, no message is given to the case, since it will be provided later in the pop_all method.
Therefore the `case in` here has to evaluate to an object which encapsulates the pattern matching, allowing candidates from the data structure to be matched later in the pop_all implementation. Another important feature is that the object has to allow matching a candidate without immediately evaluating the appropriate branch. It has to give the pop_all method a chance to remove the element from the data structure before the arbitrary user code from the branch is evaluated. That is especially important if the data structure is thread-safe and uses locking, because it cannot hold the lock while it runs arbitrary user code. Firstly, that limits concurrency, since no other operation can be executed on the data structure; secondly, it can lead to deadlocks, since the common recommendation is to never call user code while still holding an internal lock.

Probably the simplest implementation which would make the use case work is to make `case in` without a message behave as syntactic sugar for the following.

```ruby
case
in [/A/, b]
  b.succ
end

# turns into

-> value do
  case value
  in [/A/, b]
    -> { b.succ }
  end
end
```

Then the implementation of pop_all could look as follows.

```ruby
def pop_all(pattern)
  each do |candidate|
    # assuming each locks the structure to read the candidate
    # but releases the lock while executing the block which could
    # be arbitrary user code
    branch_continuation = pattern.call(candidate)
    if branch_continuation
      # candidate matched
      delete candidate # takes a lock internally to delete the element
      branch_continuation.call
    end
  end
end
```

In this example it never leaks the inner lock.

**Second use case**, which somewhat expands the first one, is to be able to implement the `receive` method of the concurrent abstraction called Actors. (`receive` blocks until a matching message is received.) Let's consider an actor which receives 2 Integers, adds them together, replies to an actor which asks for the result with a `[:sum, myself]` message, and then terminates.
```ruby
Actor.act do
  # block until first number is received
  first = receive case
                  in Numeric => value
                    value
                  end
  # block until second number is received, then add them
  sum = first + receive case
                        in Numeric => value
                          value
                        end
  # when a :sum command is received with the sender reference
  # send sum back
  receive case
          in [:sum, sender]
            sender.send sum
          end
end
```

It would be great if we could use pattern matching for messages as it is used in Erlang and in Elixir.

The `receive` method, like the `pop_all` method, needs to first find the first matching message in the mailbox without running the user code immediately; then it needs to take the matching message from the Actor's mailbox (while locking the mailbox temporarily) before it can be passed to the arbitrary user code in the case branch (without the lock held).

If `case in` without a message is first-class, it could be useful to also have a shortcut to define simple mono patterns.

```ruby
case
in [:sum, sender]
  sender.send sum
end
# could be just
in [:sum, sender] { sender.send sum }
```

```ruby
case
in ["fatal", _] -> message
  message
end
# could be just, default block being identity function
in ["fatal", _]
```

Then the Actor example could be written simply as follows:

```ruby
Actor.act do
  # block until first number is received
  first = receive in Numeric
  # block until second number is received, then add them
  sum = first + receive in Numeric
  # when a :sum command is received with the sender reference
  # send sum back
  receive in [:sum, sender] { sender.send sum }
end
```

### (2) Matching of non symbol key Hashes

This was already mentioned in the RubyKaigi talk as one of the problems to be looked at in the future. If `=>` is taken for the as pattern then it cannot be used to match hashes with non-Symbol keys. I would suggest using just `=` instead, so `var = pat`. Supporting non-Symbol hashes is important for use cases like:

1.
Matching data loaded from JSON where keys are strings

```ruby
case { "name" => "Gustav", **other_data }
in "name" => (name = /^Gu.*/), **other
  name #=> "Gustav"
  other #=> other_data
end
```

2. Using pattern to match the key

```ruby
# let's assume v1 of a protocol sends message {foo: data}
# but v2 sends {FOO: data},
# where data stays the same in both versions,
# then it is desirable to have one not 2 branches
case message_as_hash
in (:foo | :FOO) => data
  process data
end
```

Could that work or is there a problem with parsing `=` in the pattern?

## Note about `in [:sum, sender] { sender.send sum }`

`in [:sum, sender] { sender.send sum }` is quite similar to the `->` syntax for lambdas. However, in the suggestion above it would be de-sugared to `-> value { case value; in [:sum, sender]; -> { sender.send sum }; end }`, which is not intuitive. A solution to consider would be not to de-sugar the branch into another inner lambda but to allow checking whether an object matches the pattern (basically asking if the partial function represented by the block with a pattern match accepts the object). Then the example of implementing pop_all would look as follows.

```ruby
def pop_all(pattern)
  each do |candidate|
    # assuming each locks the structure to read the candidate
    # but releases the lock while executing the block which could
    # be arbitrary user code

    # does not execute the branches only returns true/false
    if pattern.matches?(candidate)
      # candidate matched
      delete candidate # takes a lock internally to delete the element
      pattern.call candidate
    end
  end
end
```

What are your thoughts? Do you think this could become part of the first pattern matching release?
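
For anyone who wants to experiment before any new syntax exists, the `matches?`/`call` protocol from the note above can be approximated with plain objects. This is only a minimal sketch, assuming a Ruby that already has `case`/`in` (2.7 or later); `Pattern`, `LogStore`, and `ERROR_PATTERN` are hypothetical names made up for illustration and are not part of the proposal or of any library. Each clause pairs a matcher lambda (the existing `case`/`in` wrapped in a lambda, returning the bindings or nil) with a branch lambda, so matching and running the branch are separate steps.

```ruby
# Hypothetical stand-in for a first-class pattern object.
class Pattern
  # Each clause is a [matcher, branch] pair of lambdas.
  def initialize(*clauses)
    @clauses = clauses
  end

  # True if some clause accepts the value; never runs a branch.
  def matches?(value)
    @clauses.any? { |matcher, _branch| matcher.call(value) }
  end

  # Runs the branch of the first clause that accepts the value.
  def call(value)
    @clauses.each do |matcher, branch|
      bindings = matcher.call(value)
      return branch.call(*bindings) if bindings
    end
    raise ArgumentError, "no clause matches #{value.inspect}"
  end
end

# Hypothetical thread-safe store of ["severity", "message"] entries.
class LogStore
  def initialize(entries)
    @entries = entries.dup
    @lock = Mutex.new
  end

  def pop_all(pattern)
    results = []
    # Read the candidates under the lock, then release it.
    snapshot = @lock.synchronize { @entries.dup }
    snapshot.each do |candidate|
      next unless pattern.matches?(candidate) # no lock held
      removed = @lock.synchronize do
        index = @entries.index { |entry| entry.equal?(candidate) }
        @entries.delete_at(index) if index
        !index.nil?
      end
      # The user branch runs only after the lock is released.
      results << pattern.call(candidate) if removed
    end
    results
  end

  def size
    @lock.synchronize { @entries.size }
  end
end

fatal = lambda do |log|
  case log
  in ["fatal", message] then [message]
  else nil
  end
end

error = lambda do |log|
  case log
  in ["error", message] then [message]
  else nil
  end
end

ERROR_PATTERN = Pattern.new(
  [fatal, ->(message) { "FATAL: #{message}" }],
  [error, ->(message) { "ERROR: #{message}" }]
)

store = LogStore.new([["info", "boot"], ["fatal", "disk"], ["error", "net"]])
p store.pop_all(ERROR_PATTERN) #=> ["FATAL: disk", "ERROR: net"]
p store.size                   #=> 1
```

The sketch shows the ordering the note asks for (match, remove under the lock, then run the branch without the lock), but the matcher/branch pairs have to be written by hand, which is exactly the boilerplate a first-class `case in` would remove.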
----------------------------------------
Feature #14912: Introduce pattern matching syntax
https://bugs.ruby-lang.org/issues/14912#change-78398

* Author: ktsj (Kazuki Tsujimoto)
* Status: Assigned
* Priority: Normal
* Assignee: ktsj (Kazuki Tsujimoto)
* Target version: 2.7
----------------------------------------
I propose new pattern matching syntax.

# Pattern syntax

Here's a summary of pattern syntax.

```
# case version
case expr
in pat [if|unless cond]
  ...
in pat [if|unless cond]
  ...
else
  ...
end

pat: var                                                  # Variable pattern. It matches any value, and binds the variable name to that value.
   | literal                                              # Value pattern. The pattern matches an object such that pattern === object.
   | Constant                                             # Ditto.
   | var_                                                 # Ditto. It is equivalent to pin operator in Elixir.
   | (pat, ..., *var, pat, ..., id:, id: pat, ..., **var) # Deconstructing pattern. See below for more details.
   | pat(pat, ...)                                        # Ditto. Syntactic sugar of (pat, pat, ...).
   | pat, ...                                             # Ditto. You can omit the parenthesis (top-level only).
   | pat | pat | ...                                      # Alternative pattern. The pattern matches if any of pats match.
   | pat => var                                           # As pattern. Bind the variable to the value if pat match.

# one-liner version
$(pat, ...) = expr                                        # Deconstructing pattern.
```

The patterns are run in sequence until the first one that matches.
If no pattern matches and there is no else clause, a NoMatchingPatternError exception is raised.

## Deconstructing pattern

This is similar to Extractor in Scala.

The pattern matches if:

* An object has a #deconstruct method
* The return value of the #deconstruct method is an Array or Hash, and it matches the sub patterns of this

```
class Array
  alias deconstruct itself
end

case [1, 2, 3, d: 4, e: 5, f: 6]
in a, *b, c, d:, e: Integer | Float => i, **f
  p a #=> 1
  p b #=> [2]
  p c #=> 3
  p d #=> 4
  p i #=> 5
  p f #=> {f: 6}
  e   #=> NameError
end
```

This pattern can be used as a one-liner version, like destructuring assignment.
```
class Hash
  alias deconstruct itself
end

$(x:, y: (_, z)) = {x: 0, y: [1, 2]}
p x #=> 0
p z #=> 2
```

# Sample code

```
class Struct
  def deconstruct; [self] + values; end
end

A = Struct.new(:a, :b)
case A[0, 1]
in (A, 1, 1)
  :not_match
in A(x, 1) # Syntactic sugar of above
  p x #=> 0
end
```

```
require 'json'

$(x:, y: (_, z)) = JSON.parse('{"x": 0, "y": [1, 2]}', symbolize_names: true)
p x #=> 0
p z #=> 2
```

# Implementation

* https://github.com/k-tsj/ruby/tree/pm2.7-prototype
  * Test code: https://github.com/k-tsj/ruby/blob/pm2.7-prototype/test_syntax.rb

# Design policy

* Keep compatibility
  * Don't define new reserved words
  * 0 conflict in parse.y. It passes test/test-all
* Be Ruby-ish
  * Powerful Array, Hash support
  * Encourage duck typing style
  * etc
* Optimize syntax for major use case
  * You can see several real use cases of pattern matching at following links :)
    * https://github.com/k-tsj/power_assert/blob/8e9e0399a032936e3e3f3c1f06e0d038565f8044/lib/power_assert.rb#L106
    * https://github.com/k-tsj/pattern-match/network/dependents

--
https://bugs.ruby-lang.org/

Unsubscribe: