[ruby-core:117735] [Ruby master Feature#18583] Pattern-matching: API for custom unpacking strategies?
From: "ntl (Nathan Ladd) via ruby-core" <ruby-core@...>
Date: 2024-04-28 16:11:55 UTC
List: ruby-core #117735
Issue #18583 has been updated by ntl (Nathan Ladd).
Could the match operator, `=~`, be used as a general complement to `===`?
Example (following Victor's original sketch):
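``` ruby
class Matcher
  def initialize(regexp)
    @regexp = regexp
  end

  # Existing protocol: does the subject match at all?
  def ===(obj)
    @regexp.match?(obj)
  end

  # Proposed addition: what should the pattern bind on a successful match?
  def =~(obj)
    @regexp.match(obj)
  end
end

# Under the proposed semantics, `match_data` is bound to the result of `=~`
# (a MatchData), not to the original string.
case "some string"
in ^(Matcher.new(/(?<some_named_capture>some) string/)) => match_data
  some_named_capture = match_data[:some_named_capture]
  puts "Match: #{some_named_capture}"
end
```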
This would add `=~` to the pattern matching protocol, which currently consists of `===`, `deconstruct`, and `deconstruct_keys`. It would make `===` significantly more useful, and regular expressions are a good example of why: when a string is matched against a regular expression, the string is already in lexical scope, but the match data is new and only comes into existence upon a successful match:
``` ruby
subject = "some string"
case subject
in ^(Matcher.new(/(?<some_named_capture>some) string/)) => match_data
  # Binding the match data instead of the original string
  # doesn't make the original string inaccessible:
  puts "Match subject: #{subject.inspect}"
end
```
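For comparison, here is a small sketch of what current Ruby does with the same shape of pattern when only `===` is defined. The variable names `value` and `bound` are just for illustration. The binding receives the subject itself, so the match data computed inside `===` is lost:

``` ruby
class Matcher
  def initialize(regexp)
    @regexp = regexp
  end

  # The only hook a pinned expression pattern consults today.
  def ===(obj)
    @regexp.match?(obj)
  end
end

subject = "some string"

bound =
  case subject
  in ^(Matcher.new(/(?<some_named_capture>some) string/)) => value
    value
  end

# `bound` is the very same String object as `subject`, not a MatchData.
puts bound.equal?(subject)
```

This is the gap the `=~` idea is meant to close: the pattern object knows more than "yes, it matched", but has no way to hand that back.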
I also suspect this could be embedded into the pattern syntax itself, and could allow for some interesting possibilities. One example that leaps to mind is reifying primitive data parsed from JSON into a data structure:
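``` ruby
require "json"

SomeStruct = Struct.new(:some_attr, :some_other_attr, keyword_init: true) do
  def self.===(data)
    data.is_a?(Hash) && data.key?(:some_attr) && data.key?(:some_other_attr)
  end

  def self.=~(data)
    new(**data)
  end
end

# Parse JSON into raw (primitive) data; symbolize_names gives the Symbol
# keys that `===` and `new(**data)` expect
some_data = JSON.parse(<<JSON, symbolize_names: true)
{
  "some_attr": "some value",
  "some_other_attr": "some other value"
}
JSON

# Reify data structure from raw data (under the proposed semantics,
# `some_struct` would be the result of `SomeStruct =~ some_data`)
case some_data
in SomeStruct => some_struct
  puts some_struct.inspect
end
```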
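For reference, a rough equivalent that runs on current Ruby, without any new protocol, is to destructure with a hash pattern and build the struct in the branch body. This is only a sketch for comparison; the `String` type checks are my own assumption about the expected data:

``` ruby
require "json"

SomeStruct = Struct.new(:some_attr, :some_other_attr, keyword_init: true)

some_data = JSON.parse(<<JSON, symbolize_names: true)
{
  "some_attr": "some value",
  "some_other_attr": "some other value"
}
JSON

some_struct =
  case some_data
  in { some_attr: String => some_attr, some_other_attr: String => some_other_attr }
    # The reification step has to be spelled out by hand in every branch.
    SomeStruct.new(some_attr: some_attr, some_other_attr: some_other_attr)
  end

puts some_struct.inspect
```

The difference is that the knowledge of how to build a `SomeStruct` lives at the call site rather than in the pattern, which is what `=~` would move.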
----------------------------------------
Feature #18583: Pattern-matching: API for custom unpacking strategies?
https://bugs.ruby-lang.org/issues/18583#change-108142
* Author: zverok (Victor Shepelev)
* Status: Open
----------------------------------------
I started to think about it when discussing https://github.com/ruby/strscan/pull/30.
The thing is, using StringScanner for many complicated parsers involves some kind of branching.
In pseudocode, the "ideal API" would allow writing something like this:
```ruby
case <what next matches>
in /regexp1/ => value_that_matched
  # use value_that_matched
in /regexp2/ => value_that_matched
  # use value_that_matched
# ...
end
```
Intuitively, it seems there *should* be some way of implementing this, but we fall short. We can write a StringScanner-specific matcher object which defines its own `#===` and use it with pinning:
```ruby
case scanner
in ^(Matcher.new(/regexp1/)) => value_that_matched
  # ...
end
```
But there is no API to tell how the match result will be unpacked; the whole `StringScanner` is simply put into `value_that_matched`.
So, I thought that maybe it would be possible to define some kind of API for pattern-like objects: a method with a signature like `try_match_pattern(value)`, which by default is implemented like `return value if self === value`, but can be redefined to return something different, like a part of the object, or the object transformed somehow.
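To make the idea concrete, here is a minimal emulation in plain Ruby. Nothing here exists in Ruby today: `try_match_pattern` is the hypothetical hook from the paragraph above, and `DefaultPattern`, `ScanMatcher`, and `match_first` are names invented for this sketch, with `match_first` standing in for what `case`/`in` would do internally.

```ruby
require "strscan"

# Default behaviour from the proposal: the matched value is the subject itself.
class DefaultPattern
  def initialize(pattern)
    @pattern = pattern
  end

  def try_match_pattern(value)
    value if @pattern === value
  end
end

# A StringScanner-specific pattern that returns the scanned text
# instead of the scanner.
class ScanMatcher
  def initialize(regexp)
    @regexp = regexp
  end

  def try_match_pattern(scanner)
    scanner.scan(@regexp) # String on success, nil otherwise
  end
end

# Hand-rolled stand-in for `case subject; in pattern => value`.
def match_first(subject, *patterns)
  patterns.each do |pattern|
    result = pattern.try_match_pattern(subject)
    return [pattern, result] unless result.nil?
  end
  nil
end

scanner = StringScanner.new("42 apples")
word = ScanMatcher.new(/[a-z]+/)
number = ScanMatcher.new(/\d+/)

matched_pattern, value_that_matched = match_first(scanner, word, number)
puts value_that_matched.inspect
```

One thing the sketch glosses over: returning `nil` for "no match" (as in `return value if self === value`) makes it impossible to match and unpack `nil` or to tell it apart from failure, so a real protocol would probably need a distinct way to signal a non-match.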
This would open up some interesting (if maybe uncanny) possibilities: not just slicing out the necessary part, but also something like
```ruby
value => ^(type_caster(Integer)) => int_value
```
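A possible shape for such a caster, again only as an emulation: `type_caster`, `TypeCaster`, and `try_match_pattern` are hypothetical names, and the hook is called by hand because `case`/`in` has no such protocol. The conversions use the real `Integer(value, exception: false)` and `Float(value, exception: false)` forms, which return `nil` instead of raising.

```ruby
# Hypothetical pattern object that transforms the value it matches.
class TypeCaster
  CASTERS = {
    Integer => ->(value) { Integer(value, exception: false) },
    Float => ->(value) { Float(value, exception: false) }
  }.freeze

  def initialize(type)
    @caster = CASTERS.fetch(type) # KeyError for types this sketch doesn't cover
  end

  # Returns the converted value, or nil when the value can't be converted.
  def try_match_pattern(value)
    @caster.call(value)
  end
end

def type_caster(type)
  TypeCaster.new(type)
end

# What `value => ^(type_caster(Integer)) => int_value` might boil down to:
int_value = type_caster(Integer).try_match_pattern("42")
no_match = type_caster(Integer).try_match_pattern("forty-two")

puts int_value.inspect
puts no_match.inspect
```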
So... Just a discussion topic!