From: "Eregon (Benoit Daloze)"
Date: 2021-11-03T23:46:38+00:00
Subject: [ruby-core:105924] [Ruby master Feature#12745] String#(g)sub(!) should pass a MatchData to the block, not a String

Issue #12745 has been updated by Eregon (Benoit Daloze).

The fix for #12689 seems clear: such variables need to be Thread-local (or even Fiber-local).
That's already what TruffleRuby does (mentioned in that issue) and it seems JRuby is doing the same (https://github.com/jruby/jruby/issues/3031#issuecomment-660601045).

And this is only a problem if a given method activation has blocks, those blocks from that same method call are executed in different threads, and one of them mutates $~, which already seems fairly rare.

----------------------------------------
Feature #12745: String#(g)sub(!) should pass a MatchData to the block, not a String
https://bugs.ruby-lang.org/issues/12745#change-94466

* Author: herwin (Herwin W)
* Status: Feedback
* Priority: Normal
* Assignee: matz (Yukihiro Matsumoto)
----------------------------------------
A simplified (and stupid) example: replace some placeholders in a string with function calls

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

str = '1.2.[three].4'
str.gsub!(/\[(\w+)\]/) { |m| placeholder(m) }
~~~

This raises 'Incorrect value' because the block is not passed the capture 'three', but the full matched string '[three]'. It looks like we have 3 options to fix that:

1. Match `[three]` instead of `three` in the placeholder replacement method
2. Pass `m[1..-2]` instead of `m` to the method (or strip it in `placeholder`)
3. Use `$1` in the method call, ignore the value that's passed to the block

Options 1 and 2 look like code duplication to me (and they're possible in the simplified example, but might get tricky in real situations).
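
To make the three workarounds concrete, here is a small sketch against current Ruby. It is only an illustration: `placeholder_bracketed` and the local variable names are invented for this example and do not come from the issue or the attached patch.

~~~ruby
# Same helper as in the issue: accepts only the bare capture text.
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

# Option 1 (hypothetical variant): the helper itself expects the brackets.
def placeholder_bracketed(val)
  raise 'Incorrect value' unless val == '[three]'
  '3'
end

pattern = /\[(\w+)\]/
str = '1.2.[three].4'

# What the block receives today: the whole match, as a String.
seen = []
str.gsub(pattern) { |m| seen << m; m }
# seen is now ['[three]']

# Option 1: pass the full match to a helper that knows about the brackets.
via_bracketed = str.gsub(pattern) { |m| placeholder_bracketed(m) }

# Option 2: strip the brackets from the yielded String.
via_slice = str.gsub(pattern) { |m| placeholder(m[1..-2]) }

# Option 3: ignore the block argument and read the capture from $1.
via_global = str.gsub(pattern) { placeholder($1) }

# All three variants produce '1.2.3.4'.
~~~

Options 1 and 2 both repeat knowledge of the bracket syntax outside the regexp, which is the duplication the issue complains about.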
I don't like option 3 because it completely ignores the value that's been passed to the block in favor of global variables, you can't use named captures, and writing code this way makes it incompatible with Rubinius.

I think it would be more logical to pass a `MatchData` (like what you'd get with `String#match`) instead of a `String` to the block. `MatchData#to_s` returns the whole matched string, so in 90% of the use cases the code could remain unaltered, but the remaining 10% makes it a change that shouldn't be backported to 2.3.

Attached is a very naive patch to pass a `MatchData` to the block called by `String#sub`. The additional change in `rbinstall.rb` was required to run `make install`, which actually shows an incompatibility (which I hadn't anticipated).

---Files--------------------------------
ruby_string_sub_matchdata.diff (952 Bytes)
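
As a rough illustration of the proposal described above (this is not the attached patch), the behaviour can be approximated in plain Ruby with a wrapper that yields `Regexp.last_match`. `gsub_with_match` is a hypothetical helper name made up for this sketch. It also shows why most blocks would keep working and which ones would break.

~~~ruby
def placeholder(val)
  raise 'Incorrect value' unless val == 'three'
  '3'
end

# Hypothetical helper, not part of Ruby: like String#gsub with a block,
# but the block receives the MatchData instead of the matched String.
def gsub_with_match(str, pattern)
  str.gsub(pattern) { yield Regexp.last_match }
end

str = '1.2.[three].4'

# Captures (including named ones) are reachable from the block argument.
named = gsub_with_match(str, /\[(?<name>\w+)\]/) { |m| placeholder(m[:name]) }

# Blocks that only interpolate or return the argument keep working,
# because MatchData#to_s is the whole match.
interpolated = gsub_with_match(str, /\[(\w+)\]/) { |m| "<#{m}>" }

# Blocks that call String methods on the argument would break.
broken = begin
  gsub_with_match(str, /\[(\w+)\]/) { |m| m.upcase }
rescue NoMethodError
  :no_method_error
end
~~~

The last case is the kind of incompatibility the issue attributes to the remaining 10% of callers: such code would need an explicit `m.to_s` or `m[0]`.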