[ruby-core:123316] [Ruby Feature#21557] Ractor.shareable_proc to make sharable Proc objects, safely and flexibly

From: "Eregon (Benoit Daloze) via ruby-core" <ruby-core@...>
Date: 2025-09-23 19:20:48 UTC
List: ruby-core #123316
Issue #21557 has been updated by Eregon (Benoit Daloze).

Assignee set to ko1 (Koichi Sasada)
Target version set to 3.5

Thank you for implementing this, @ko1!

----------------------------------------
Feature #21557: Ractor.shareable_proc to make sharable Proc objects, safely and flexibly
https://bugs.ruby-lang.org/issues/21557#change-114682

* Author: Eregon (Benoit Daloze)
* Status: Closed
* Assignee: ko1 (Koichi Sasada)
* Target version: 3.5
----------------------------------------
Following #21039 and #21550, this is a complete proposal which does not require reading these previous proposals (since that caused some confusion).
That way, it is hopefully as clear as possible.
It also explains how it solves everything we discussed in the previous tickets.
It solves all real-world examples from https://bugs.ruby-lang.org/issues/21550#note-7.

To use Ractor effectively, one needs to create Procs which are shareable between Ractors.
Of course, such Procs must not refer to any unshareable object (otherwise the Ractor invariant is broken and segfaults follow).

One key feature of blocks/Procs is to be able to capture outer variables, e.g.:
```ruby
data = ...
task = -> { do_work(data) }
```
Ractor shareable procs should be able to use captured variables, because this is one of the most elegant ways to pass data/input in Ruby.

But there is a fundamental conflict here: shareable procs cannot honor reassignments of captured variables, otherwise the Ractor invariant would be broken.
So creating a shareable proc internally makes a shallow copy of the environment, to not break the Ractor invariant.
We cannot prevent assigning local variables (i.e. raise an exception on `foo = value`); that would be way too weird.
But we can raise an error when trying to create a shareable proc in an incompatible situation, which makes it safe by preventing the unsafe cases.

## Reassigning a captured variable inside the block

Concretely, it seems we all already agree that this should be a `Ractor::IsolationError`:
```ruby
def example
  a = 1
  b = proc { v = a; a += 1; v }
  r = Ractor.shareable_proc(&b) # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned inside the block
  [b, r]
end
example.map(&:call)
```
That's because without the error the result would be `[1, 1]`, which is unexpected (it should be `[1, 2]`): `r.call` should have updated `a` to 2, but it only updated `a` in its environment copy.
That basically breaks the lexical scoping of variables captured by blocks.
We can check this by static analysis. In fact we already use static analysis for `Ractor.new`: `a = 0; Ractor.new { a = 2 }` gives `can not isolate a Proc because it accesses outer variables (a). (ArgumentError)`.
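
To see where `[1, 1]` comes from, here is a minimal plain-Ruby sketch that imitates the environment copy without using Ractor at all. The helper `counter_with_own_copy` is hypothetical: passing `a` as a method argument gives the second proc its own `a`, which stands in for the shallow copy a shareable proc would get. It is not how CRuby implements the copy.

```ruby
# Hypothetical stand-in for a shareable proc: the method parameter `a` is a
# separate variable, much like a variable in a copied environment.
def counter_with_own_copy(a)
  proc { v = a; a += 1; v }
end

def example_without_check
  a = 1
  b = proc { v = a; a += 1; v } # ordinary closure, shares `a` with the method
  r = counter_with_own_copy(a)  # imitates the copy made at creation time
  [b, r]
end

def example_with_shared_variable
  a = 1
  b = proc { v = a; a += 1; v }
  [b, b] # both calls go through the same captured `a`
end

p example_without_check.map(&:call)        # => [1, 1]
p example_with_shared_variable.map(&:call) # => [1, 2]
```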

## Reassigning a captured variable outside the block

The second problematic case is:
```ruby
# error: the code clearly assumes it can reassign `a` but the `shareable_proc` would not respect it, i.e. `shareable_proc` would break Ruby block semantics
# Also note the Ractor.shareable_proc call might be far away from the block, so one can't tell when looking at the block that it would be broken by `shareable_proc` (if no error for this case)
def example
  a = 1
  b = proc { a }
  Ractor.shareable_proc(&b) # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned outside the block
  a = 2
end
```
This is very similar (it is the symmetric case): the `shareable_proc` cannot honor the `a = 2` assignment, so creating a `shareable_proc` in that context should not be allowed and should raise `Ractor::IsolationError`.

If you don't see the issue in that small example, let's use this example:
```ruby
page_views = 0

background_jobs.schedule_every(5.seconds) {
  puts "#{page_views} page views so far"
}

threaded_webserver.on("/") do
  page_views += 1
  "Hello"
end
```
If `background_jobs` uses `Thread`, everything is fine.
If it uses `Ractor`, it needs to make that `schedule_every` block shareable, and if we don't add this safety check then it will always incorrectly print `0 page views so far`.
This is what I mean by breaking Ruby block semantics.
In this proposal, we prevent this broken-semantics situation by raising `Ractor::IsolationError` when trying to make that `schedule_every` block shareable.
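
The stale read can be reproduced without Ractor by imitating the environment copy. In this sketch, `snapshot_reporter` and `simulate_page_views` are hypothetical helpers: the method parameter plays the role of the copied `page_views`, and the `3.times` loop stands in for the web server handling requests.

```ruby
# Hypothetical: the parameter holds the value at creation time,
# like a variable in a copied environment.
def snapshot_reporter(page_views)
  -> { "#{page_views} page views so far" }
end

def simulate_page_views
  page_views = 0
  live_report = -> { "#{page_views} page views so far" } # shares the variable (Thread case)
  copied_report = snapshot_reporter(page_views)          # imitates the unchecked Ractor case
  3.times { page_views += 1 }                            # requests arriving afterwards
  [live_report.call, copied_report.call]
end

p simulate_page_views # => ["3 page views so far", "0 page views so far"]
```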

One more reason to forbid this case is that a block that is made shareable is never executed immediately on the current Ractor, because there would be no need to make it shareable in that case. So the block will be executed later, by some other Ractor.
A block that is meant to be executed later definitely expects to see up-to-date captured variables (or rather, the author of the block expects that).

We would check this situation by static analysis.
There are multiple ways to go about it with trade-offs between precision and implementation complexity.
I think we could simplify to: disallow `Ractor.shareable_proc` for any block which captures a variable which is potentially reassigned.
In other words, only allow `Ractor.shareable_proc` if all the variables it captures are assigned (exactly) once.
More on that later in section `Edge Cases`.
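
As a rough illustration of the "assigned exactly once" rule, the sketch below counts assignments per local variable using `Ripper` from the standard library. It is only an approximation under stated assumptions: `assignment_counts` and `reassigned_variables` are hypothetical helper names, the walk ignores scopes, and it only sees plain and operator assignments (not multiple assignment, `for` variables, or `eval`). The actual check would presumably live in the compiler.

```ruby
require "ripper"

# Counts `x = ...` and `x += ...` style assignments per local variable name.
def assignment_counts(source)
  counts = Hash.new(0)
  walk = lambda do |node|
    next unless node.is_a?(Array)
    if %i[assign opassign].include?(node[0]) && node[1].is_a?(Array) && node[1][0] == :var_field
      ident = node[1][1]
      counts[ident[1]] += 1 if ident.is_a?(Array) && ident[0] == :@ident
    end
    node.each { |child| walk.call(child) }
  end
  walk.call(Ripper.sexp(source))
  counts
end

# Variables that a single-assignment-only rule would refuse to capture.
def reassigned_variables(source)
  assignment_counts(source).select { |_name, count| count > 1 }.keys
end

source = <<~RUBY
  a = 1
  b = proc { a }
  a = 2
RUBY

p reassigned_variables(source) # => ["a"]
```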

## Ractor.new

Note that everything about `Ractor.shareable_proc` should also apply to `Ractor.new`. That way it's convenient to pass data via captured variables for `Ractor.new` too, for example:
```ruby
x = ...
y = ...
Ractor.new { compute(x, y) }
```

Currently `Ractor.new` does not allow capturing outer variables at all and needs workarounds such as:
```ruby
x = ...
y = ...
Ractor.new(x, y) { |x, y| compute(x, y) }
```

## define_method

`define_method` (and of course `define_singleton_method` too) have been an issue since the beginning of Ractors,
because methods defined by `define_method` just couldn't be called from a Ractor (because the block/Proc wouldn't be shareable and so can't be called from other Ractors).
A workaround is to make the block/Proc shareable, but this is inconvenient, verbose and shouldn't be necessary:
```ruby
def new_ostruct_member!(name) # :nodoc:
  unless @table.key?(name) || is_method_protected!(name)
    if defined?(::Ractor)
      getter_proc = nil.instance_eval{ Proc.new { @table[name] } }
      setter_proc = nil.instance_eval{ Proc.new {|x| @table[name] = x} }
      ::Ractor.make_shareable(getter_proc)
      ::Ractor.make_shareable(setter_proc)
    else
      getter_proc = Proc.new { @table[name] }
      setter_proc = Proc.new {|x| @table[name] = x}
    end
    define_singleton_method!(name, &getter_proc)
    define_singleton_method!("#{name}=", &setter_proc)
  end
end
```

Instead, this proposal brings the idea for `define_method` to automatically call `Ractor.shareable_proc` on the given block/Proc (and fall back to the original Proc if it would raise), as if it was defined like:
```ruby
def define_method(name, &body)
  body = Ractor.shareable_proc(self: nil, &body) rescue body
  Primitive.define_method(name, &body)
end
```
(note that `define_method` knows the `body` Proc's `self` won't be the original `self` anyway, so it's fine to change it to `nil`)
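
The "try to make it shareable, otherwise keep the original" step can be written as a small helper. This is a sketch, not the proposed implementation: `shareable_or_original` is a hypothetical name, and the feature check is an assumption added so that the code also runs on Ruby versions where `Ractor.shareable_proc` does not exist (it then simply returns the original Proc).

```ruby
# Hypothetical helper: returns a shareable copy of `body` when possible,
# otherwise the original Proc, so callers never see an exception.
def shareable_or_original(body)
  return body unless defined?(::Ractor) && ::Ractor.respond_to?(:shareable_proc)

  ::Ractor.shareable_proc(self: nil, &body)
rescue StandardError
  # e.g. Ractor::IsolationError when the block captures a reassigned variable
  body
end

getter = shareable_or_original(proc { 42 })
p getter.call # => 42
```

Either way the caller gets something callable, which is why the automatic conversion should not cause compatibility issues.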

This way workarounds like above are no longer needed and the code can be as simple as it used to be:
```ruby
def new_ostruct_member!(name) # :nodoc:
  unless @table.key?(name) || is_method_protected!(name)
    define_singleton_method!(name) { @table[name] }
    define_singleton_method!("#{name}=") { |x| @table[name] = x }
  end
end
```

Much nicer, and solves a longstanding issue with Ractor.

There should be no compatibility issue since the block is only made shareable when it's safe to do so.
This is another argument for making `Ractor.shareable_proc` safe.

## Ractor.shareable_proc and Ractor.shareable_lambda

I believe we don't need `Ractor.shareable_lambda` (mentioned in other tickets).
`Ractor.shareable_proc` should always preserve the lambda-ness (`Proc#lambda?`) of the given Proc.
The role of `Ractor.shareable_proc` is to make the Proc shareable, not change arguments handling.
If one wants a shareable lambda they can just use `Ractor.shareable_proc(&-> { ... })`.
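
For reference, this is how lambda-ness already behaves when a Proc is passed through `&` in plain Ruby, which is the behavior the proposal asks `Ractor.shareable_proc` to keep. `pass_through` is a hypothetical helper that only returns the block it receives.

```ruby
# Hypothetical helper: hands back the block it was given, unchanged.
def pass_through(&block)
  block
end

p(pass_through(&-> {}).lambda?)   # => true  (a lambda stays a lambda)
p(pass_through(&proc {}).lambda?) # => false (a proc stays a proc)
p(pass_through {}.lambda?)        # => false (a literal block becomes a proc)
```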

BTW, the added value of `Ractor.shareable_proc(self: nil, &proc)` vs just `Ractor.make_shareable(proc, copy: true)` is that it enables changing the receiver of the Proc without needing `nil.instance_eval { ... }` around, and it is much clearer.

`Ractor.make_shareable(proc)` should be an error [as mentioned here](https://bugs.ruby-lang.org/issues/21039#note-14), because it would mutate the proc in place and that's too surprising and unsafe (e.g. it would break `Proc#binding` on that Proc instance).
`Ractor.make_shareable(proc, copy: true)` can be the same as `Ractor.shareable_proc(self: self, &proc)` (only works if `self` is shareable then), or an error.

## Edge Cases

For these examples I'll use `enqueue`, which defines a block to execute later, either in a Thread or Ractor.
For the Ractor case, `enqueue` would make the block shareable and send it to a Ractor to execute it.
This is a bit more realistic than using plain `Ractor.shareable_proc` instead of `enqueue`, since it makes it clearer the block won't be executed right away on the main Ractor but later on some other Ractor.

### Nested Block Cases

If the assignment is in a nested block, it's an error (this case is already detected for `Ractor.new` BTW):
```ruby
a = 1
enqueue { proc { a = 1 } } # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned inside the block
```

Similarly, if the assignment is in some block outside, it's the same as if it was assigned directly outside:
```ruby
a = 1
p = proc { a = 2 }
enqueue { a } # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned outside the block
```

### Loop Cases

This would be a `Ractor::IsolationError`, because `a` is reassigned.
It would read a stale value and silently ignore reassignments if there was no `Ractor::IsolationError`.
```ruby
a = 0
while condition
  enqueue { p a } # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned outside the block
  a += 1
end
```

This is the same case, using a rescue-retry loop:
```ruby
a = 0
begin
  enqueue { p a } # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned outside the block
  a += 1
  raise
rescue
  retry if condition
end
```

A `for` loop is like a `while` loop because the LHS variable (`a`) and all variables in the loop body are actually declared outside (weird, but that's how it is).
```ruby
for a in enum
  b = rand
  enqueue { p a } # Ractor::IsolationError: cannot isolate a block because it accesses outer variables (a) which are reassigned outside the block
end
binding.local_variables # => [:a, :b]
```

Any assignment inside one of these loops can potentially happen multiple times, so any variable assigned inside one of these loops cannot be captured by a shareable block (i.e., `Ractor::IsolationError` when trying to make a shareable block in such a case).
We will need the static analysis to detect such loops. That probably doesn't need a full Control Flow Graph; we just need to determine if an assignment is "inside a while/for/retry" loop (up to a scope barrier like `def`/`class`/`module`).
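
A minimal sketch of such an "inside a loop" check, again using `Ripper` rather than the compiler's own data structures. Everything here is an assumption for illustration: `loop_assigned_variables`, `LOOP_NODES` and `SCOPE_BARRIERS` are hypothetical names, and rescue-`retry` loops are not handled.

```ruby
require "ripper"
require "set"

LOOP_NODES = %i[while until for while_mod until_mod].freeze
SCOPE_BARRIERS = %i[def defs class module].freeze

# Returns the names of local variables assigned somewhere inside a
# while/until/for loop, resetting at def/class/module boundaries.
def loop_assigned_variables(source)
  names = Set.new
  walk = lambda do |node, in_loop|
    next unless node.is_a?(Array)
    in_loop = true if LOOP_NODES.include?(node[0])
    in_loop = false if SCOPE_BARRIERS.include?(node[0])
    # :var_field marks an assignment target (including the `for` variable)
    if in_loop && node[0] == :var_field && node[1].is_a?(Array) && node[1][0] == :@ident
      names << node[1][1]
    end
    node.each { |child| walk.call(child, in_loop) }
  end
  walk.call(Ripper.sexp(source), false)
  names.to_a.sort
end

while_source = <<~RUBY
  a = 0
  while condition
    enqueue { p a }
    a += 1
  end
RUBY

each_source = <<~RUBY
  a = 0
  [1, 2].each do |e|
    enqueue { p [a, e] }
  end
RUBY

p loop_assigned_variables(while_source) # => ["a"]
p loop_assigned_variables(each_source)  # => []
```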

Regular "loops" using blocks are fine though, because they create a new environment/frame for each iteration.
These 2 blocks will always see `[0, 1]` and `[0, 2]`, whether shareable or not:
```ruby
a = 0
[1, 2].each do |e|
  enqueue { p [a, e] } # OK, each of these variables is assigned only once
end
```
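
The difference between `for` and block-based iteration is observable in plain Ruby today, independent of Ractor. In this sketch (the method names are hypothetical) the closures created in a `for` loop all share one `i`, while each `each` iteration gets a fresh `i`:

```ruby
def closures_from_for
  procs = []
  for i in [1, 2, 3]
    procs << -> { i } # every lambda captures the same `i`
  end
  procs.map(&:call)
end

def closures_from_each
  procs = []
  [1, 2, 3].each { |i| procs << -> { i } } # a new `i` per iteration
  procs.map(&:call)
end

p closures_from_for  # => [3, 3, 3]
p closures_from_each # => [1, 2, 3]
```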

### eval and binding

Static analysis cannot detect `eval` or `binding`.
In such an extreme and very rare case the fact that `shareable_proc` makes a copy of the environment is visible:
```ruby
a = 1
b = proc { a }
s = Ractor.shareable_proc(&b)
eval("a = 2") # or binding.local_variable_set(:a, 2), or b.binding.local_variable_set(:a, 2)
b.call # => 2
s.call # => 1
```
This seems unavoidable, unless we prevent shareable procs from using captured variables at all (quite restrictive).
BTW, `Proc#binding` is already not supported for a `shareable_proc`:
```
$ ruby -e 'nil.instance_exec { a = 1; b = proc { a }; b2 = Ractor.make_shareable(b); p b2.binding }'
-e:1:in `binding': Can't create Binding from isolated Proc (ArgumentError)
```
So `binding`/`eval` is in general already not fully respected with Ractor anyway (and cannot be).

### Multiple Assignments Before

This simple example assigns `a` twice.
It would be safe because `a` is always assigned before (in execution, not necessarily in source order) creating the block instance/Proc instance, but it is not so easy to detect. Depending on how precise the static analysis is, it might allow this case. We can always start with something simple and allow more later.
```ruby
a = 1
a = 2
Ractor.shareable_proc { a } # Ractor::IsolationError if using the single-assignment-only static analysis, seems OK because not so common
```

### Error Message

Differentiating `... which are reassigned inside/outside the block` might be needlessly complicated. In that case I think it's fine to simplify the error message and omit the part after `... which are reassigned`. The important part is that the outer variable is reassigned, not whether it's inside or outside.

## Alternatives

### Relaxing the checks for literal blocks

`Kernel#lambda` for example has behavior which depends on whether it's given a literal block or not:
```ruby
lambda(&proc {}) # => the lambda method requires a literal block (ArgumentError)
```

We could have such a difference, but I don't think it's very useful: if a variable is reassigned, it seems a bad idea to capture and shallow-copy it with a shareable proc (unclear, unsafe, etc).
The semantics are also simpler if they are the same whether the block is literal or not.

### Removing the checks for reassigning a captured variable outside the block

That's basically option 1 of #21550.
This would allow known unsafe behavior and break the examples shown above (i.e. it would break code in a nasty way: some assignments are silently ignored, good luck debugging that).
It would be hard to then forbid such cases later as it could then be considered incompatible.
In my opinion we would commit a big language design mistake if we just give up and allow known unsafe cases like that. People wouldn't be able to trust that local variable assignments are respected (pretty fundamental, isn't it?) and that Ruby blocks behave as they always have (with lexical scoping for local variables).
It would also do nothing to help with `define_method`.
`Ractor.shareable_proc(Proc)` is currently unsafe (the main point of #21039), let's address it, not ignore known problems, especially after a lot of discussion and thoughts on how to solve it properly.



-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/

