From: "ioquatix (Samuel Williams)"
Date: 2022-10-17T19:42:33+00:00
Subject: [ruby-core:110374] [Ruby master Bug#19062] Introduce `Fiber#locals` for shared inheritable state.

Issue #19062 has been updated by ioquatix (Samuel Williams).

The only thing I really care about is an efficient way to implicitly share specific state within an execution context.

```
let(x = 10) do
  Thread.new do
    get(x) # => 10
  end

  Enumerator.new do
    get(x) # => 10
  end
end
```

Restricting variable binding to some kind of `let` block is a bit limiting and prevents lots of use cases which are common in Ruby. I'm okay with Fibers (read: execution contexts) being the implicit scope because it's pretty much natural for most practical use cases.

```
Fiber/Thread.new do # An execution context
  set(x, 10)

  Thread.new do
    get(x) # => 10
  end.join

  set(x, 20)

  Enumerator.new do
    get(x) # => 20
  end
end
```

That means all the ECs share a single mutable set of locals. It's basically the same as a local variable binding, except that rather than lexical scope, it's dynamic, based on the ECs.

There are two levels of mutability:

1. The ability to bind key-values.
2. The ability for values themselves to mutate.

I would argue that since we can't practically prevent (2), trying to prevent (1) is pointless. However, feel free to convince me otherwise. I'd be okay with a copy-on-write scheme where the first update would pay the cost of the internal copy, since we expect writes to be much less frequent than reads and clones, both of which need to be O(1) where possible.

Whatever model we come up with, it makes sense that threads and fibers are handled consistently, i.e. I'm not sure that `Thread.new` needs to dup the locals. It's not thread unsafe to update the locals in Ruby because of the GVL. It's just poorly synchronised. If users choose to write that kind of code, they'd need to provide their own locking, which I think is acceptable too.
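
For illustration only, the copy-on-write scheme mentioned above could be sketched roughly as follows. `CowLocals` and its `inherit` method are hypothetical names invented for this sketch; they are not part of the proposal or of Ruby. The sketch assumes that sharing the underlying hash on inheritance is O(1) and that the first write after sharing pays for one `dup`.

```ruby
# Hypothetical sketch: copy-on-write storage for inheritable locals.
# Not the proposed implementation; the names are invented for illustration.
class CowLocals
  def initialize(storage = {}, shared = false)
    @storage = storage
    @shared = shared
  end

  # O(1): the child shares the parent's hash until either side writes.
  def inherit
    @shared = true
    CowLocals.new(@storage, true)
  end

  # O(1) read straight from the (possibly shared) hash.
  def [](key)
    @storage[key]
  end

  # The first write after sharing copies the hash, so the other side is unaffected.
  def []=(key, value)
    if @shared
      @storage = @storage.dup
      @shared = false
    end
    @storage[key] = value
  end
end

parent = CowLocals.new
parent[:x] = 10
child = parent.inherit   # no copy yet
child[:x] = 20           # the copy happens here
```

In this sketch a write in the child is not visible to the parent, which differs from the fully shared model described earlier. The sketch does no locking of its own, so concurrent writers would still need external synchronisation.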

When you say something like "It seems extremely unsafe to inherit by default across threads to me", I would personally like to see a code example where it's unsafe; otherwise it's hard for me to understand exactly what the problem is and/or how we could address it.

----------------------------------------
Bug #19062: Introduce `Fiber#locals` for shared inheritable state.
https://bugs.ruby-lang.org/issues/19062#change-99667

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
After exploring , I felt uncomfortable about the performance of copying lots of inheritable attributes. Please review that issue for the background and summary of the problem.

## Proposal

Introduce `Fiber#locals`, which is a hash table of local attributes which are inherited by child fibers.

```ruby
Fiber.current.locals[:x] = 10

Fiber.new do
  pp Fiber.current.locals[:x] # => 10
end
```

It's possible to reset `Fiber.current.locals`, e.g.

```ruby
def accept_connection(peer)
  Fiber.new(locals: nil) do # This causes a new hash table to be allocated.
    # Generate a new request id for all fibers nested in this one:
    Fiber[:request_id] = SecureRandom.hex(32)
    @app.call(env)
  end.resume
end
```

A high level overview of the proposed changes:

```ruby
class Fiber
  def initialize(..., locals: Fiber.current.locals)
    @locals = locals || Hash.new
  end

  attr_accessor :locals

  def self.[] key
    self.current.locals[key]
  end

  def self.[]= key, value
    self.current.locals[key] = value
  end
end
```

See the pull request for the full proposed implementation.

## Expected Usage

Currently, a lot of libraries use `Thread.current[:x]`, which is unexpectedly "fiber local". A common bug shows up when lazy enumerators are used, because they may create an internal fiber. Because `locals` are inherited, code which uses `Fiber[:x]` will not suffer from this problem.
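
The "unexpectedly fiber local" behaviour of `Thread.current[:x]` can be reproduced with existing Ruby APIs. The snippet below is a minimal sketch; the key `:request_id` and the value `"abc123"` are made up for illustration, and it uses external enumeration via `Enumerator#next` as one case where Ruby runs a block inside an internal fiber.

```ruby
# Thread#[]= stores a fiber-local value, despite the name.
Thread.current[:request_id] = "abc123"

# A new fiber starts with its own empty set of fiber-local values,
# so the value set above is not visible inside it.
inside_fiber = Fiber.new { Thread.current[:request_id] }.resume

# Enumerator#next runs the enumerator's block in an internal fiber,
# so the same lookup silently returns nil there as well.
enum = Enumerator.new { |y| y << Thread.current[:request_id] }
via_next = enum.next

# True thread locals are visible from every fiber of the thread.
Thread.current.thread_variable_set(:request_id, "abc123")
via_thread_variable = Fiber.new {
  Thread.current.thread_variable_get(:request_id)
}.resume

p inside_fiber         # nil
p via_next             # nil
p via_thread_variable  # "abc123"
```

With inherited locals as proposed, the lookups inside the nested fiber and the enumerator would be expected to see the value set by the enclosing fiber.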

Any program that uses true thread locals for per-request state can adopt the proposed `Fiber#locals` and get similar behaviour without breaking on per-fiber servers like Falcon. Falcon can "reset" `Fiber.current.locals` for each request fiber, while servers like Puma won't have to do that and will retain thread-local behaviour.

Libraries like ActiveRecord can adopt `Fiber#locals` to avoid the need for users to opt into different "IsolatedExecutionState" models, since it can be transparently handled by the web server (see for more details).

We hope that by introducing `Fiber#locals`, we can avoid all the confusion and bugs of the past designs.

-- 
https://bugs.ruby-lang.org/