From: "ioquatix (Samuel Williams) via ruby-core"
Date: 2024-11-05T14:20:05+00:00
Subject: [ruby-core:119732] [Ruby master Feature#20855] Introduce `Fiber::Scheduler#blocking_region` to avoid stalling the event loop.

Issue #20855 has been updated by ioquatix (Samuel Williams).

File clipboard-202411060314-mby8k.png added

> I don't introduce blocking_region.

I am not strongly attached to the name. I used `blocking_region` because it is commonly used internally for handling blocking operations and has existed for 13 years (https://github.com/ruby/ruby/commit/919978a8d9e25d52697c0677c1f2c0ccb50b4492#diff-d5867d8e382e49f5cdef27a4d24c1a4588954f96e00925092a586659bf1b1ba4R204). For alternative naming, how about `blocking_operation`? I also considered `thread_call_without_gvl`, but I feel it's too specific: the scheduler doesn't need to care what the work is, just that it is blocking, and some platforms like JRuby / TruffleRuby don't necessarily have the same concept of a GVL.

> I'm not sure what is the work for blocking_region() callback.

The work is an opaque object that responds to `#call` (in the implementation, it is a `Proc` instance). In this specific case, it wraps the execution of `rb_nogvl` with the given `func` and `unblock_func`.

> I can't understand what happens on this description. I couldn't understand the control flow with a given func for rb_nogvl.

If `rb_nogvl` is called in the fiber scheduler, it can introduce latency: although the GVL is released, the calling thread stays inside the `nogvl` function until it returns, so the event loop on that thread cannot progress. To avoid this, we wrap the arguments given to `rb_nogvl` into a Proc and invoke the fiber scheduler hook, so it can decide how to execute the work.

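To make the control flow more concrete, here is a rough pure-Ruby model of that dispatch. It is only a sketch: the real logic lives in C inside the interpreter, and `run_without_gvl`, `ThreadOffloadScheduler` and `PlainScheduler` are hypothetical names invented for illustration, not Ruby APIs.

```ruby
# Hypothetical model of the dispatch described above (not the actual C code).

# Stand-in for a scheduler that implements the optional hook by running the
# work on a separate thread and waiting for it to finish.
class ThreadOffloadScheduler
  attr_reader :offloaded

  def initialize
    @offloaded = 0
  end

  def blocking_region(work)
    @offloaded += 1
    Thread.new(&work).join
  end
end

# Stand-in for a scheduler that does not implement the hook.
class PlainScheduler
end

# Models the call site: wrap the blocking function in a Proc, then let the
# scheduler decide how to run it if it provides the hook.
def run_without_gvl(scheduler, &func)
  result = nil
  work = proc { result = func.call }

  if scheduler.respond_to?(:blocking_region)
    scheduler.blocking_region(work)
  else
    work.call # no hook: run inline, as before
  end

  result
end

offloading = ThreadOffloadScheduler.new
sum_offloaded = run_without_gvl(offloading) { (1..1_000).sum }
sum_inline = run_without_gvl(PlainScheduler.new) { (1..1_000).sum }
```

In this model the caller gets the same result either way. The only difference is where the work runs, which is the part the scheduler controls.
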
The most basic (blocking) implementation would be something like this:

```ruby
def blocking_region(work)
  Fiber.blocking(&work)
end
```

Alternatively, using a thread:

```ruby
def blocking_region(work)
  Thread.new(&work).join
end
```

In terms of flow control, it's the same as all the other fiber scheduler hooks: it routes the operation to the fiber scheduler for execution. The scheduler is allowed, within reason, to determine the execution policy.

![](clipboard-202411060314-mby8k.png)

> In general it seems unsafe.

It's a fair point. I found bugs in `zlib.c` because of this work, so problematic code certainly exists. But I also don't want to be too reductionist about "unsafe C code" being a problem.

One option to mitigate this risk is to introduce a flag passed to `rb_nogvl` to either allow or prevent moving `func` to a different thread. I think this has wider value too: in the M:N scheduler, knowing whether blocking operations can be moved between threads might be extremely useful for similar reasons.

Depending on how risk-averse we are, we could choose "allow by default" or "prevent by default". I'm personally leaning towards allow by default, as I think most usage of `rb_nogvl` should be safe in practice, but I'd also be okay with being conservative by default. The only problem I see with being conservative by default is that a lot of performance may be left on the table until code is updated to use said flags.

> I want to revert this patch with current description.

Sorry @ko1, I was just excited to try this feature out. I've been running tests in `async` and other projects downstream using `ruby-head` to evaluate it. Why don't we discuss this at the developer meeting and figure out a path forward? If we still can't come to a conclusion, we can revert it. How about that?

----------------------------------------
Feature #20855: Introduce `Fiber::Scheduler#blocking_region` to avoid stalling the event loop.
https://bugs.ruby-lang.org/issues/20855#change-110398

* Author: ioquatix (Samuel Williams)
* Status: Open
----------------------------------------
The current Fiber Scheduler performance can be significantly impacted by blocking operations that cannot be deferred to the event loop, particularly in high-concurrency environments where Fibers rely on non-blocking operations for efficient task execution.

## Problem Description

Fibers in Ruby are designed to improve performance and responsiveness by allowing concurrent tasks to proceed without blocking one another. However, certain operations inherently block the fiber scheduler, leading to delayed execution across other fibers. When blocking operations are inevitable, such as system or CPU-bound operations without event-loop support, they create bottlenecks that degrade the scheduler's overall performance.

## Proposed Solution

The proposed solution in PR https://github.com/ruby/ruby/pull/11963 introduces a `blocking_region` hook in the fiber scheduler to improve handling of blocking operations. This addition allows code that releases the GVL (Global VM Lock) to be lifted out of the event loop, reducing the performance impact on the scheduler during blocking calls. By isolating these operations from the primary event loop, this enhancement aims to improve worst-case performance in the presence of blocking operations.

### `blocking_region(work)`

The new, optional fiber scheduler hook `blocking_region` accepts an opaque callable object `work`, which encapsulates work that can be offloaded to a thread pool for execution. If the hook is not implemented, `rb_nogvl` executes as usual.

## Example

```ruby
require "zlib"
require "async"
require "benchmark"

DATA = Random.new.bytes(1024*1024*100)

duration = Benchmark.measure do
  Async do
    10.times do
      Async do
        Zlib.deflate(DATA)
      end
    end
  end
end

# Ruby 3.3.4: ~16 seconds
# Ruby 3.4.0 + PR: ~2 seconds.
```

---Files--------------------------------
clipboard-202411060314-mby8k.png (120 KB)