From: "ioquatix (Samuel Williams) via ruby-core"
Date: 2024-11-05T21:49:01+00:00
Subject: [ruby-core:119753] [Ruby master Feature#20855] Introduce `Fiber::Scheduler#blocking_region` to avoid stalling the event loop.

Issue #20855 has been updated by ioquatix (Samuel Williams).

> Zlib.deflate is a CPU-bound operation right? So it makes sense for Fibers of the same Thread to execute the 10 operations sequentially.

Yes, `Zlib.deflate` with a big enough input can become significantly CPU bound. Yes, executing that on fibers will be completely sequential and cause significant latency in the event loop.

This proposal preserves the user visible sequentiality while taking advantage of thread-level parallelism internally to the scheduler. From the user's point of view, nothing is different (code still executes sequentially), but internally, the `rb_nogvl` callback in `Zlib.deflate` will not stall the event loop. As the hook is completely optional, the user-visible semantics of the scheduler must be identical with or without this hook.

> IMHO I think it should be the responsibility of applications like falcon, etc to explicitly offload heavy work like zlib onto an application managed thread pool.

In a certain way, that's exactly what this proposal does: `rb_nogvl` provides the information we need to offload heavy work and we can take advantage of it. The goal of the fiber scheduler has always been to be transparent to the user/application/library code. So it's just a matter of where you set the bar for "explicitly" - native extensions can explicitly offload heavy work using `rb_nogvl` as that's actually the only way to define code that can run safely in parallel - and as you well know there is no similar construct in pure Ruby.

> I wonder if another way to tackle this problem is to add some metrics or callbacks to the fiber scheduler API, so that e.g.
async could warn you when the event loop is blocked for a long time and that you should consider pulling work out into a thread pool.

The original design of the fiber scheduler had this, in the form of the `enter_blocking_region` and `exit_blocking_region` hooks (https://bugs.ruby-lang.org/issues/16786), but they were rejected as being too strongly connected to the implementation. Knowing there are blocking operations on the event loop is extremely valuable, so I regret not having those hooks.

> Should this be integrated with the M:N scheduler in some way?

For sure there is some overlap. However, my main concern is making the fiber scheduler as good as it can be, and blocking operations within the fiber scheduler/event loop is a reasonable criticism that has come up. So, I want a solution for the fiber scheduler.

All things considered, the proposal here is a reasonable solution (pending naming and safety flags). It's about as simple as I could make it and the results show that it works pretty well - within the bounds of what's possible for parallelism in Ruby.

----------------------------------------
Feature #20855: Introduce `Fiber::Scheduler#blocking_region` to avoid stalling the event loop.
https://bugs.ruby-lang.org/issues/20855#change-110417

* Author: ioquatix (Samuel Williams)
* Status: Open
----------------------------------------
The current Fiber Scheduler performance can be significantly impacted by blocking operations that cannot be deferred to the event loop, particularly in high-concurrency environments where Fibers rely on non-blocking operations for efficient task execution.

## Problem Description

Fibers in Ruby are designed to improve performance and responsiveness by allowing concurrent tasks to proceed without blocking one another. However, certain operations inherently block the fiber scheduler, leading to delayed execution across other fibers.
When blocking operations are inevitable, such as system or CPU-bound operations without event-loop support, they create bottlenecks that degrade the scheduler's overall performance.

## Proposed Solution

The proposed solution in PR https://github.com/ruby/ruby/pull/11963 introduces a `blocking_region` hook in the fiber scheduler to improve handling of blocking operations. This addition allows code that releases the GVL (Global VM Lock) to be lifted out of the event loop, reducing the performance impact on the scheduler during blocking calls. By isolating these operations from the primary event loop, this enhancement aims to improve worst-case performance in the presence of blocking operations.

### `blocking_region(work)`

The new, optional, fiber scheduler hook `blocking_region` accepts an opaque callable object `work`, which encapsulates work that can be offloaded to a thread pool for execution. If the hook is not implemented, `rb_nogvl` executes as usual.

## Example

```ruby
require "zlib"
require "async"
require "benchmark"

# 100 MiB of random bytes, large enough for deflate to be CPU bound.
DATA = Random.new.bytes(1024*1024*100)

duration = Benchmark.measure do
  Async do
    # Ten concurrent fibers, each compressing the same input.
    10.times do
      Async do
        Zlib.deflate(DATA)
      end
    end
  end
end

# Ruby 3.3.4: ~16 seconds
# Ruby 3.4.0 + PR: ~2 seconds.
```

---Files--------------------------------
clipboard-202411060314-mby8k.png (120 KB)
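
To make the shape of the `blocking_region(work)` hook described above more concrete, here is a minimal sketch of how a scheduler-like object might offload `work` to another thread while the caller still sees a sequential result. It is not the implementation from the PR. The class name `SketchScheduler` is hypothetical, and the sketch assumes that `work` responds to `call`. A real scheduler would suspend the calling fiber and keep running the event loop instead of joining the thread directly.

```ruby
# Hypothetical sketch only; not the code from the PR.
class SketchScheduler
  # Run `work` on a separate thread and return once it has finished,
  # so the caller still observes ordinary sequential execution.
  # Assumption: `work` is a callable (responds to #call).
  def blocking_region(work)
    thread = Thread.new { work.call }

    # A real fiber scheduler would yield to the event loop here and resume
    # the waiting fiber when the thread completes. This sketch simply waits.
    # Thread#value joins the thread and returns the block's result,
    # re-raising any exception raised inside the thread.
    thread.value
  end
end

scheduler = SketchScheduler.new
caller_thread = Thread.current
worker_thread = nil

# Stand-in for a GVL-releasing operation such as Zlib.deflate.
work = lambda do
  worker_thread = Thread.current
  6 * 7
end

result = scheduler.blocking_region(work)
```

After this runs, `result` holds the value returned by `work`, and `worker_thread` differs from `caller_thread`, which is the property the proposal relies on: the work left the calling thread, but the calling code did not have to change.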