ruby-core

Issue #16786 has been updated by ioquatix (Samuel Williams).


Thanks @eregon for your feedback.

> Maybe we can make `Fiber.new(blocking: false)` call `Thread.current.scheduler.fiber {}` or `Thread.current.scheduler.register(fiber_instance)` ?

`Fiber.new` is a constructor and is independent of the scheduler in every way. 

The blocking or non-blocking state is simply stored into the fiber itself.

Because of that, I disagree with the constructor doing anything complicated or invoking any hook, at least at this time. Otherwise, the time at which you construct the fiber might affect its behaviour in the scheduler, which I think is unnecessary and may be confusing to users.

Additionally, we should not expose users to `Fiber.new(blocking: true/false)`, both because it is a detail of the scheduler implementation and to avoid breaking existing code (where `Fiber.new` defaults to a blocking fiber, which preserves existing behaviour).

Users need a simple entry point for concurrency. This is proposed as `Fiber {}`. I cannot make it any simpler than that. Fiber as a name is already reserved by Ruby, so making a method of the same name is similar to `class Integer`/`Integer(...)`.

I've had feedback from developers over several years who told me `Async {}` is simple and easy to use. So the ergonomics are good for users and the feedback supports that.

----------------------------------------
Feature #16786: Light-weight scheduler for improved concurrency.
https://bugs.ruby-lang.org/issues/16786#change-85140

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
----------------------------------------
# Abstract

We propose to introduce a light-weight fiber scheduler, to improve the concurrency of Ruby code with minimal changes.

# Background

We have been discussing and considering options to improve Ruby scalability for several years. More context is provided by the following discussions:

- https://bugs.ruby-lang.org/issues/14736
- https://bugs.ruby-lang.org/issues/13618

The final Ruby Concurrency report provides some background on the various issues considered in the latest iteration: https://www.codeotaku.com/journal/2020-04/ruby-concurrency-final-report/index

# Proposal

We propose to introduce the following concepts:

- A `Scheduler` interface which provides hooks for user-supplied event loops.
- Non-blocking `Fiber` which can invoke the scheduler when it would otherwise block.

## Scheduler

The per-thread fiber scheduler interface is used to intercept blocking operations. A typical implementation would be a wrapper for a gem like EventMachine or Async. This design provides separation of concerns between the event loop implementation and application code. It also allows for layered schedulers which can perform instrumentation, enforce constraints (e.g. during testing) and provide additional logging. You can see a [sample implementation here](https://github.com/socketry/async/pull/56).

```ruby
class Scheduler
  # Wait for the given file descriptor to become readable.
  def wait_readable(fd)
  end

  # Wait for the given file descriptor to become writable.
  def wait_writable(fd)
  end

  # Wait for the given file descriptor to match the specified events within
  # the specified timeout.
  # @param events [Integer] a bit mask of `IO::WAIT_READABLE`,
  #   `IO::WAIT_WRITABLE` and `IO::WAIT_PRIORITY`.
  # @param timeout [#to_f] the amount of time to wait for the event.
  def wait_for_single_fd(fd, events, timeout)
  end

  # Sleep the current task for the specified duration, or forever if not
  # specified.
  # @param duration [#to_f] the amount of time to sleep.
  def wait_sleep(duration = nil)
  end

  # The Ruby virtual machine is going to enter a system level blocking
  # operation.
  def enter_blocking_region
  end

  # The Ruby virtual machine has completed the system level blocking
  # operation.
  def exit_blocking_region
  end

  # Intercept the creation of a non-blocking fiber.
  def fiber(&block)
    Fiber.new(blocking: false, &block).resume
  end

  # Invoked when the thread exits.
  def run
    # Implement event loop here.
  end
end
```

Each thread has its own non-blocking fiber scheduler. All blocking operations on non-blocking fibers are intercepted by the scheduler, which can then switch to another fiber. While a fiber holds a mutex, the scheduler is not invoked, which is the same behaviour as a blocking fiber.

Schedulers can be written in Ruby. This is a desirable property as it allows them to be used in different implementations of Ruby easily.

To enable non-blocking fiber switching on blocking operations:

- Specify a scheduler: `Thread.current.scheduler = Scheduler.new`.
- Create several non-blocking fibers: `Fiber.new(blocking: false) {...}`.
- As the main fiber exits, `Thread.current.scheduler.run` is invoked, which
  executes the event loop until all fibers are finished.
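
The steps above can be approximated today with plain fibers and `IO.select`. The following is only a rough sketch: `ToyScheduler` and its method bodies are assumptions made for illustration, and the hooks are called explicitly because nothing here patches the virtual machine to call them automatically.

```ruby
# A toy, single-threaded event loop built on plain fibers and IO.select.
# ToyScheduler is a hypothetical name; it is not the proposed implementation.
class ToyScheduler
  def initialize
    @readable = {} # IO => fiber waiting for that IO to become readable
    @sleeping = [] # [wake_time, fiber] pairs
  end

  # Park the current fiber until io is readable.
  def wait_readable(io)
    @readable[io] = Fiber.current
    Fiber.yield
  end

  # Park the current fiber for duration seconds.
  def wait_sleep(duration)
    @sleeping << [now + duration, Fiber.current]
    Fiber.yield
  end

  # Start a fiber immediately; it runs until its first wait_* call.
  def fiber(&block)
    Fiber.new(&block).resume
  end

  # Run the event loop until no fiber is waiting.
  def run
    until @readable.empty? && @sleeping.empty?
      # Wait no longer than the nearest wake-up time (nil means no limit).
      timeout = @sleeping.map { |time, _| [time - now, 0].max }.min
      ready, = IO.select(@readable.keys, nil, nil, timeout)
      (ready || []).each { |io| @readable.delete(io).resume }

      due, @sleeping = @sleeping.partition { |time, _| time <= now }
      due.each { |_, fiber| fiber.resume }
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

scheduler = ToyScheduler.new
reader, writer = IO.pipe
log = []

# First fiber: waits for data, then reads it.
scheduler.fiber do
  scheduler.wait_readable(reader)
  log << reader.read_nonblock(5)
end

# Second fiber: sleeps briefly, then writes the data the first fiber needs.
scheduler.fiber do
  scheduler.wait_sleep(0.01)
  writer.write("hello")
  log << :written
end

scheduler.run
```

The reading fiber is created first but finishes last, because it is parked until the second fiber has written to the pipe.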

### Time/Duration Arguments

Tony Arcieri advised against using floating point values for times and durations, because they can accumulate rounding errors and cause other issues. He has a wealth of experience in this area, so his advice should be considered carefully. However, I have yet to see these issues happen in an event loop. That being said, round-tripping between `struct timeval` and `double`/`VALUE` seems a bit inefficient. One option is to have an opaque argument that responds to `to_f` as well as potentially `seconds` and `microseconds` or some other such interface (it could be an opaque argument supported by `IO.select`, for example).
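
To make the rounding concern concrete, here is a small sketch comparing repeated float addition with integer microseconds, followed by one possible shape for such an opaque argument. `Duration` and its member `total_microseconds` are hypothetical names chosen for this example, not part of the proposal.

```ruby
# Accumulating a float interval drifts, while integer microseconds stay exact.
float_total = 0.0
10.times { float_total += 0.1 }

micro_total = 0
10.times { micro_total += 100_000 }

# Hypothetical opaque duration: stores integer microseconds, converts on demand.
Duration = Struct.new(:total_microseconds) do
  # Whole seconds, suitable for the tv_sec field of a struct timeval.
  def seconds
    total_microseconds / 1_000_000
  end

  # Remaining microseconds, suitable for tv_usec.
  def microseconds
    total_microseconds % 1_000_000
  end

  # Floating point seconds, for APIs that expect a float.
  def to_f
    total_microseconds / 1_000_000.0
  end
end

timeout = Duration.new(1_500_000)

puts float_total == 1.0       # false
puts micro_total == 1_000_000 # true
puts timeout.to_f             # 1.5
```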

### File Descriptor Arguments

There is a good case for preferring `IO` instances over file descriptors. However, because of the public C interface, we may need to support both.

```c
int rb_io_wait_readable(int);
int rb_io_wait_writable(int);
int rb_wait_for_single_fd(int fd, int events, struct timeval *tv);
```

Internally, in CRuby, it may be possible to map from `fd` -> `IO` instance. Another option is to simply support both interfaces and leave it up to the scheduler to decide how to handle it, e.g.

```ruby
class Scheduler
  def wait_readable_fd(fd)
    # wait_readable_io(IO.for_fd(fd, autoclose: false))
  end

  def wait_readable_io(io)
    # wait_readable_fd(io.fileno)
  end
end
```
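
As a rough illustration of converting between the two representations, the sketch below wraps a pipe's descriptor with `IO.for_fd`, the core method for building an `IO` from an integer descriptor. Passing `autoclose: false` means the wrapper does not close the descriptor when it is finalised. The variable names are chosen for this example only.

```ruby
reader, writer = IO.pipe

# fd -> IO: wrap the descriptor without taking ownership of it.
fd = reader.fileno
wrapped = IO.for_fd(fd, autoclose: false)

# IO -> fd: the wrapper reports the same descriptor.
same_fd = wrapped.fileno == fd

# Readiness observed through the wrapper reflects the original pipe.
writer.write("ping")
ready, = IO.select([wrapped], nil, nil, 1)
data = wrapped.read_nonblock(4)

# The original IO objects still own the descriptors and close them.
writer.close
reader.close
```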

We would like to be flexible, without imposing a performance burden on any particular implementation. This is a good point for further discussion.

## Non-blocking Fiber

We propose to introduce a per-fiber flag `blocking: true/false`.

A fiber created by `Fiber.new(blocking: true)` (the default for `Fiber.new`) becomes a "blocking Fiber" and behaves exactly like the current Fiber implementation.

A fiber created by `Fiber.new(blocking: false)` becomes a "non-blocking Fiber" and will be scheduled by the per-thread scheduler whenever a blocking operation (blocking I/O, sleep, and so on) occurs.

```ruby
Fiber.new(blocking: false) do
  puts Fiber.current.blocking? # false

  # May invoke `Thread.current.scheduler&.wait_readable`.
  io.read(...)

  # May invoke `Thread.current.scheduler&.wait_writable`.
  io.write(...)

  # Will invoke `Thread.current.scheduler&.wait_sleep`.
  sleep(n)
end.resume
```

Non-blocking fibers also support `Fiber#resume`, `Fiber#transfer` and `Fiber.yield`, which are necessary to create a scheduler.

### Fiber Method

We also introduce a new method which simplifies the creation of these non-blocking fibers:

```ruby
Fiber do
  puts Fiber.current.blocking? # false
end
```

This method invokes `Scheduler#fiber(...)`. The purpose of this method is to allow the scheduler to internally decide the policy for when to start the fiber, and whether to use symmetric or asymmetric fibers.

If no scheduler is specified, it is an error: `RuntimeError.new("No scheduler is available")`.

In the future we may expand this to support some kind of default scheduler.
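
The delegation described in this section could look roughly like the sketch below. It is not the proposed implementation: `spawn_fiber` stands in for `Fiber do ... end`, the `:toy_scheduler` thread variable stands in for `Thread.current.scheduler`, and `ImmediateScheduler` is a hypothetical scheduler whose policy is to start each fiber as soon as it is created.

```ruby
# Hypothetical scheduler: start each fiber immediately on creation.
class ImmediateScheduler
  def fiber(&block)
    fiber = Fiber.new(&block)
    fiber.resume
    fiber
  end
end

# Hypothetical stand-in for the proposed `Fiber do ... end` entry point.
def spawn_fiber(&block)
  # thread_variable_get is used because Thread#[] is fiber-local, so a
  # value stored with Thread#[]= would not be visible from other fibers.
  scheduler = Thread.current.thread_variable_get(:toy_scheduler)
  raise RuntimeError, "No scheduler is available" unless scheduler

  # The scheduler decides when and how the fiber actually starts.
  scheduler.fiber(&block)
end

# Without a scheduler, creating a fiber this way is an error.
error = begin
  spawn_fiber { :unreachable }
rescue RuntimeError => e
  e.message
end

# With a scheduler installed, the block runs under the scheduler's policy.
Thread.current.thread_variable_set(:toy_scheduler, ImmediateScheduler.new)
results = []
spawn_fiber { results << :ran }
```

Keeping the start policy inside `Scheduler#fiber` means a different scheduler could defer the fiber or use `Fiber#transfer` without any change to the calling code.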

## Non-blocking I/O

`IO#nonblock` is an existing interface to control whether I/O uses blocking or non-blocking system calls. We can take advantage of this:

- `IO#nonblock = false` prevents that particular IO from utilising the scheduler. This should be the default for `stderr`.
- `IO#nonblock = true` enables that particular IO to utilise the scheduler. We should enable this where possible.
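
For reference, the existing interface can be exercised as in the sketch below, which uses the standard `io/nonblock` extension and `IO#read_nonblock`. The point at which a non-blocking read reports `:wait_readable` is where a scheduler hook such as `wait_readable` would plausibly be invoked; here `IO.select` stands in for the scheduler.

```ruby
require "io/nonblock"

reader, writer = IO.pipe
reader.nonblock = true

# Nothing has been written yet, so a non-blocking read reports that it
# would block instead of suspending the whole thread.
first = reader.read_nonblock(16, exception: false)

# Stand-in for the scheduler: wait until the pipe is readable, then retry.
writer.write("hello")
IO.select([reader], nil, nil, 1)
second = reader.read_nonblock(16, exception: false)
```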

As proposed by Eric Wong, we believe that making I/O non-blocking by default is the right approach. We have expanded his work in the current implementation. By doing this, when the user writes `Fiber do ... end` they get the best possible concurrency without any further changes to code. As an example, one of the tests shows `Net::HTTP.get` being used in this way with no further modifications required.

To support this further, consider the alternative, in which `Net::HTTP.get(..., blocking: false)` is required for concurrent requests. Library code may not expose the relevant options, severely limiting the user's ability to improve concurrency, even if that is what they desire.

# Implementation

We have an evolving implementation here: https://github.com/ruby/ruby/pull/3032 which we will continue to update as the proposal changes.

# Evaluation

This proposal provides the hooks for scheduling fibers. With regards to performance, there are several things to consider:

- The impact of the scheduler design on non-concurrent workloads. We believe it's acceptable.
- The impact of the scheduler design on concurrent workloads. Our results are promising.
- The impact of different event loops on throughput and latency. We have independent tests which confirm the scalability of the approach.

We can control for the first two in this proposal, and depending on the design we may help or hinder the wrapper implementation.

In the tests, we provide a basic implementation using `IO.select`. As this proposal is finalised, we will introduce some basic benchmarks using this approach.

# Discussion

The following points are good ones for discussion:

- Handling of file descriptors vs `IO` instances.
- Handling of time/duration arguments.
- General design and naming conventions.
- Potential platform issues (e.g. CRuby vs JRuby vs TruffleRuby, etc).

The following is planned to be described by @eregon in another design document:

- Semantics of non-blocking mutex (e.g. `Mutex.new(blocking: false)` or some other approach).

In the future we hope to extend the scheduler to handle other blocking operations, including name resolution, file I/O (by `io_uring`) and others.




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
