From: samuel@...
Date: 2021-07-09T07:01:58+00:00
Subject: [ruby-core:104557] [Ruby master Feature#18020] Introduce `IO::Buffer` for fiber scheduler.

Issue #18020 has been updated by ioquatix (Samuel Williams).

Okay, the PR is ready for review: https://github.com/ruby/ruby/pull/4621

Here is how it's used:

- `uring.c`: https://github.com/socketry/event/blob/b40bb0b174aed4cc3fed0f0eaafdd73f2a6a6f4c/ext/event/backend/uring.c#L265-L365
- `epoll.c`: https://github.com/socketry/event/blob/b40bb0b174aed4cc3fed0f0eaafdd73f2a6a6f4c/ext/event/backend/epoll.c#L269-L414
- `kqueue.c`: implementation largely the same as epoll.
- `select.rb`: https://github.com/socketry/event/blob/b40bb0b174aed4cc3fed0f0eaafdd73f2a6a6f4c/lib/event/backend/select.rb#L56-L101

In the `io_uring` implementation, the data buffer is passed directly to the OS for zero-copy I/O.

A brief overview of the implementation:

- It provides a fast path from internal `IO` buffering to the fiber scheduler.
- It's primarily an object that represents a `(void*, size_t)` tuple.
- It can allocate its own memory using `malloc`, `mmap` or `VirtualAlloc` (mainly for testing).
- It can also map `File` objects into memory (experimental).
- It provides some basic provisions for getting and setting data.
- It provides a locking mechanism to prevent incorrect usage while the buffer is being used by the OS/system.
- It provides a mutable/immutable flag to validate correct usage when reading/writing.

Going forward, I would like to see a more elaborate model where we can read and write directly using these buffers. We want a fast path for binary protocols like DNS, HTTP/2, etc. This implementation of `get`/`set` is 4x faster than `String#unpack` in my limited testing.

----------------------------------------
Feature #18020: Introduce `IO::Buffer` for fiber scheduler.
https://bugs.ruby-lang.org/issues/18020#change-92836

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
----------------------------------------
After continuing to build out the fiber scheduler interface and the specific hooks required for `io_uring`, I ran into trouble within the implementation of `IO`. In some cases, we need to read into the `rb_io_buffer_t` struct directly. I tried creating a "fake string" in order to transit back into the Ruby fiber scheduler interface, and this did work, but I was told we cannot expose a fake string to Ruby code.

So, after this, and many other frustrations with using `String` as an IO buffer, I decided to implement a low-level `IO::Buffer` based on my needs for high-performance IO, and as part of the fiber scheduler interface.

Going forward, this can form the basis of newer interfaces like `IO::Buffer#splice` and so on. We can also add support for `IO#read(n, buffer)` taking a buffer rather than a string. This avoids many encoding and alignment issues.

While I'm less interested in the user-facing interface at this time, I believe we can introduce it incrementally. Initially my focus is on the interface requirements for the fiber scheduler. Then, I'll look at how we can integrate it more into `IO` directly.

The goal is to have this in place for Ruby 3.1.
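
To illustrate the buffer model summarized in the implementation overview (a sized memory region with basic get/set, a lock, and a mutable/immutable flag), here is a minimal pure-Ruby sketch. It is not the code from the PR and not the real `IO::Buffer` API: the class `SketchBuffer`, its error classes, and the methods `get_u32`, `set_u32`, `locked` and `free` are all hypothetical names chosen for this example, and a binary `String` merely stands in for the `(void*, size_t)` region. The rule that a locked buffer cannot be freed is likewise an assumption about what "incorrect usage" might mean.

```ruby
# Hypothetical pure-Ruby model of the ideas in the overview.
# This is NOT the real IO::Buffer; every name here is invented for illustration.
class SketchBuffer
  class LockedError < StandardError; end
  class ImmutableError < StandardError; end

  attr_reader :size

  def initialize(size, immutable: false)
    # A binary String stands in for the (void*, size_t) memory region.
    @data = "\0".b * size
    @size = size
    @immutable = immutable
    @locked = false
  end

  def locked?
    @locked
  end

  # Hold the lock for the duration of the block, the way a backend might
  # while the OS is reading from or writing to the memory.
  def locked
    raise LockedError, "buffer is already locked" if @locked
    @locked = true
    begin
      yield self
    ensure
      @locked = false
    end
  end

  # Releasing the memory is refused while the lock is held (assumed rule).
  def free
    raise LockedError, "cannot free a locked buffer" if @locked
    @data = "".b
    @size = 0
  end

  # Read an unsigned 32-bit big-endian integer at a byte offset.
  def get_u32(offset)
    check_bounds(offset)
    @data.byteslice(offset, 4).unpack1("N")
  end

  # Write an unsigned 32-bit big-endian integer at a byte offset.
  def set_u32(offset, value)
    raise ImmutableError, "buffer is immutable" if @immutable
    check_bounds(offset)
    @data[offset, 4] = [value].pack("N")
    value
  end

  private

  def check_bounds(offset)
    raise ArgumentError, "offset out of bounds" if offset.negative? || offset + 4 > @size
  end
end
```

In this model, a scheduler backend would wrap its system call in `buffer.locked { ... }` so that `free` cannot run while the OS still holds the pointer, and a buffer created with `immutable: true` rejects `set_u32` while still allowing `get_u32`.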