From: samuel@...
Date: 2021-07-11T09:16:32+00:00
Subject: [ruby-core:104577] [Ruby master Bug#17664] Behavior of sockets changed in Ruby 3.0 to non-blocking

Issue #17664 has been updated by ioquatix (Samuel Williams).

I have researched this topic today and I'm going to share some of my notes and thoughts.

Firstly, with regard to performance, the most important platform is Linux, and I personally believe that `io_uring` is going to be the most important interface. We can also support `epoll` as a fallback, but it's less complete. Of the other interfaces, `kqueue` is similar to `epoll` and is less interesting.

An important point to consider is that on Linux, I've been told that sockets **don't** support asynchronous read and/or write. Internally they are emulated by the same user-space implementation - try reading, and on EAGAIN fall back to polling.

In my testing, the `io_uring`-based `io_read` and `io_write` operations perform about 20% worse in my benchmarks. This was surprising to me. My current understanding of why it's slow is that when we perform `io_read`, internally it performs `read`, but because we have to defer that operation until the next iteration of the run loop, we pay quite a bit of latency cost here.

The fast path is this:

```
result = read(fd, ...)
if (EAGAIN) {
  wait_readable -> Fiber.yield
}
```

The slow path is this:

```
io_read(fd, ...) -> OP_READ SQE
...
io_uring_submit()
wait_cqe -> result
```

There are two ways to interpret the above; which one applies depends on the percentage of `read` operations you expect to result in EAGAIN. If you expect that percentage to be low, the single system call for `read` is far more efficient for Ruby, since we avoid the context switch. In both cases you need to make a system call, either `read` or `io_uring_submit`. With `io_uring_submit`, you can amortise the cost of the system calls, but that saving turns out to be much smaller than the cost of the context switch in Ruby, from what I can tell.

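
As a rough illustration of the fast path above, here is a minimal Ruby sketch. The helper name `read_fast_path` is hypothetical and is not part of Ruby or of any scheduler; it only uses `read_nonblock` and `wait_readable` to show the "try first, wait only on EAGAIN" shape. Under a fiber scheduler, the `wait_readable` call is where the fiber would yield.

```ruby
require "socket"
require "io/wait"

# Hypothetical helper for illustration only: attempt the read immediately,
# and wait for readiness only when the kernel reports EAGAIN.
def read_fast_path(io, maxlen)
  loop do
    result = io.read_nonblock(maxlen, exception: false)
    case result
    when :wait_readable
      # EAGAIN case: nothing buffered yet, so wait until the fd is readable.
      io.wait_readable
    else
      return result # a String with the data, or nil at EOF
    end
  end
end

reader, writer = UNIXSocket.pair
writer.write("hello")
# Data is already buffered, so the read succeeds without any wait.
puts read_fast_path(reader, 1024)
```

When data is usually available, the loop returns after a single `read_nonblock` call and never reaches `wait_readable`, which is the low-EAGAIN case described above.
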
The interesting point is that when the IO is not so busy, and we expect a higher chance of EAGAIN, the overhead of the yield is far less important.

So the net result of this is, from what I can measure so far, that non-blocking sockets are the most efficient way to handle IO. Forcing sockets to go through `OP_READ` seems to yield worse performance in every configuration I could think of. I'm going to continue investigating this, as I'm not fully convinced by the results, but given this result I'm also not convinced that making sockets blocking is the right way forward. If you can produce benchmarks which show something other than what I've found so far, I'd be most interested.

There is a caveat to this though. I did try to make `stdin`, `stdout` and `stderr` non-blocking. It turns out it's pretty difficult, as a ton of things start breaking in unexpected ways - e.g. `printf`. Fortunately, I think there is a good solution - we do have the ability to check whether an IO is in blocking mode, and if that's the case, we can punt it off to `OP_READ`, which, while a little bit slower, will do the right thing without needing `O_NONBLOCK`. This allows us to have non-blocking `stdin` and `stdout`, which would be really great.

I'm still working out the details of how this should fit together within `io.c`, but largely I'm convinced that:

- Non-blocking Socket is the fast path.
- Blocking file descriptors can still be asynchronous. In `io_uring` we can use `OP_READ` and in `epoll`/`kqueue` we can use `fcntl` to toggle `O_NONBLOCK`.

I'm not too concerned about the performance impact in the `epoll` and `kqueue` cases, since I don't consider them the hot path.

I'll keep investigating but I wanted to give an update.

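
The dispatch described above could look roughly like the following Ruby sketch. All three helper names (`nonblocking?`, `read_strategy`, `with_nonblock`) and the returned symbols are hypothetical labels for illustration, not real scheduler APIs; the only real calls used are `IO#fcntl` and the `Fcntl` constants. It assumes the check is done by reading the file status flags with `F_GETFL`.

```ruby
require "fcntl"

# Hypothetical helper: true if O_NONBLOCK is set on the descriptor.
def nonblocking?(io)
  (io.fcntl(Fcntl::F_GETFL) & Fcntl::O_NONBLOCK) != 0
end

# Hypothetical dispatch: the symbols are illustrative labels only.
def read_strategy(io)
  if nonblocking?(io)
    :nonblocking_fast_path # read, then wait_readable on EAGAIN
  else
    :op_read               # leave the fd blocking and submit an OP_READ
  end
end

# Hypothetical helper for epoll/kqueue-style backends: set O_NONBLOCK for
# the duration of the block, then restore the original flags.
def with_nonblock(io)
  flags = io.fcntl(Fcntl::F_GETFL)
  io.fcntl(Fcntl::F_SETFL, flags | Fcntl::O_NONBLOCK)
  yield io
ensure
  io.fcntl(Fcntl::F_SETFL, flags) if flags
end

reader, writer = IO.pipe
# Force blocking mode explicitly so the example does not depend on defaults.
reader.fcntl(Fcntl::F_SETFL, reader.fcntl(Fcntl::F_GETFL) & ~Fcntl::O_NONBLOCK)
puts read_strategy(reader)
with_nonblock(reader) { |io| puts read_strategy(io) }
```

Restoring the original flags in `ensure` keeps the toggle from leaking to other users of the same descriptor, which matters for descriptors such as `stdin` and `stdout`.
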
----------------------------------------
Bug #17664: Behavior of sockets changed in Ruby 3.0 to non-blocking
https://bugs.ruby-lang.org/issues/17664#change-92859

* Author: ciconia (Sharon Rosner)
* Status: Assigned
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* ruby -v: 3.0.0
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
I'm not sure this is a bug, but apparently a change was introduced in Ruby 3.0 that makes sockets non-blocking by default. This change seems to have been introduced as part of the work on the [FiberScheduler interface](https://github.com/ruby/ruby/blame/78f188524f551c97b1a7a44ae13514729f1a21c7/ext/socket/init.c#L411-L434). This change of behaviour is not discussed in the Ruby 3.0.0 release notes.

This change complicates the implementation of an io_uring-based fiber scheduler, since io_uring SQEs on fds with `O_NONBLOCK` can return `EAGAIN` just like normal syscalls. Using io_uring with non-blocking fds defeats the whole purpose of using io_uring in the first place.

A workaround I have put in place in the Polyphony [io_uring backend](https://github.com/digital-fabric/polyphony/blob/d3c9cf3ddc1f414387948fa40e5f6a24f68bf045/ext/polyphony/backend_io_uring.c#L28-L47) is to make sure `O_NONBLOCK` is not set before attempting I/O operations on any fd.

--
https://bugs.ruby-lang.org/