From: samuel@... Date: 2021-07-12T00:16:46+00:00 Subject: [ruby-core:104581] [Ruby master Bug#17664] Behavior of sockets changed in Ruby 3.0 to non-blocking Issue #17664 has been updated by ioquatix (Samuel Williams). I missed some functions in `io.c` which could invoke `read`. That's why `read` was showing up. Now that's been patched: ``` % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ------------------ 41.67 0.099333 4 21439 1 io_uring_enter 27.40 0.065317 5 10957 close 17.63 0.042024 3 10719 2 accept4 12.07 0.028768 2 11109 9 newfstatat 0.24 0.000570 1 313 read ``` All read and write operations are going via the uring. I feel fairly confident of this (on `io-buffer` branch). Measuring performance systematically is a bit of a pain but here are some comparisons. ## Comparing `io_uring_submit` (right away after prep sqe vs deferred) I found in some cases it's advantageous to call it right away, but in this case, it didn't seem to help much. But at least you can see impact to sys call count to confirm it's working. All I/O is non-blocking in this test. `DirectScheduler` means it implements `io_read` and `io_write` (`OP_READ` and `OP_WRITE` respectively). `Scheduler` means it doesn't implement `io_read` and `io_write` which forces `io_wait` path. `strace -c` is captured separately and the numbers are different but proportionally correct (because it's slower). ### Early `io_uring_submit` (`DirectScheduler`, 128 connections) ``` Running 30s test @ http://localhost:9090 4 threads and 128 connections Thread Stats Avg Stdev Max +/- Stdev Latency 3.52ms 578.19us 13.65ms 64.86% Req/Sec 9.03k 0.95k 18.68k 72.95% 1076598 requests in 30.05s, 47.23MB read Requests/sec: 35831.03 Transfer/sec: 1.57MB % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ------------------ 41.33 0.090196 4 21218 io_uring_enter 25.61 0.055886 5 10753 close 16.64 0.036306 3 10516 3 accept4 11.46 0.025016 2 10905 9 newfstatat 2.80 0.006115 2 2079 mprotect ``` ### Deferred `io_uring_submit` (`DirectScheduler`, 128 connections) ``` Running 30s test @ http://localhost:9090 4 threads and 128 connections Thread Stats Avg Stdev Max +/- Stdev Latency 3.41ms 677.38us 12.00ms 63.73% Req/Sec 9.31k 1.01k 11.65k 56.88% 1110951 requests in 30.05s, 48.74MB read Requests/sec: 36975.48 Transfer/sec: 1.62MB % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ------------------ 36.38 0.074410 6 11745 io_uring_enter 27.47 0.056179 4 11987 close 18.97 0.038801 3 11751 4 accept4 13.02 0.026624 2 12139 9 newfstatat 2.94 0.006010 2 2079 mprotect ``` ### Early `io_uring_submit` (`Scheduler`, 128 connections) ``` Running 30s test @ http://localhost:9090 4 threads and 128 connections Thread Stats Avg Stdev Max +/- Stdev Latency 2.94ms 567.14us 17.13ms 66.30% Req/Sec 10.81k 1.02k 13.44k 58.47% 1289574 requests in 30.04s, 56.57MB read Requests/sec: 42926.80 Transfer/sec: 1.88MB % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ------------------ 23.69 0.048398 4 9924 close 22.35 0.045668 4 9684 write 16.04 0.032769 3 9687 3 accept4 12.88 0.026323 2 10076 79 read 11.58 0.023663 2 10076 9 newfstatat 9.70 0.019821 2 9767 io_uring_enter 2.76 0.005635 2 2079 mprotect ``` ### Deferred `io_uring_submit` (`Scheduler`, 128 connections) I feel like the `io_uring_enter` count here is a bit off. Maybe because the `select` operation always calls `io_uring_enter` even if there are no SQEs outstanding. Perhaps it should only be called if the SQ is not empty. ``` Running 30s test @ http://localhost:9090 4 threads and 128 connections Thread Stats Avg Stdev Max +/- Stdev Latency 3.04ms 557.79us 21.41ms 67.67% Req/Sec 10.44k 0.96k 13.07k 64.22% 1245909 requests in 30.07s, 54.66MB read Requests/sec: 41434.63 Transfer/sec: 1.82MB % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ------------------ 23.57 0.057679 5 9964 close 22.06 0.053990 5 9724 write 15.48 0.037879 3 9727 3 accept4 12.95 0.031689 3 10116 79 read 11.50 0.028137 2 10116 9 newfstatat 9.24 0.022621 2 9725 1 io_uring_enter 3.43 0.008383 4 2079 mprotect ``` ---------------------------------------- Bug #17664: Behavior of sockets changed in Ruby 3.0 to non-blocking https://bugs.ruby-lang.org/issues/17664#change-92864 * Author: ciconia (Sharon Rosner) * Status: Assigned * Priority: Normal * Assignee: ioquatix (Samuel Williams) * ruby -v: 3.0.0 * Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN ---------------------------------------- I'm not sure this is a bug, but apparently a change was introduced in Ruby 3.0 that makes sockets non-blocking by default. This change was apparently introduced as part of the work on the [FiberScheduler interface](https://github.com/ruby/ruby/blame/78f188524f551c97b1a7a44ae13514729f1a21c7/ext/socket/init.c#L411-L434). This change of behaviour is not discussed in the Ruby 3.0.0 release notes. This change complicates the implementation of an io_uring-based fiber scheduler, since io_uring SQE's on fd's with `O_NONBLOCK` can return `EAGAIN` just like normal syscalls. Using io_uring with non-blocking fd's defeats the whole purpose of using io_uring in the first place. A workaround I have put in place in the Polyphony [io_uring backend](https://github.com/digital-fabric/polyphony/blob/d3c9cf3ddc1f414387948fa40e5f6a24f68bf045/ext/polyphony/backend_io_uring.c#L28-L47) is to make sure `O_NONBLOCK` is not set before attempting I/O operations on any fd. -- https://bugs.ruby-lang.org/ Unsubscribe: