From: samuel@...
Date: 2018-06-02T01:34:23+00:00
Subject: [ruby-core:87353] [Ruby trunk Feature#14739] Improve fiber yield/resume performance

Issue #14739 has been updated by ioquatix (Samuel Williams).

Here is a more realistic benchmark, in which fiber context switching is only a tiny percentage of the actual run-time.

A brief summary of the benchmark: `async-http` uses an event-driven design based on stackful coroutines (fibers). Each request allocates a fiber, and each blocking operation (e.g. `read`) results in `Fiber.yield`. Once the IO is ready, `Fiber#resume` is called. So, for each request being processed, we expect several calls to `Fiber.yield`. `async` is optimistic: it first tries to perform the operation (e.g. `read`) and only yields if the result is `EWOULDBLOCK`, so in some cases (especially in synthetic benchmarks) some fiber scheduling may be elided.

```
koyoko% rvm use 2.6
Using /home/samuel/.rvm/gems/ruby-2.6.0-preview2
koyoko% ruby --version
ruby 2.6.0preview2 (2018-05-31 trunk 63539) [x86_64-linux]
koyoko% bundle exec rake wrk

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    63.59us   77.52us   4.53ms   98.32%
    Req/Sec    16.68k     1.07k   18.32k    74.26%
  167544 requests in 10.10s, 14.54MB read
Requests/sec:  16589.33
Transfer/sec:      1.44MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    60.85us   34.26us   1.39ms   95.82%
    Req/Sec    16.82k     0.87k   18.49k    70.00%
  167424 requests in 10.00s, 14.53MB read
Requests/sec:  16742.19
Transfer/sec:      1.45MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    62.44us   54.34us   3.81ms   97.62%
    Req/Sec    16.62k     1.00k   18.09k    67.33%
  166959 requests in 10.10s, 14.49MB read
Requests/sec:  16530.76
Transfer/sec:      1.43MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    61.89us   32.53us 687.00us   94.29%
    Req/Sec    16.54k     1.20k   18.37k    67.33%
  166105 requests in 10.10s, 14.42MB read
Requests/sec:  16445.91
Transfer/sec:      1.43MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    60.90us   37.64us   1.70ms   95.89%
    Req/Sec    16.89k     1.22k   18.57k    72.28%
  169694 requests in 10.10s, 14.73MB read
Requests/sec:  16802.33
Transfer/sec:      1.46MB
```

Here is the same benchmark with the PR:

```
koyoko% rvm use ruby-head-fiber
Using /home/samuel/.rvm/gems/ruby-head-fiber
koyoko% ruby --version
ruby 2.6.0dev (2018-06-01 native-fiber 63544) [x86_64-linux]
last_commit=Better support for amd64 platforms
koyoko% bundle exec rake wrk

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    62.53us   73.11us   5.02ms   97.96%
    Req/Sec    16.80k     1.35k   19.46k    63.37%
  168863 requests in 10.10s, 14.65MB read
Requests/sec:  16719.77
Transfer/sec:      1.45MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    58.91us   35.19us   1.54ms   95.25%
    Req/Sec    17.49k     1.16k   19.42k    69.31%
  175719 requests in 10.10s, 15.25MB read
Requests/sec:  17399.00
Transfer/sec:      1.51MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    58.64us   45.92us   3.09ms   96.88%
    Req/Sec    17.72k     1.10k   19.42k    71.29%
  178027 requests in 10.10s, 15.45MB read
Requests/sec:  17626.32
Transfer/sec:      1.53MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    60.83us   33.93us   1.06ms   94.93%
    Req/Sec    16.86k     1.54k   19.36k    63.37%
  169307 requests in 10.10s, 14.69MB read
Requests/sec:  16764.19
Transfer/sec:      1.45MB

Running 10s test @ http://127.0.0.1:9294/
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    59.07us   39.77us   2.17ms   95.97%
    Req/Sec    17.52k     0.98k   19.32k    66.34%
  176112 requests in 10.10s, 15.28MB read
Requests/sec:  17436.64
Transfer/sec:      1.51MB
```

This is actually better than I expected.
I would say there is a practical improvement of about 5%. The result is very workload dependent, but I'm glad that I saw something.

----------------------------------------
Feature #14739: Improve fiber yield/resume performance
https://bugs.ruby-lang.org/issues/14739#change-72345

* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
I am interested in improving Fiber yield/resume performance.

I've used this library before: http://software.schmorp.de/pkg/libcoro.html and handled millions of HTTP requests using it.

I'd suggest using that library.

As fibers are used in many places in Ruby (e.g. enumerable), it could be a big performance win across the board.

Here is a nice summary of what was done for RethinkDB: https://rethinkdb.com/blog/making-coroutines-fast/

Does Ruby currently reuse stacks? Reusing them would also be a big performance win if it's not being done already.

--
https://bugs.ruby-lang.org/
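
For readers unfamiliar with the pattern described in the update at the top of this message, here is a minimal sketch of an optimistic read that only yields when the operation would block. It is not code from `async` or `async-http`: `optimistic_read`, `stats`, and the pipe-based driver are hypothetical names made up for illustration, and a single `IO.select` call stands in for a real event loop.

```ruby
# Hypothetical helper: try the read first, and yield the fiber only if the
# read would block (the EWOULDBLOCK case mentioned above).
def optimistic_read(io, maxlen, stats)
  loop do
    result = io.read_nonblock(maxlen, exception: false)
    return result unless result == :wait_readable

    stats[:yields] += 1
    Fiber.yield(io) # suspend; whoever resumes us has seen the IO become ready
  end
end

reader, writer = IO.pipe
stats = { yields: 0 }
received = []

# One fiber per "request", performing two reads.
fiber = Fiber.new do
  2.times { received << optimistic_read(reader, 1024, stats) }
  nil
end

writer.syswrite("first")    # data is already available, so the first read does not yield
waiting_on = fiber.resume   # the second read would block, so the fiber yields the IO
writer.syswrite("second")
IO.select([waiting_on])     # stand-in for the event loop waiting for readiness
fiber.resume                # the pending read now succeeds and the fiber finishes

p received       # the two chunks, in order
p stats[:yields] # how many times the fiber actually had to yield
```

In this sketch only the second read yields, which is why a synthetic benchmark with data always ready may exercise `Fiber.yield`/`Fiber#resume` less often than a real workload would.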