From: samuel@...
Date: 2017-07-31T09:19:17+00:00
Subject: [ruby-core:82214] [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Issue #13618 has been updated by ioquatix (Samuel Williams).

I am following this thread and I find it really fascinating. Thanks everyone for thinking about these issues and Eric for your insightful work and ideas.

Just as an aside, I feel like something is being lost in translation w.r.t. the response from Matz and other core Ruby developers. Perhaps we need to have a hangout to discuss these ideas.

I've just released async, async-io and async-dns 1.0.0, along with rubydns 2.0.0. In addition to this there is also async-http (client and server library) and falcon, a rack compatible server, built on top of async. The http library lacks support for SSL so it's not 1.x yet; I'm still working on that part. It works on Ruby 2.0+, and most of it also works on JRuby, excepting JRuby's missing support for UDP sockets (https://github.com/jruby/jruby/pull/4684).

I would like to think `async` is a proof of concept of what is possible with Ruby, in terms of performance. I think it's a solid platform for making network clients and servers, and I've implemented both DNS client/server and HTTP client/server which provide useful test cases for both performance and design.

In terms of design, it's a very simple concept to use, with an API that works as if it's sequential, but yields if the operation would block. The user almost cannot make any mistakes, and implementing complex network logic becomes trivial.

In terms of performance, there are a few comparisons I can make. If you would like more details, let me know. I'm going to be matter-of-fact; you can draw your own conclusions.

- RubyDNS is about as fast as Bind for a trivial benchmark resolving a fixed set of IP addresses.
- Falcon is as fast as Puma but scales significantly better, especially if non-blocking IO is leveraged.
- Falcon and Puma both process requests significantly faster than typical Rack middleware can cope with them. For example, Falcon can easily handle 30,000 conn/s on my 8-core workstation, but as soon as I put any non-trivial rack application behind it, it would drop to < 3,000 conn/s. Falcon can handle up to 100,000 req/s on the same hardware (e.g. using keep alive).
- I implemented a complete stack in C++ of the same concept, and it achieved on 1 core roughly what Ruby required 8 cores to do. That is, a single process/thread could handle 25,000 conn/s on 1 core, and about 90,000 req/s. So, Ruby is about 10x slower than similar C++ code.

Eric, my opinion at this point is that the work you've done here is awesome. What I would personally like to see is a backend, perhaps an alternative to nio4r, which, as an example, async could use to implement its reactor. I think that when your selector is running for the current fiber, operations like wait_for_pid and wait_one_fd should be hijacked and go via the reactor. I think it should be possible for nio4r to tap into this too somehow. This would make things completely transparent for the user.

I still believe this should be a gem, even if it's an official one distributed with Ruby, and that Ruby should expose the relevant hooks. Otherwise, it's going to make a lot of trouble for other implementations e.g. JRuby, MRI, etc. Ideally they can just expose the same low-level hooks at the VM level.

I would like to say at this point, with the release of async & (-*) 1.0, I believe that this concept has proven itself, e.g. that the implementation works, that it has good performance, and that it can be used to implement good composable libraries.

Whatever form the final library takes, I hope that it is (a) modular (b) fast and (c) composable.

One final opinion that I've formed while working on this project is that Ruby IO primitives are overly complex and fail to expose the right abstraction.
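
As a rough illustration of the design idea mentioned earlier (an API that reads as sequential code but yields when an operation would block), here is a minimal sketch. It is not how `async` or the proposed patch is implemented; `TinyReactor` and its methods are hypothetical names invented for this example, and it relies only on `Fiber`, `IO.select` and `IO#read_nonblock` from core Ruby.

```ruby
require 'fiber'

# Hypothetical toy reactor (illustration only, not the async gem's code).
class TinyReactor
  def initialize
    @waiting = {} # IO => Fiber waiting for that IO to become readable
  end

  # Looks like a blocking read to the caller, but yields the current
  # fiber back to the reactor whenever the read would block.
  def read(io, maxlen = 4096)
    loop do
      result = io.read_nonblock(maxlen, exception: false)
      # result is a String, nil at EOF, or :wait_readable
      return result unless result == :wait_readable
      @waiting[io] = Fiber.current
      Fiber.yield
    end
  end

  # Run a task in its own fiber until it first yields or finishes.
  def async(&block)
    Fiber.new(&block).resume
  end

  # Resume waiting fibers as their IO objects become readable.
  def run
    until @waiting.empty?
      readable, = IO.select(@waiting.keys)
      readable.each { |io| @waiting.delete(io).resume }
    end
  end
end

reactor = TinyReactor.new
r, w = IO.pipe
log = []

reactor.async do
  log << reactor.read(r) # pipe is empty, so the fiber yields here
  log << reactor.read(r) # writer is closed by now, so this is EOF
end

w.write("hello")
w.close
reactor.run
p log
```

The task body only ever calls `read`; the `read_nonblock` call and its `:wait_readable` result stay inside the reactor, which is the sense in which the non-blocking variants need not be part of a public API.
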
`*_nonblock` methods never should have existed. If there is one thing I'd wish for, it's that once a decent asynchronous library is adopted, these methods are not made part of its public API. `async` does forward these methods, but it's only to make wrapping existing `Net::HTTP` work better, and essentially the `x_nonblock` variant is identical to the `x` method in `async`.

----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-65981

* Author: normalperson (Eric Wong)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
                   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)

Right now, it takes over rb_wait_for_single_fd() and
rb_waitpid() function if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq - list of fibers to auto-resume
    rb_vm_t.iom - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and relies
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO: Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c. Most platforms have good FP,
I think. Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

    libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

    Bugfixes to libkqueue may be necessary to support all corner cases.
    Supporting these corner cases for native kqueue was challenging,
    even. See comments on iom_kqueue.h and iom_epoll.h for
    nuances.
Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```

---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)

--
https://bugs.ruby-lang.org/

Unsubscribe: