From: ko1@... Date: 2017-06-01T14:40:04+00:00 Subject: [ruby-core:81507] [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid Issue #13618 has been updated by ko1 (Koichi Sasada). normalperson (Eric Wong) wrote: > I disagree. I do not recall Ruby 1.8 Threads being a big problem > for Rubyists. Modern Rubyists seem OK using native Threads > ("OK", not "great" :) ... > However, I do not believe it > is a big problem since Rubyists should already be used to > threading. ... > And yes, I think native threading bugs are trickier to track down > than auto-Fiber switching. Just remember, today we have native > threading and things are OK. And I think there were more happy > Rubyists in 1.8 days. ... > Again; from my experience; I do not believe many Ruby > programmers had safety problems with 1.8 green threads. > > Today we have Rubyists who are already used to 1.9/2.x native > Thread already. > > The safety improvement is a minor point. My opinion is opposite. I think "For human being using threading is too hard to use correctly" or "Rubyist shouldn't care about threading difficulties". I agree my opinion is extreme and many "advanced" programmers like Eric can write correct thread programs. But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe. (In addition: I heard some advanced programmers say "people can write". I doubt because it is something survivor bias) (recent days I fixed rubygems' threading problem it is difficult to reproduce) I often use this metaphor: It is like GC strategy. If people can manage object lifetime, it is faster than using GC (at some case. Some case GC is more faster than manual memory management). However we choose GC because we want to concentrate on writing application code. I agree auto-fibers is safer than threads. In my mind: ``` danger <-> safe (this is my opinion) parallel threads (JRuby, ...) > concurrent threads (MRI) >> auto-fibers (full-auto) > auto-fiber (restricted) >> Guild > single thread ``` But auto-fiber can introduce accident and it should be not so frequent, and it is difficult to reproduce. This means it is difficult to debug. Ruby has many pit falls to shoot our own legs (meta-programming features, open class and so on) but they are deterministic (at most of case). I think this is how to evaluate the risk of such danger. C/C++/Java/... (and many imperative languages) choose performance (people should write correct code). Some languages try to avoid this kind of difficulties. Rust choose threading but introduce harness by type system. Clojure choose STM to prevent atomic violation. I agree threading and auto-fiber is easy to use. Maybe most of case it is no problem (especially on auto-fiber). But it can includes accident in only few cases and it will be difficult to find out. I hope Ruby is safe language because I don't want to bother of such difficulties. This is my wish. I agree there are another wish like Eric's and I respect it. ---- Other than this point, I agree of all of your opinions. If I can believe "All Rubyist can write correct thread programs", your points make sense for me. (other points) > Yes; we will document all switch points in RDoc and NEWS, > of course (maybe write a separate doc/auto-fiber.rdoc) My point is, if method "foo" is switching point, then any method can call "foo" (bar, and baz, the caller of bar, ...) should be noted. Maybe it is impossible to complete because of Ruby's dynamic nature. > I would like to use existing stdlib (net/*, webrick, drb, ...) > as much as possible without modifications. That means many > existing Ruby libraries can work transparently. I understand your point. ---------------------------------------- Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid https://bugs.ruby-lang.org/issues/13618#change-65214 * Author: normalperson (Eric Wong) * Status: Open * Priority: Normal * Assignee: * Target version: ---------------------------------------- ``` auto fiber schedule for rb_wait_for_single_fd and rb_waitpid Implement automatic Fiber yield and resume when running rb_wait_for_single_fd and rb_waitpid. The Ruby API changes for Fiber are named after existing Thread methods. main Ruby API: Fiber#start -> enable auto-scheduling and run Fiber until it automatically yields (due to EAGAIN/EWOULDBLOCK) The following behave like their Thread counterparts: Fiber.start - Fiber.new + Fiber#start (prelude.rb) Fiber#join - run internal scheduler until Fiber is terminated Fiber#value - ditto Fiber#run - like Fiber#start (prelude.rb) Right now, it takes over rb_wait_for_single_fd() and rb_waitpid() function if the running Fiber is auto-enabled (cont.c::rb_fiber_auto_sched_p) Changes to existing functions are minimal. New files (all new structs and relations should be documented): iom.h - internal API for the rest of RubyVM (incomplete?) iom_internal.h - internal header for iom_(select|epoll|kqueue).h iom_epoll.h - epoll-specific pieces iom_kqueue.h - kqueue-specific pieces iom_select.h - select-specific pieces iom_pingable_common.h - common code for iom_(epoll|kqueue).h iom_common.h - common footer for iom_(select|epoll|kqueue).h Changes to existing data structures: rb_thread_t.afrunq - list of fibers to auto-resume rb_vm_t.iom - Ruby I/O Manager (rb_iom_t) :) Besides rb_iom_t, all the new structs are stack-only and relies extensively on ccan/list for branch-less, O(1) insert/delete. As usual, understanding the data structures first should help you understand the code. Right now, I reuse some static functions in thread.c, so thread.c includes iom_(select|epoll|kqueue).h TODO: Hijack other blocking functions (IO.select, ...) I am using "double" for timeout since it is more convenient for arithmetic like parts of thread.c. Most platforms have good FP, I think. Also, all "blocking" functions (rb_iom_wait*) will have timeout support. ./configure gains a new --with-iom=(select|epoll|kqueue) switch libkqueue: libkqueue support is incomplete; corner cases are not handled well: 1) multiple fibers waiting on the same FD 2) waiting for both read and write events on the same FD Bugfixes to libkqueue may be necessary to support all corner cases. Supporting these corner cases for native kqueue was challenging, even. See comments on iom_kqueue.h and iom_epoll.h for nuances. Limitations Test script I used to download a file from my server: ----8<--- require 'net/http' require 'uri' require 'digest/sha1' require 'fiber' url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx' uri = URI(url) use_ssl = "https" == uri.scheme fibs = 10.times.map do Fiber.start do cur = Fiber.current.object_id # XXX getaddrinfo() and connect() are blocking # XXX resolv/replace + connect_nonblock Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http| req = Net::HTTP::Get.new(uri) http.request(req) do |res| dig = Digest::SHA1.new res.read_body do |buf| dig.update(buf) #warn "#{cur} #{buf.bytesize}\n" end warn "#{cur} #{dig.hexdigest}\n" end end warn "done\n" :done end end warn "joining #{Time.now}\n" fibs[-1].join(4) warn "joined #{Time.now}\n" all = fibs.dup warn "1 joined, wait for the rest\n" until fibs.empty? fibs.each(&:join) fibs.keep_if(&:alive?) warn fibs.inspect end p all.map(&:value) Fiber.new do puts 'HI' end.run.join ``` ---Files-------------------------------- 0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB) -- https://bugs.ruby-lang.org/ Unsubscribe: