From: ko1@...
Date: 2017-06-01T02:15:42+00:00
Subject: [ruby-core:81495] [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Issue #13618 has been updated by ko1 (Koichi Sasada).

Thank you for your great work.

# summary of this comment

Recently I have been thinking about this feature's "safety" or "dependability". Because of the issue described in this comment, I think it is difficult to adopt this feature right now.

# Non-auto-fibers

Without this feature, Fiber switching is explicit (`Fiber.yield`), and in most cases it is easy to write several operations so that they run atomically. A typical atomic operation is increment. Let's think about it with an example: `t = n; ...; n = t+1`.

```
def some_method
  Fiber.yield
end

n = 0
f1 = Fiber.new{
  t = n
  some_method
  n = t + 1
}
f1.resume
n += 1
f1.resume
p n #=> 1 (although two increments are tried)
```

In this case, the main fiber and fiber f1 both try to increment `n`, and `some_method` breaks atomicity because of `Fiber.yield`. Of course, nobody writes such silly code, and it is easy to check because `Fiber.yield` is strongly coupled with Fiber operations written by users (basically, libraries don't call `Fiber.yield`).

# auto-fibers

However, auto-fiber switching introduces this kind of danger.

```
# assume all fibers are auto-scheduling fibers
n = 0
f1 = Fiber.new{
  t = n
  log(t)
  n = t + 1
}
f1.resume # auto-fibers should not call resume
          # but please allow me, this is pseudo-code to describe an issue.
n += 1
f1.resume
p n
```

If the `log()` method tries to send a log message over the network, the Fiber will switch to other fibers.

Problems are:

* It is difficult to know which operations should be run atomically (users write code without checking atomicity).
* It is difficult to find out which method can switch.
* Not only user-written code, but also all library code can switch fibers.
* This means that we need to check all library code to know that it doesn't violate atomicity assumptions.
* It introduces non-deterministic behavior (with `Fiber.yield` the behavior is deterministic and the problem is easy to reproduce).

These difficulties are the same kind as with threading. The impact can be smaller than with threading (threads can switch anywhere, so it is very hard to predict the behavior; auto-fibers switch only at blocking operations, especially IO operations).

# Consideration

To solve this problem, we have several choices.

(1) Introduce synchronization mechanisms for auto-fibers

Like Mutex, Queue and so on. In the Ruby 1.8 era, we had `Thread.exclusive` to prohibit thread switching. I don't want to choose this option because it is what I want to avoid in Ruby.

(2) Introduce limitations

The problem "It is difficult to find out which method can switch" arises because we need to check the whole code. If we can restrict auto-fiber switching, this problem can be smaller.

(2-1) Introduce Fiber switching methods

Instead of switching at implicit blocking (IO) operations, introduce explicit blocking operations that can switch. We can then check all of the source code with grep.

(2-2) Check context

Permit fiber switching only at permitted places, by block, pragma, and so on.

```
# auto-fiber: true # <- this file can switch fibers automatically

Fiber.new(auto: true){
  ...
  io.read # can switch
  ...
  something_defined_in_gem # can't switch
  ...
}
```

I think other languages like Python and JavaScript employ this idea. I need to survey such languages more.

(3) Something else clever

Introducing a debugger is one choice (maybe it is easier than for threading issues). But we can't avoid troubles (and the troubles would probably be infrequent and non-reproducible).

Another option is to introduce hooks for implementing auto-fibers, provide auto-fibers through gems, and let advanced users who know the above risks use this feature. But this is not a good idea because we can't provide a good way to write code for many people.

Thoughts?
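To make the atomicity hazard discussed above concrete, here is a minimal runnable sketch in plain Ruby. It does not use the proposed auto-fiber API: `log` and `run_pair` are hypothetical helpers invented for this illustration, and an explicit `Fiber.yield` stands in for the implicit switch that a blocking IO call would trigger under auto-scheduling.

```ruby
require 'fiber' # needed for Fiber#alive? on older Rubies

# Hypothetical stand-in for a library method that performs network IO.
# Under auto-fibers the switch would be implicit; Fiber.yield simulates it.
def log(_msg)
  Fiber.yield
end

# Hypothetical helper: run two fibers over the same block, round-robin,
# until both finish, then return the shared counter value.
def run_pair(counter, &body)
  fibers = 2.times.map { Fiber.new { body.call(counter) } }
  until fibers.empty?
    fibers.each(&:resume)
    fibers.select!(&:alive?)
  end
  counter[:n]
end

# Switch point sits between the read and the write: one update is lost.
unsafe = run_pair({ n: 0 }) do |c|
  t = c[:n]
  log("incrementing")
  c[:n] = t + 1
end

# Switch point sits before the read-modify-write: both updates survive.
safe = run_pair({ n: 0 }) do |c|
  log("incrementing")
  c[:n] += 1
end

p unsafe #=> 1
p safe   #=> 2
```

The two blocks differ only in where the call to `log` sits, which is the point of the concern: nothing at the call site shows whether a method can switch fibers.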
----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-65205

* Author: normalperson (Eric Wong)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
                   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)

Right now, it takes over rb_wait_for_single_fd() and
rb_waitpid() function if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq - list of fibers to auto-resume
    rb_vm_t.iom - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and relies
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO:

    Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c. Most platforms have good FP,
I think. Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

    libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

    Bugfixes to libkqueue may be necessary to support all corner cases.
    Supporting these corner cases for native kqueue was challenging,
    even. See comments on iom_kqueue.h and iom_epoll.h for
    nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```

---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)

--
https://bugs.ruby-lang.org/

Unsubscribe: