[ruby-core:81027] Re: [ruby-cvs:65407] normal:r58236 (trunk): thread.c: comments on M:N threading [ci skip]

From: Eric Wong <normalperson@...>
Date: 2017-05-08 00:33:15 UTC
List: ruby-core #81027
Eric Wong <normalperson@yhbt.net> wrote:
> SASADA Koichi <ko1@atdot.net> wrote:
> > Sorry I can't understand the basic of your idea with mixing Threads and
> > Fibers. Maybe you need to define more about the model.
> 
> Sorry if I wasn't clear.  Basically, I see:
> 
> 	green Threads == Fibers + auto scheduling
> 
> So, making Threads a subclass of Fibers makes sense to me.
> Then, existing (native) threads become an internal class
> only accessible to C Ruby developers; new native threads get
> spawned as-needed (after the GVL is released).
> 
> > Our plan is not mixing Threads and Fibers, so that (hopefully) there are
> > no problem.
> 
> OK, I will wait for you and see.

I have been thinking about this again; I think M:N green Threads
are a bad idea[1].  Instead we should improve Fibers to make them
easier-to-use for cases where non-blocking I/O is _desirable_
(not just _possible_).

Notes for auto-scheduling Fibers:

* no timeslice or timer-based scheduling.
  I think this will simplify use and avoid race conditions
  compared to Threads.  Things like "numeric += 1" can always
  be atomic with respect to other Fibers in the same native
  Thread.
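  To illustrate the point above, here is a small, runnable sketch
  (using today's plain Fiber API with manual resumes, since the
  proposed auto-scheduling does not exist yet): because Fibers in
  one native thread only switch at explicit points, a
  read-modify-write like "counter += 1" is never interleaved
  mid-update and no Mutex is needed:

```ruby
# Illustrative only: with cooperative (non-preemptive) scheduling,
# a Fiber can only lose control at an explicit switch point, so
# "counter += 1" is atomic with respect to the other Fibers.
counter = 0

fibers = 2.times.map do
  Fiber.new do
    1000.times do
      counter += 1   # never preempted mid-update
      Fiber.yield    # control transfers ONLY at points like this
    end
  end
end

# drive both fibers round-robin until they finish
fibers.cycle do |f|
  f.resume if f.alive?
  break if fibers.none?(&:alive?)
end

puts counter  # => 2000, with no Mutex needed
```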

* not enabled by default for compatibility, maybe have:
  Fiber.current.auto_schedule = (true|false) # per-fiber
  Fiber.auto_schedule = (true|false) # process-wide
  But I do not do Ruby API design :P

* Existing native-thread code for blocking IO must (MUST!)
  continue blocking w/o GVL as in current 2.4.
  Users relying on blocking accept4() (via BasicSocket#accept)
  still get thundering-herd protection when sharing a listen
  socket across multiple processes.
  Ditto with UNIXSocket#recv_io when sharing a receiver socket.
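  A minimal sketch of the status-quo behavior this bullet wants
  preserved: a blocking #accept in one native Thread releases the
  GVL, so other threads keep running while it waits:

```ruby
require 'socket'

# Blocking accept in a native Thread; the GVL is released while it
# waits, so the main thread below is free to connect to it.
server = TCPServer.new('127.0.0.1', 0)
port   = server.addr[1]

acceptor = Thread.new do
  client = server.accept   # blocks WITHOUT holding the GVL
  client.close
  :accepted
end

# main thread is not starved while the acceptor blocks
sock   = TCPSocket.new('127.0.0.1', port)
result = acceptor.value    # wait for the acceptor Thread to finish
sock.close
server.close
puts result  # prints "accepted"
```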

* documented scheduling points:

  TL;DR: most existing "blocking" APIs become Fiber-aware,
  similar to 1.8 green threads.

  - IO operations on pipe and sockets inside Fibers with
    auto-scheduling enabled automatically become Fiber-aware
    and use non-blocking internal interfaces while presenting
    a synchronous API:

        IO#read/write/syswrite/sysread/readpartial/gets etc..
        IO.copy_stream, IO.select
        Socket#connect/accept/sysaccept
        UNIXSocket#recv_io/#send_io
        IO#wait_*able (in io/wait ext)

  - Ditto for some non-IO things:

        Kernel#sleep
        Process.wait/waitpid/waitpid2 family uses WNOHANG
        Queue/SizedQueue support; maybe new Fiber::Queue and
        Fiber::SizedQueue classes are needed?

  - keep Mutex and ConditionVariable as-is for native Thread
    users; I don't believe they are necessary for pure Fiber use.
    Maybe add an option for Mutex locks to prevent Fiber.yield
    and disable auto-scheduling temporarily?

  - IO.open and read-write filesystem I/O release the GVL as usual

  - It will be necessary to use resolv and resolv/replace in
    stdlib for Fiber-aware name resolution.
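  The scheduling points above boil down to one internal pattern:
  attempt a non-blocking operation, and on EWOULDBLOCK wait for
  readiness instead of blocking the native thread.  A hedged sketch
  using today's public API (fiber_aware_read is a made-up name; a
  real implementation would Fiber.yield to the scheduler rather
  than call IO.select directly):

```ruby
# Hypothetical helper showing how a Fiber-aware IO#read could
# present a synchronous API over non-blocking internals.
def fiber_aware_read(io, maxlen)
  io.read_nonblock(maxlen)
rescue IO::WaitReadable
  IO.select([io])  # a real scheduler would Fiber.yield here and
                   # resume this Fiber once the fd is readable
  retry
end

r, w = IO.pipe
w.write("hello")
data = fiber_aware_read(r, 5)
puts data  # => "hello"
w.close; r.close
```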

* Implementation (steps can be done gradually):

  1. new internal IO scheduler using kqueue/epoll/select.  Native
     kqueue/epoll allow cross-native-thread operation to share
     the event loop, so they only need one new FD per-process.
     I want to avoid libev/libevent since (last I checked) they
     do not allow sharing an event loop across native threads.
     I can write kqueue/epoll/select parts; I guess win32 can use
     select until someone else implements something

     Maybe build IO scheduler into current timer thread....

  2. pipes and sockets get O_NONBLOCK flag set automatically
     when created inside Fibers with auto-scheduling set.

  3. rb_wait_for_single_fd can use the new IO scheduler and becomes
     Fiber-aware, ditto with rb_thread_fd_select...

     Steps 2 and 3 should make most IO changes transparent.

  4. make necessary changes to Process.wait*, IO.select,
     Kernel.sleep
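  To make step 1 concrete, here is a toy event loop in plain Ruby
  using IO.select (the real scheduler would be C on kqueue/epoll;
  the SelectScheduler name and its methods are invented for this
  sketch):

```ruby
# Toy IO scheduler: Fibers park on an fd, the event loop resumes
# them when IO.select reports readiness.
class SelectScheduler
  def initialize
    @waiting = {}          # IO => Fiber blocked on it
  end

  # called from inside a Fiber that would otherwise block on `io`
  def wait_readable(io)
    @waiting[io] = Fiber.current
    Fiber.yield            # suspend until the event loop resumes us
  end

  def run
    until @waiting.empty?
      ready, = IO.select(@waiting.keys)
      ready.each { |io| @waiting.delete(io).resume }
    end
  end
end

sched = SelectScheduler.new
r, w = IO.pipe

reader = Fiber.new do
  sched.wait_readable(r)   # "blocks" this Fiber, not the thread
  $result = r.read_nonblock(4)
end

reader.resume              # runs until wait_readable suspends it
w.write("ping")            # make the pipe readable
sched.run                  # event loop resumes the reader Fiber
puts $result               # => "ping"
w.close; r.close
```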



Side note: I consider making Fibers migratable across native
Threads out-of-scope for this.  We currently use
makecontext/swapcontext (FIBER_USE_NATIVE) for speed (which
according to cont.c comments is significant).  I am not
sure if we can keep FIBER_USE_NATIVE if allowing Fibers
to migrate across native threads.


[1] general problem with threads:
    timeslice scheduling leads to unpredictable interleavings,
    so Mutex/ConditionVariable become necessary.

    M:N will be problematic: it will be hard for users to know
    when a Thread is a heavy native thread that is safe to use
    for blocking operations, and when it is lightweight; that
    makes it difficult to design apps to use each appropriately.

    However, native 1:1 Threads will always be useful for cases
    where users can take advantage of blocking I/O
    (#recv_io/#accept/File.open/...) as well as releasing GVL
    for CPU-intensive operations independent of Ruby VM.

Thanks for reading, I wrote most of this while waiting for
tests for r58604 to run before committing.
