From: Eric Wong <normalperson@...>
Date: 2016-01-26T02:37:16+00:00
Subject: [ruby-core:73461] Re: [CommonRuby - Feature #12020] Documenting Ruby memory model

eregontp@gmail.com wrote:
> I attached a RTF version to this issue.

Thanks.

I'm not sure shared memory is even a good model for Ruby (and it is not
my decision).  Anyway, my comments are below in case matz/ko1 decide to
go down this route.

Background: I am only a simple C programmer with some familiarity with
Userspace RCU and the Linux kernel memory model.  I have zero experience
in Java, and I do not know any C++ beyond what is in C.

For those unfamiliar with RCU, it is basically a poor man's GC; and all
Rubies have a GC implementation anyways.  In fact, working with the
quirks of our conservative GC is not much different from working with
RCU and the relaxed memory ordering model it favors.

>    Core behavior
>    Following sections covers the various storages in the Ruby language
>    (e.g. local variable, instance variable, etc.). We consider the
>    following operations:
>    • read - reading a value from an already defined storage
>    • write - writing a value from an already defined storage
>    • define - creates a new storage and stores the default value or a
>    supplied value
>    • undefine - removes an existing storage
>    Key properties are:
>    • volatility (V) - A written value is immediately visible to any
>    subsequent volatile read of the same variable on any Thread. It has
>    same meaning as in Java, it provides sequential consistency. A volatile
>    write happens-before any subsequent volatile read of the same variable.

Perhaps we could call this "synchronous" or "coherent" instead.
The word "volatile" is highly misleading and confusing to me
as a C programmer.  (Perhaps I am easily confused :x)

Anyways, I am not convinced (volatile|synchronous|coherent) access
should happen anywhere by default for anything, because of the cost.

Those requiring synchronized data should use special method calls
to ensure memory ordering.
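
A minimal sketch of what opting in to synchronization could look like,
assuming Ruby's existing Mutex stands in for the "special method calls"
(the message does not name a specific API, and the Counter class is
hypothetical). Only code that needs ordering pays for it:

```ruby
# Hypothetical example: shared state is touched only through explicit
# synchronization; plain variable access promises nothing about ordering.
class Counter
  def initialize
    @lock = Mutex.new
    @value = 0
  end

  # The read-modify-write happens while holding the lock.
  def increment
    @lock.synchronize { @value += 1 }
  end

  # Reads go through the same lock, so they observe completed increments.
  def value
    @lock.synchronize { @value }
  end
end

counter = Counter.new
threads = 4.times.map do
  Thread.new { 1000.times { counter.increment } }
end
threads.each(&:join)
total = counter.value
```

Code that never shares the counter across threads could skip the lock
entirely, which is the cost argument made above.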

>    Constant variables
>    • volatility - yes
>    • atomicity - yes
>    • serializability - yes
>    • scope - a module
>    A Module or a Class definition is actually a constant definition. The
>    definition is atomic, it assigns the Module or the Class to the
>    constant, then its methods are defined atomically one by one.
>    It's desirable that once a constant is defined it and its value is
>    immediately visible to all threads, therefore it's volatile.

<snip (thread|fiber)-local, no objections there>

>    Method table
>    • volatility - yes
>    • atomicity - yes
>    • serializability - yes
>    • scope - a class
>    Methods are also stored where operations defacto are: read -> method
>    lookup, write -> method redefinition, define -> method definition,
>    undefine -> method removal. Operations over method tables have to be
>    visible as soon as possible otherwise Threads could execute different
>    versions of methods leading to unpredictable behaviour, therefore they
>    are marked volatile. When a method is updated and the method is being
>    executed by a thread, the thread will finish the method body and it'll
>    use the updated method obtained on next method lookup.

I strongly disagree with volatility in method and constant tables.  Any
programs defining methods/constants in parallel threads and expecting
them to be up-to-date deserve all the problems they get.

Maybe volatility for require/autoload could be a special case, iff a
method/constant is missing entirely; but the implementation should be
allowed to hit old methods/constants.

Methods (and all other objects) are already protected from memory
corruption and use-after-free by GC.  There is no danger of segfaulting
when old/stale methods get run.

The inline and global (and perhaps in the future, thread-specific)
caches will all become expensive if we need to ensure read-after-write
consistency by checking for changes to methods and constants made
by other threads.

>    Threads
>    Threads have the same guarantees as in Java. Thread.new
>    happens-before the execution of the new thread's block. All operations
>    done by the thread happens-before the thread is joined. In other words,
>    when a thread is started it sees all changes made by its creator and
>    when a thread is joined, the joining thread will see all changes made
>    by the joined thread.

Good.  For practical reasons, this should obviate the need for
constant/method volatility specified above.
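
A small sketch of how the Thread.new/join ordering quoted above could
cover method and constant definitions without per-lookup volatility.
The Greeter class, the settings hash, and the names GREETING and greet
are all hypothetical, chosen only for illustration:

```ruby
# Hypothetical example relying only on the quoted Thread guarantees.
class Greeter; end

settings = { greeting: "hello" } # written before Thread.new

worker = Thread.new do
  # Thread.new happens-before this block, so settings is visible here.
  Greeter.const_set(:GREETING, settings[:greeting].upcase)
  Greeter.send(:define_method, :greet) { Greeter::GREETING }
end

# Everything the worker did happens-before join returns, so the new
# constant and method are visible to the joining thread from here on.
worker.join
result = Greeter.new.greet
```

A thread that calls Greeter#greet without joining the worker (or
synchronizing some other way) would get no such promise under this
reading.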

>    Beware of requiring and autoloading in concurrent programs, it's
>    possible to see partially defined classes. Eager loading or blocking
>    until classes are fully loaded should be used to mitigate.

No disagreement here :)
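
One way the eager-loading mitigation might look, as a sketch: every
require runs on the main thread before any worker starts, so no worker
can observe a half-loaded class. The choice of "set" and "json" here is
arbitrary; any library the workers use would be loaded the same way.

```ruby
# Load everything up front, on the main thread.
require "set"
require "json"

# Workers start only after loading has finished; by the Thread.new
# guarantee they see Set and JSON fully defined.
threads = 3.times.map do |i|
  Thread.new { JSON.generate(Set.new([i, i]).to_a) }
end

# Thread#value joins each worker and returns its block's result.
results = threads.map(&:value)
```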
