From: email@...
Date: 2016-04-20T17:40:07+00:00
Subject: [ruby-core:75047] [CommonRuby Feature#12020] Documenting Ruby memory model

Issue #12020 has been updated by Petr Chalupa.


Thank you for responding and for taking time to read the proposal.

Let me start by elaborating more on the motivation behind all of the related
proposals, since I did not really explain it in detail when I opened them. I
apologise for not doing that sooner.

### Motivation

I would like to clear up a possible misunderstanding about the target users of
this document and this memory model. It is not intended to be used directly by
the majority of Ruby programmers. (Even though the document aims to be
understandable, it will still be a difficult topic.) It is intended to be used
by concurrency enthusiasts, giving them the tools to build many different
concurrency abstractions as gems.

At this point Ruby is a general-purpose language with direct support for
Threads and shared memory. As was announced in a few presentations, there are
plans to add a new easy-to-use abstraction to Ruby in some future release and
maybe deprecate Threads. Let's call this scenario A. (Block quotes are used for
better logical structure.)

> (A) I understand the need to add such an abstraction (actors, channels,
other?) to Ruby to enable Ruby users to build concurrent applications with
ease. For future reference let's call this one future abstraction Red. Red
would then have well-documented and defined behaviour in concurrent and
parallel execution (this is what I think you are referring to). However,
providing just one abstraction in the standard library (and deprecating
Threads) will hurt the usability of the Ruby language.

> The problem is that there is no single concurrency abstraction which would
fit all problems. Therefore providing just Red will leave the Ruby language
suitable for only a subset of problems.

> Assume that only Red would be documented and provide high-level guarantees,
threads would be deprecated, and low-level concurrency would be neither
documented nor guaranteed. Developers who would like to contribute a new
abstraction to solve another group of problems would be left with the
following (I think not very good) choices:

> > (1) Implement the abstraction in the underlying language used for the
particular Ruby implementation (in C for MRI, in Java for JRuby(+Truffle))
using guarantees provided by the underlying language. This means the author of
the new abstraction has to understand 3 programming languages (C, Ruby, Java)
and 3 implementations in order to develop the abstraction 3 times. That would
discourage people and also make the whole process error-prone and difficult.

> > (2) Implement the abstraction using Red. This approach gives users the
desired abstraction (avoiding different languages and the need to understand
implementation details), but it will probably perform badly since Red is not
suited to solving this problem. For example, implementing ConcurrentHashMap
(allowing parallel reads and writes) with actors would perform badly.
(Admittedly this is a somewhat extreme example, but it demonstrates the
problem and I could not think of a better one.)
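
To make the concern in (2) concrete, here is a minimal sketch of a map built on an actor-like abstraction. The `ActorMap` class and its mailbox protocol are hypothetical, invented for this illustration; they are not part of Ruby or of any proposed Red abstraction. A single thread owns the Hash and every `get` and `put` is a message to it, so even read-only lookups from different threads are handled one after another.

```ruby
# Hypothetical ActorMap: one thread owns the Hash and all access goes
# through its mailbox, so reads and writes are strictly serialized.
class ActorMap
  def initialize
    @mailbox = Queue.new
    @thread = Thread.new do
      store = {}
      loop do
        op, key, value, reply = @mailbox.pop
        case op
        when :get
          reply << store[key]
        when :put
          reply << (store[key] = value)
        when :stop
          reply << :stopped
          break
        end
      end
    end
  end

  def get(key)
    ask(:get, key)
  end

  def put(key, value)
    ask(:put, key, value)
  end

  def stop
    ask(:stop)
    @thread.join
  end

  private

  # Send a message and block until the owning thread replies.
  def ask(op, key = nil, value = nil)
    reply = Queue.new
    @mailbox << [op, key, value, reply]
    reply.pop
  end
end

map = ActorMap.new
map.put(:a, 1)

# Four readers issue lookups concurrently, but the actor thread answers
# them one at a time; there is no parallel read path.
readers = 4.times.map { Thread.new { map.get(:a) } }
results = readers.map(&:value)
map.stop
```

The sketch is correct but has a single point of serialization by design, which is the opposite of what a concurrent hash map is meant to offer.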

The above is, to the best of my knowledge, where Ruby is heading in the future.
Please correct me if I misunderstood or misrepresented it in any way.

To avoid the difficulties outlined above, Ruby could take a different path,
which is related to these proposals (or their evolved successors).

> (B) Ruby would stay a general-purpose language with direct thread support
and shared memory with a documented memory model. The low-level documentation
would allow people (who are interested) to write different concurrency
abstractions efficiently. One of them would become the standard and preferred
way to deal with concurrency in Ruby. Let's call it Blue. The Blue abstraction
would (as Red would) be part of the standard library. Like Red, it would have
well-documented and defined behaviour in concurrent and parallel execution, but
in this case based on the lower-level model. The documentation would be
directed at all Ruby users and made as easy to understand as possible.

> The majority of Ruby users would use Blue as the go-to abstraction, just as
they would use Red in scenario A. The key difference is that there is a
low-level model to support the creation of new abstractions. Therefore, when
Blue cannot solve a particular issue, a Ruby user can pick a different
concurrency abstraction created by somebody else and provided as a gem, or
create a new one.
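
As a small illustration of the kind of abstraction scenario B has in mind, here is a sketch of a count-down latch written in plain Ruby on top of `Mutex` and `ConditionVariable`. The class name `TinyLatch` is hypothetical and the code is only an example of building on low-level primitives, not a proposal for Blue. Because it uses only Ruby-level primitives, it needs to be written once rather than once per implementation.

```ruby
# Hypothetical TinyLatch: lets one thread wait until a fixed number of
# events have been signalled by other threads.
class TinyLatch
  def initialize(count)
    @count = count
    @mutex = Mutex.new
    @condition = ConditionVariable.new
  end

  # Record one event; wake all waiters once the count reaches zero.
  def count_down
    @mutex.synchronize do
      @count -= 1 if @count > 0
      @condition.broadcast if @count.zero?
    end
  end

  # Block until the count reaches zero.
  def wait
    @mutex.synchronize do
      @condition.wait(@mutex) until @count.zero?
    end
  end
end

latch = TinyLatch.new(3)
results = Queue.new

workers = 3.times.map do |i|
  Thread.new do
    results << i * i
    latch.count_down
  end
end

latch.wait
squares = Array.new(3) { results.pop }.sort
workers.each(&:join)
```

A documented memory model is what would let the author of such a gem state precisely which writes made before `count_down` are visible to a thread after `wait` returns, on every Ruby implementation.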

I believe this would make the Ruby language more flexible.

### Difficulty of understanding

This is something I believe can be improved over time. Also, as mentioned above,
it is not intended to be used by everyone. Could you point me to the parts which
are not understandable or lack explanation? I would like to improve them to
make the document more comprehensible.

The document is intentionally not as detailed and formal as JSR-133, to keep it
understandable. The price, as you say and I agree, is that some details and
omissions may be left unspecified. I believe the high-level documentation for
Red will unfortunately suffer from the same problem of evil details.

If the memory model is reviewed by many people and given some time to mature, I
believe it will cover the majority of situations, and omitted corner cases can
be fixed later. I think the current situation, where each implementation has
different rules, is much worse, and any document will improve it greatly.

### Difficulty of implementation

(Various architectures) I am not a C programmer, so I am not that well informed,
but I believe that in C this is solved by the C11 standard and before that by
various libraries. Could MRI use C11 in the future when it drops the GIL?

(Atomic Float) I agree that it is more difficult when Floats are required to be
atomic, but if they were not, it would be quite surprising to Ruby users that
a simple reference assignment of a Float object (as it is represented in Ruby)
is not atomic. Therefore this was chosen to be atomic purely to avoid
surprising users and having to educate them about torn reads/writes. The same
applies to a Fixnum which is bigger than int and fits into long (using Java
primitive names here). Even though this is more difficult, I think it makes
sense to protect users from having to worry about torn reads/writes. The
implementation itself should be trivial on all 64-bit platforms; only 32-bit
platforms will require some tricks. This post [1] suggests that it can be done.
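
For readers unfamiliar with the term, the following sketch simulates what a torn read of a 64-bit Float would look like if the value were stored as two independent 32-bit words. It is purely an illustration: the helper names `halves` and `join` are made up for this example, and the code says nothing about how any Ruby implementation actually stores Floats.

```ruby
# Split a 64-bit double into its low and high 32-bit words.
# 'E' packs a little-endian double, 'V' unpacks little-endian 32-bit words.
def halves(float)
  [float].pack('E').unpack('VV')
end

# Reassemble a double from a low and a high 32-bit word.
def join(low, high)
  [low, high].pack('VV').unpack('E').first
end

old_value = 0.1
new_value = 1.0

_old_low, old_high = halves(old_value)
new_low, _new_high = halves(new_value)

# A reader that sees the writer's new low word but still the old high word
# observes a value that was never assigned by anyone.
torn = join(new_low, old_high)

puts "old:  #{old_value}"
puts "new:  #{new_value}"
puts "torn: #{torn}"
```

The torn value is neither the old nor the new Float, which is exactly the kind of surprise the atomicity rule is meant to rule out.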

(Strict rules) The document tries to strike a balance between restricting
optimisation and creating ugly surprises for users. I expect there will be more
discussion about the rules:

  - How to implement it on all Ruby implementations?
  - Will it prevent any optimisations?
  - Will it expose unexpected behaviour to users?

The document is really just a first draft and everything is open for discussion
and improvement, both of which I hope for. It was prepared so as not to limit
any of the Ruby implementations, but problems may have been missed. If it turns
out a rule is too strict, it can be relaxed.

(MRI with GIL) Yes, MRI already provides all of the specified guarantees thanks
to the GIL. Its guarantees are even stronger. On the other hand, if I understand
correctly, MRI is looking for ways to remove the GIL, and the fact that the GIL
provides stronger undocumented guarantees makes this difficult. Users rely on
them (intentionally or unintentionally) even though they shouldn't. Having a
document describing what is guaranteed and what is not may make the transition
to MRI without the GIL easier in the future.
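
As a hedged example of code that does not lean on undocumented GIL behaviour, here is a shared counter whose updates are protected explicitly with a `Mutex`. The `Counter` class is hypothetical and only meant to show the pattern. Written this way, the result does not depend on which operations a particular implementation happens to execute atomically.

```ruby
# Hypothetical Counter: every read and write of @value happens under a lock,
# so correctness does not depend on GIL scheduling details.
class Counter
  def initialize
    @value = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @value += 1 }
  end

  def value
    @lock.synchronize { @value }
  end
end

counter = Counter.new

threads = 8.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)

total = counter.value
puts total
```

Without the lock, whether `@value += 1` loses updates depends on the implementation and its scheduling, which is the kind of question a documented memory model would answer.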

### In Conclusion

I hope that I have changed your mind a little about scenario B and this
proposal, and that we can further discuss the issues this model could bring
for MRI. I would like to help solve them, or avoid them by relaxing rules.

I believe that if this model (or its evolved successor) is accepted in all Ruby
implementations over time, it will help the Ruby language a lot to be prepared
for concurrency and parallelism, which are no longer optional nowadays.

[1] http://shipilev.net/blog/2014/all-accesses-are-atomic/

----------------------------------------
Feature #12020: Documenting Ruby memory model
https://bugs.ruby-lang.org/issues/12020#change-58170

* Author: Petr Chalupa
* Status: Assigned
* Priority: Normal
* Assignee: Koichi Sasada
----------------------------------------
Defining a memory model for a language is necessary to be able to reason about a program's behavior in a concurrent or parallel environment.

A document describing a Ruby memory model was created for the concurrent-ruby gem; the model fits several Ruby language implementations. It was necessary in order to build a lower-level unifying layer that enables the creation of concurrency abstractions. They can be implemented only once against the layer, which ensures that they run on all Ruby implementations.

Because of GIL semantics, the Ruby MRI implementation has stronger undocumented guarantees than the memory model, but the few relaxations from MRI's behavior allow other implementations to fit the model as well and to improve performance.

This issue proposes to document the Ruby memory model. The above mentioned memory model document which was created for concurrent-ruby can be used as a starting point: https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit#. Please comment in the document or here.

The aggregating issue of this effort can be found [here](https://bugs.ruby-lang.org/issues/12019).


-- 
https://bugs.ruby-lang.org/
