From: email@...
Date: 2016-05-19T21:02:05+00:00
Subject: [ruby-core:75619] [CommonRuby Feature#12020] Documenting Ruby memory model

Issue #12020 has been updated by Petr Chalupa.

Koichi Sasada wrote:
> Sorry for the late response.
>
> Petr Chalupa wrote:
> > Let me start by elaborating more on the motivation behind all of the related proposals, since I did not really explain it in detail when I was opening them. I apologise for not doing that sooner.
>
> No problem. Thank you for your explanation.
>
> > ### Motivation
> >
> > I would like to clear up a possible misunderstanding about the target users of this document and this memory model. It's not intended to be directly used by the majority of Ruby programmers. (Even though the document aims to be understandable, it will still be a difficult topic.) It's intended to be used by concurrency enthusiasts, giving them tools to build many different concurrency abstractions as gems.
>
> I (may) understand what you want to say.
> As you wrote:
>
> > Even though the document aims to be understandable, it will still be a difficult topic
>
> I agree with that, and I believe most of us can't understand and guarantee all of the specifications.
> At least I don't believe I can implement it.
> (Of course, it is because of my low skill. Somebody should be able to implement it.)

Luckily there are other languages with their own memory models, where such guarantees are already provided. Their working solutions can be reused and applied in MRI.

> > At this point Ruby is a general-purpose language, with direct support for Threads and shared memory. As was announced in a few presentations, there are plans to add a new easy-to-use abstraction to Ruby in some future release and maybe deprecate Threads. Let's call this scenario A. (Block-quotes are used for better logical structure.)
> >
> > > (A) I understand the need to add such an abstraction (actors, channels, other?) to Ruby to enable Ruby users to build concurrent applications with ease. For future reference, let's call this one future abstraction Red. Red would then have well-documented and defined behaviour in concurrent and parallel execution (this is what I think you are referring to). However, providing just one abstraction in the standard library (and deprecating Threads) will hurt the usability of the Ruby language.
> >
> > > The problem lies in the fact that there is no single concurrency abstraction which fits all problems. Therefore providing just Red would leave the Ruby language suitable for only a subset of problems.
> >
> > > Assuming: only Red would be documented and providing high-level guarantees; threads would be deprecated; low-level concurrency would not be documented and guaranteed. Developers who would like to contribute a new abstraction to solve another group of problems would be left with the following (I think not very good) choices:
>
> I agree the flexibility should be decreased.

Just to make sure we understand each other: I agree that the flexibility should be decreased for users so they can write their concurrent code with ease. I believe we disagree, though, on which level it should be achieved at. I am advocating for the library level; you, I think, for the language level.

> > > (1) Implement the abstraction in the underlying language used for the particular Ruby implementation (in C for MRI, in Java for JRuby(+Truffle)), using the guarantees provided by the underlying language.
> > > Meaning the author of the new abstraction has to understand 3 programming languages (C, Ruby, Java) and 3 implementations to develop the implementation 3 times. That would discourage people and also make the whole process error-prone and difficult.
> >
> > > (2) Implement the abstraction using Red. This approach gives users the desired abstraction (avoiding using different languages and understanding implementation details), but it will probably have bad performance since Red is not suited to solve this problem. For example, implementing a ConcurrentHashMap (allowing parallel reads and writes) with actors would perform badly. (Admittedly this is a little extreme as an example, but it demonstrates the problem and I could not think of a better one.)
> >
> > The above is, to the best of my knowledge, where Ruby is heading in the future; please correct me if I misunderstood and/or misrepresented it in any way.
> >
> > To avoid the difficulties outlined above, Ruby could take a different path, which is related to these proposals (or their evolved successors).
>
> I understand your concerns. I agree there are such disadvantages.
>
> However, I believe the productivity gained by avoiding shared-everything will help programmers.
>
> For (1), I agree there are such difficulties.
> I don't have any comment on it.
> Yes, there are.
>
> For (2), you mentioned performance.
> However, I believe Ruby should contribute to programmers' happiness.
> I believe performance is not the primary concern.
>
> It seems strange because parallelism is for performance.
> I assume such a drawback can be overcome with (a) design patterns (b) parallelism (# of cores).

I have been using Ruby for 10 years (thank you!) and I see and understand the big benefit of Ruby caring about programmer happiness. I care about it very much too, and I try to avoid any suggestions which would lead to sacrificing it. I think that so far all of the proposals have been shaped by user happiness and performance. (For example: in the discussion about volatile constants in this issue, the current rules are harder to implement but better for users.) If that is not true, I would like to fix it.

Regarding (2), users may sacrifice some performance, but in this case it might perform quite badly. A few examples for consideration follow:

(Clojure agents) Implementation of agents using actors: since an agent has to be able to report its value at any time, it would need to be modelled using at least 2 actors: one to hold and report the value, a second to process the updates.

(Go channels) Implementing a Go sized channel using actors: the channel is blocking. The channel is represented with one or two actors. One is simpler but has higher contention; using two actors avoids some contention between the head and the tail of the channel. To simulate blocking: actors which are sending messages to the channel will not continue with other message processing until they receive confirmation from the channel that they can continue, i.e. that they are not blocked. Actors waiting on messages from the channel would have to send a request to the channel saying that they want to receive a message, and not process other messages until they receive the message from the channel. My intuition is that the slowdown will be 2x and *more* (I'll do some tests). The outlined implementations are much more complex compared to the conventional implementation using shared memory.
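To make the comparison concrete, here is a minimal sketch of the conventional shared-memory version referred to above: a bounded, blocking channel built from a Mutex and two ConditionVariables. The `SizedChannel` name and the capacities used are illustrative only, not part of any gem or proposal.

```ruby
# Minimal sketch (illustrative only): a Go-style sized, blocking channel
# built directly on shared memory with Mutex + ConditionVariable.
class SizedChannel
  def initialize(capacity)
    @capacity  = capacity
    @buffer    = []
    @lock      = Mutex.new
    @not_full  = ConditionVariable.new
    @not_empty = ConditionVariable.new
  end

  # Blocks while the channel is full.
  def push(value)
    @lock.synchronize do
      @not_full.wait(@lock) while @buffer.size >= @capacity
      @buffer << value
      @not_empty.signal
    end
  end

  # Blocks while the channel is empty.
  def pop
    @lock.synchronize do
      @not_empty.wait(@lock) while @buffer.empty?
      value = @buffer.shift
      @not_full.signal
      value
    end
  end
end

# Usage: one producer, one consumer.
ch = SizedChannel.new(2)
producer = Thread.new { 5.times { |i| ch.push(i) } }
consumer = Thread.new { 5.times { p ch.pop } }
[producer, consumer].each(&:join)
```

Ruby's built-in SizedQueue already behaves essentially like this; the point is only that, given shared memory and a documented memory model, such a primitive takes a few dozen lines, whereas the actor encoding described above needs an extra confirmation round-trip for every push and pop.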
It also touches on another issue: for some problems, just one abstraction will inevitably lead to awkward usage patterns and unnecessary complexity for users, where the Red abstraction does not provide any natural way of solving the problem.

(Actor future state) I'll stay with the hypothetical actors-as-Red example for one more paragraph to support the previous claim. Suppose there is an application with some state and background processing. Actors support state, and events generated based on state changes, naturally. However, they are not the best choice for modelling background processing. An actor doing a background job isn't responsive to any messages during the execution, therefore the first step is always to break up the state actor and its background processing into two actors. Then the actor responsible for background processing is just a wrapper around an asynchronously executed function without any state, which might be better modelled by just a block executed on a thread pool for IO tasks or by a Future object. Another issue could arise if one general actor is used to process all the background jobs (which seems a good idea at first glance): the actor will become a bottleneck, allowing only one background job to execute at a time (tasks with blocking IO can also easily deadlock it). An easy fix is to introduce a pool of actors to process background jobs, but then again they will be slower than a shared-memory thread-pool implementation (a minimal sketch of such a shared-memory pool is attached at the end of this mail). Of course, all of these examples were for actors, not for Red. They are not directly applicable, but they show what kind of problems can be (I think unavoidably) anticipated for Red. It's not always possible to just throw more cores at the problem; the algorithm has to support such scaling.

Going back to user happiness, scenario A sacrifices the happiness of some users:

(group1) Concurrent-library implementers, because of (1) and (2). This is probably not the biggest group of users, but I think it's an important group since their work will be used by *many* users.

(group2) The second group is larger: it's users who would like to use Ruby to solve a problem where Red would be of limited help. These users will be looking for an alternative solution and will be disappointed that the choice will be severely limited, because group1 will not write new abstractions (admittedly this is just my projection).

Therefore scenario A does not have only a positive impact on user happiness (that applies to those users whose problems fit well to be solved by Red, probably a bigger group than group1 and group2 combined). Since Ruby is nowadays mostly used in long-running processes, not in scripts, performance becomes more important. In my observation, performance is the most common reason why people leave Ruby, not because they are unhappy with the language but because they pay too much for the servers to run their applications.

> I also propose problem issue (3).
> We need more time to discuss introducing a new abstraction.
> (The biggest problem is I couldn't propose Red specifications.)
> We need more learning cost and need to invent efficient patterns using Red.
>
> The thread model is well known (I don't say the thread model is easy to use :p).
> This is clearly an advantage of the thread model.
>
> I agree there are many issues (1 to 3, and more).
> But I believe the productivity by simplicity is most important (for me, a Ruby programmer).

It looks like for the purpose of this discussion I should know more about what is considered to become Red.
Later you mention sharing nothing; how would that work for classes, constants, method definitions, etc.? How would the isolated parts communicate with each other, by deep-freezing or by copying the messages? Are there any sources like talks or issues I could read? When I first heard about Red being planned, I was thinking about deep-freezing or deep-cloning to ensure messages cannot lead to shared-memory issues, Red being actors, channels etc., and isolation achieved only by convention and user education.

Yeah, (3) will take time; that's a common problem for both the A and B scenarios. B might be in a better situation though, because more people can get involved writing more abstractions until a winning one is picked and becomes part of the Ruby standard library.

> > > (B) Ruby would stay a general-purpose language with direct thread support and shared memory with a documented memory model. The low-level documentation would allow people (who are interested) to write different concurrent abstractions efficiently. One of them would become the standard and preferred way to deal with concurrency in Ruby. Let's call it Blue. The Blue abstraction would (as Red would) be part of the standard library. The same as Red, it would have well-documented and defined behavior in concurrent and parallel execution, but in this case based on the lower-level model. The documentation would be directed at all Ruby users and made as easy to understand as possible.
> >
> > > The majority of Ruby users would use Blue as the go-to abstraction, as they would use Red in scenario A. The key difference is that there is the low-level model to support the creation of new abstractions. Therefore when Blue cannot solve a particular issue, a Ruby user can pick a different concurrency abstraction created by somebody else and provided as a gem, or create a new one.
> >
> > I believe this would make the Ruby language more flexible.
>
> I agree it is flexible.
> However it will be error-prone if the shared-everything model is allowed.

Currently Ruby has shared memory; how would that be taken away? It would be a *huge* incompatible change, I believe. Yeah, it is difficult to use, but I would like to stress that it's only for group1 (mentioned above). Most users would not have to deal with it, since they'll use just one of the available abstractions (Blue being the most common one and the one Ruby advises to be used).

> > ### Difficulty of understanding
> >
> > This is something I believe can be improved over time. Also, as mentioned above, it's not intended to be used by everyone. Could you point me to parts which are not understandable, or lack explanation? I would like to improve them, to make the document more comprehensible.
> >
> > The document is intentionally not as detailed and formal as JSR-133, to keep it understandable. The price, as you say (and I agree), is in details and omissions which may be left unspecified. I believe the high-level documentation for Red will unfortunately suffer the same problem of evil details.
> >
> > If the memory model is reviewed by many people and given some time to mature, I believe it will cover the majority of situations; omitted corner cases can be fixed later. I think the current situation, where each implementation has different rules, is much worse, and any document will improve the situation greatly.
>
> To point out, I need to read more carefully and try to implement with parallel threads.
> (evils will be in implementation details)

Thanks a lot, I really appreciate that you are looking at it in more detail and that you are willing to discuss this at length.

> > ### Difficulty of implementation
> >
> > (Various architectures) I am not a C programmer so I am not that well informed, but I believe that in C this is solved by the C11 standard and, before that, by various libraries. Can MRI use C11 in the future when it'll be dropping the GIL?
>
> Not sure, sorry.
> From the CPU-architecture side, there are several overheads for strong memory consistency.

Yeah, there are, but comparing GIL vs no GIL, running on all cores with some slight overhead is advantageous.

> > (Atomic Float) I agree that it is more difficult when Floats are required to be atomic, but if they were not, it would be quite surprising to Ruby users that a simple reference assignment of a Float object (as it's represented in Ruby) is not atomic. Therefore this is chosen to be atomic purely not to surprise users and to avoid educating users about torn reads/writes. The same applies to a Fixnum which is bigger than an int and fits into a long (using Java primitive names here). Even though this is more difficult, I think it makes sense to protect users from having to worry about torn reads/writes. The implementation itself should be trivial on all 64-bit platforms; only 32-bit platforms will require some tricks. This [1] post suggests that it can be done.

Atomic float and C11: as far as I know, if it's declared as an atomic float but the load and store operations are done with memory_order_relaxed, then it keeps the atomicity property without any ordering constraints, and therefore without a performance overhead (there might be exceptions, but even 32-bit platforms can use some tricks like SSE instructions to make the float atomic without overhead).

> I assume that there are pros and cons about performance.
>
> Shared-everything model (thread model):
> * Pros: we can share everything easily.
> * Cons: requires fine-grained consistency control for some data structures to guarantee the memory model.
>
> Shared-nothing model (Red):
> * Pros: no need to care about fine-grained memory consistency.
> * Cons: we can't implement shared data structures in Ruby (sometimes this can be a performance overhead).

I am sorry, but I'm not sure how to interpret the comparison. It's important to distinguish where the pros and cons apply. In the thread-model case, the pros apply to the users and the cons apply to the Ruby implementers. In Red, the pros apply to the implementers and the cons to the Ruby users. Shared-everything comes out better in this comparison when emphasizing users. Regarding the implementer's point of view, I appreciate the amount of work and complexity this will be creating. I am part of the JRuby+Truffle team and we would have to comply with and deal with the RMM too. Still, I believe it's worth the effort.

> > (Strict rules) The document tries to be balanced between restricting optimisation and creating ugly surprises for users. I am expecting there will be more discussion about the rules:
> >
> > - How to implement it on all Ruby implementations?
> > - Will it prevent any optimisations?
> > - Will it expose unexpected behaviour to users?
> >
> > The document is really just a first draft, and everything is open for discussion and improvement, both of which I hope for. It was prepared not to limit any of the Ruby implementations, but problems can be missed; if it turns out a rule is too strict, it can be relaxed.
> >
> > (MRI with GIL) Yes, MRI already provides all of the guarantees specified, thanks to the GIL. It's even stronger. On the other hand, if I understand correctly, MRI is looking for ways to remove the GIL, and the fact that the GIL provides stronger undocumented guarantees makes this difficult. Users rely on it (intentionally or unintentionally) even though they shouldn't. Having a document describing what is guaranteed and what is not may make for an easier transition to MRI without the GIL in the future.
>
> I agree Ruby programmers can rely on GIL guarantees (and it is not good for other implementations).

Yeah, alternative implementations may already suffer from the issue of users relying on the GIL. In practice it may not be that bad though, at least in code which is meant to be run concurrently or in parallel. These libraries tend to use slower Mutexes to stay safe, because instance variables do not have precisely defined behavior. (This is just my personal view; we should ask Charles and Tom how often this came up in their issues.)

> BTW, such a strong GIL guarantee helps protect people from some kinds of thread-safety bugs.
> ("helps" means decreasing the bug appearance rate. As you wrote, it is also a "bad" thing.)
>
> > ### In Conclusion
> >
> > I hope that maybe I've changed your mind a little bit about the B scenario and this proposal, and that we could discuss more the issues this model could bring for MRI. I would like to help to solve them, or avoid them by relaxing rules.
> >
> > I believe that if this model (or its evolved successor) is accepted in all Ruby implementations over time, it will help the Ruby language a lot to be prepared for concurrency and parallelism, which is nowadays non-optional.
> >
> > [1] http://shipilev.net/blog/2014/all-accesses-are-atomic/
>
> I don't change my mind.
> I believe simplicity is more important than flexibility.

I am of the same opinion that simplicity is important for users; however, I think we (the whole Ruby community, no matter the implementation) could have both simplicity and flexibility.

> However, your comments clear up many kinds of things.
> I agree that many people agree with you.
>
> Again, my comment is only my own thoughts.
> I'm not against the B scenario for other implementations, and for MRI if someone contributes.

To sum up regarding contribution, headius was so kind as to offer to work on the accompanying proposals, since he has experience with C which I have only to a limited degree. I think the current form of the Ruby Memory Model fits MRI with GIL, so no contribution should be needed. I suppose you meant contributing work on removing the GIL and ensuring RMM compliance? Benoit Daloze (eregon) and I will gladly help to find solutions if needed.

> Actually, Matz once said he wants to go with the B scenario.
> He proposed Actors on top of threads (people should take care when modifying objects across actors (threads)).
> The same approach as Celluloid.
> But I'm against it :p (and Matz said he agreed with me when I asked. I'm not sure about his current idea.)

We could also scope down the discussion to just the most important parts of the RMM, which (I think) are local and instance variables; the rest of the related proposals are mostly related to them. I've also posted a comment to https://bugs.ruby-lang.org/issues/12021, which provides an example of how the low-level model could be used to support a simple and nice high-level behavior of Proc.
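P.S. The shared-memory thread pool mentioned in the background-processing example above, as a minimal sketch. The `BackgroundPool` name and the sizes are illustrative only; concurrent-ruby's thread pools (e.g. Concurrent::FixedThreadPool) are a production-quality equivalent.

```ruby
# Minimal sketch (illustrative only): a fixed-size pool of worker threads
# pulling background jobs from a shared Queue.
class BackgroundPool
  def initialize(size)
    @jobs    = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; a nil job is the shutdown signal.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # Enqueue a block to be run by one of the workers.
  def post(&job)
    @jobs << job
  end

  # Wake every worker with a nil job and wait for them to finish.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

pool = BackgroundPool.new(4)
10.times { |i| pool.post { puts "job #{i} ran on #{Thread.current.object_id}" } }
pool.shutdown
```

An actor-based pool needs the same number of workers plus a dispatching actor and the message hops between them, which is where the slowdown I referred to comes from.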
----------------------------------------
Feature #12020: Documenting Ruby memory model
https://bugs.ruby-lang.org/issues/12020#change-58750

* Author: Petr Chalupa
* Status: Assigned
* Priority: Normal
* Assignee: Koichi Sasada
----------------------------------------
Defining a memory model for a language is necessary to be able to reason about program behavior in a concurrent or parallel environment. A document was created describing a Ruby memory model for the concurrent-ruby gem which fits several Ruby language implementations. It was necessary in order to be able to build a lower-level unifying layer that enables the creation of concurrency abstractions. They can be implemented only once against the layer, which ensures that they run on all Ruby implementations. The Ruby MRI implementation has stronger undocumented guarantees than the memory model because of the GIL semantics, but the few relaxations from MRI's behavior allow other implementations to fit the model as well and to improve performance.

This issue proposes to document the Ruby memory model. The above-mentioned memory model document which was created for concurrent-ruby can be used as a starting point: https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit#. Please comment in the document or here.

The aggregating issue of this effort can be found [here](https://bugs.ruby-lang.org/issues/12019).

--
https://bugs.ruby-lang.org/