From: email@...
Date: 2016-04-20T17:40:07+00:00
Subject: [ruby-core:75047] [CommonRuby Feature#12020] Documenting Ruby memory model

Issue #12020 has been updated by Petr Chalupa.

Thank you for responding and for taking the time to read the proposal. Let me start by elaborating on the motivation behind all of the related proposals, since I did not really explain it in detail when I opened them. I apologise for not doing that sooner.

### Motivation

I would like to clear up a possible misunderstanding about the target users of this document and this memory model. It is not intended to be used directly by the majority of Ruby programmers. (Even though the document aims to be understandable, it will still be a difficult topic.) It is intended for concurrency enthusiasts, giving them the tools to build many different concurrency abstractions as gems.

At this point Ruby is a general-purpose language with direct support for Threads and shared memory. As was announced in a few presentations, there are plans to add a new easy-to-use abstraction to Ruby in some future release and maybe deprecate Threads. Let's call this scenario A. (Block quotes are used for better logical structure.)

> (A) I understand the need to add such an abstraction (actors, channels, other?) to Ruby to enable Ruby users to build concurrent applications with ease. For future reference, let's call this one future abstraction Red. Red would then have well-documented and well-defined behaviour in concurrent and parallel execution. (This is what I think you are referring to.) However, providing just one abstraction in the standard library (and deprecating Threads) will hurt the usability of the Ruby language.
>
> The problem is that there is no single concurrency abstraction that fits all problems. Therefore providing just Red will leave the Ruby language suitable for only a subset of problems.
>
> Assuming: only Red would be documented and provide high-level guarantees; threads would be deprecated; low-level concurrency would be neither documented nor guaranteed. Developers who would like to contribute a new abstraction to solve another group of problems would be left with the following (I think not very good) choices:
>
> > (1) Implement the abstraction in the underlying language of each particular Ruby implementation (C for MRI, Java for JRuby(+Truffle)), using the guarantees provided by that underlying language. This means the author of the new abstraction has to understand 3 programming languages (C, Ruby, Java) and 3 implementations, and develop the implementation 3 times. That would discourage people and also make the whole process error-prone and difficult.
> >
> > (2) Implement the abstraction using Red. This approach gives users the desired abstraction (avoiding different languages and implementation details), but it will probably perform badly, since Red is not suited to solving this problem. For example, implementing a ConcurrentHashMap (allowing parallel reads and writes) with actors would perform badly. (Admittedly this is a slightly extreme example, but it demonstrates the problem and I could not think of a better one.)

The above is, to the best of my knowledge, where Ruby is heading in the future; please correct me if I misunderstood and/or misrepresented it in any way.

To avoid the difficulties outlined above, Ruby could take a different path, which is related to these proposals (or their evolved successors).

> (B) Ruby would stay a general-purpose language with direct thread support and shared memory, with a documented memory model. The low-level documentation would allow people (who are interested) to write different concurrency abstractions efficiently. One of them would become the standard and preferred way to deal with concurrency in Ruby. Let's call it Blue. The Blue abstraction would (as Red would) be part of the standard library.
> Like Red, it would have well-documented and well-defined behaviour in concurrent and parallel execution, but in this case based on the lower-level model. The documentation would be directed at all Ruby users and made as easy to understand as possible.
>
> The majority of Ruby users would use Blue as the go-to abstraction, just as they would use Red in scenario A. The key difference is that there is a low-level model to support the creation of new abstractions. Therefore, when Blue cannot solve a particular issue, a Ruby user can pick a different concurrency abstraction created by somebody else and provided as a gem, or create a new one. I believe this would make the Ruby language more flexible.

### Difficulty of understanding

This is something I believe can be improved over time. Also, as mentioned above, it is not intended to be used by everyone. Could you point me to the parts which are not understandable or lack explanation? I would like to improve them to make the document more comprehensible.

The document is intentionally not as detailed and formal as JSR-133, to keep it understandable. The price, as you say and I agree, lies in details that may be omitted or left unspecified. I believe the high-level documentation for Red will unfortunately suffer from the same problem of evil details. If the memory model is reviewed by many people and given some time to mature, I believe it will cover the majority of situations; omitted corner cases can be fixed later. I think the current situation, where each implementation has different rules, is much worse, and any document will improve it greatly.

### Difficulty of implementation

(Various architectures) I am not a C programmer, so I am not that well informed, but I believe that in C this is solved by the C11 standard, and before that by various libraries. Can MRI use C11 in the future when it drops the GIL?

(Atomic Float) I agree that it is more difficult when Floats are required to be atomic, but if they were not, it would be quite surprising to Ruby users that a simple reference assignment of a Float object (as it is represented in Ruby) is not atomic. Therefore this was chosen to be atomic purely to avoid surprising users and having to educate them about torn reads/writes. The same applies to a Fixnum which is bigger than int and fits into long (using Java primitive names here). Even though this is more difficult, I think it makes sense to protect users from having to worry about torn reads/writes. The implementation itself should be trivial on all 64-bit platforms; only 32-bit platforms will require some tricks. This post [1] suggests that it can be done.

(Strict rules) The document tries to strike a balance between restricting optimisation and creating ugly surprises for users. I expect there will be more discussion about the rules:

- How can it be implemented on all Ruby implementations?
- Will it prevent any optimisations?
- Will it expose unexpected behaviour to users?

The document is really just a first draft, and everything is open for discussion and improvement, both of which I hope for. It was prepared so as not to limit any of the Ruby implementations, but problems can be missed; if it turns out a rule is too strict, it can be relaxed.

(MRI with GIL) Yes, MRI already provides all of the specified guarantees thanks to the GIL. Its guarantees are even stronger. On the other hand, if I understand correctly, MRI is looking for ways to remove the GIL, and the fact that the GIL provides stronger undocumented guarantees makes this difficult. Users rely on them (intentionally or unintentionally) even though they shouldn't. Having a document describing what is and is not guaranteed may make a future transition to MRI without the GIL easier.

### In Conclusion

I hope that I have changed your mind a little about scenario B and this proposal, and that we can further discuss the issues this model could bring for MRI.
I would like to help solve them, or avoid them by relaxing the rules. I believe that if this model (or its evolved successor) is accepted by all Ruby implementations over time, it will greatly help prepare the Ruby language for concurrency and parallelism, which are no longer optional.

[1] http://shipilev.net/blog/2014/all-accesses-are-atomic/

----------------------------------------
Feature #12020: Documenting Ruby memory model
https://bugs.ruby-lang.org/issues/12020#change-58170

* Author: Petr Chalupa
* Status: Assigned
* Priority: Normal
* Assignee: Koichi Sasada
----------------------------------------
Defining a memory model for a language is necessary to be able to reason about a program's behavior in a concurrent or parallel environment.

A document describing a Ruby memory model was created for the concurrent-ruby gem; the model fits several Ruby language implementations. It was needed in order to build a lower-level unifying layer that enables the creation of concurrency abstractions. They can be implemented only once against the layer, which ensures that they run on all Ruby implementations.

Because of GIL semantics, the Ruby MRI implementation has stronger undocumented guarantees than the memory model, but the few relaxations from MRI's behavior allow other implementations to fit the model as well and to improve performance.

This issue proposes to document the Ruby memory model. The above-mentioned memory model document, which was created for concurrent-ruby, can be used as a starting point: https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit#. Please comment in the document or here.

The aggregating issue of this effort can be found [here](https://bugs.ruby-lang.org/issues/12019).

--
https://bugs.ruby-lang.org/