From: ko1@...
Date: 2020-10-31T19:05:47+00:00
Subject: [ruby-core:100681] [Ruby master Feature#17298] Ractor's basket communication APIs

Issue #17298 has been updated by ko1 (Koichi Sasada).

Eregon (Benoit Daloze) wrote in #note-4:

> For the first example, isn't `move: true` much simpler?

There are two problems:

* Currently, many types are not supported by `move: true`. I'll investigate more, but I'm not sure we can support all data types.
* To move an object, we need to traverse it each time and make a shallow copy of it. For example, we need to traverse a nested array.

> Having `move: true` on `Ractor.yield` means we need to wait to know which Ractor to move the object to, but that seems fine and transparent to the user.
> The bridge Ractor can ensure it will no longer refer to the message, so it's perfectly safe to move there.

It can be, if we analyze cross-method escaping, but such analysis is not supported yet.

```ruby
bridge = Ractor.new do
  msg = Ractor.receive
  Ractor.yield msg, move: true
end
```

> Adding 4 new methods for this seem heavy to me. Also, exposing the serialized representation seems bad.

The "serialized representation" is not exposed. The API only says that the bridge ractor doesn't touch the message.

> Explicitly deeply freezing the message (e.g., with Ractor.make_shareable, or some kwarg to freeze on send/yield) seems a good way too, and easier to reason about.
> Sending a copy of a mutable object seems useless to me, because mutations will not affect the sender, it only mutates a copy.
> So either freezing or moving seems more useful, and both avoid copying.

I agree with adding an option to `send` and `yield` that calls `make_shareable`. It is already in my working memo. However, that doesn't mean we can avoid copying. I believe sending a copy is the most used option because it has no difficulties: the sender does not need to care about accessing the sent object after sending.
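The difference between the two options discussed above can be seen in a small sketch. It assumes Ruby 3.0 or later, and it uses a `Marshal` round trip as a stand-in for the copy made when a message is sent (an assumption for illustration; no ractor is actually created here). Only `Ractor.make_shareable` and `Ractor.shareable?` are real Ractor APIs in this block.

```ruby
# Copy semantics, approximated with a Marshal round trip
# (assumption: this stands in for the copy made by Ractor#send).
original = [[1, 2], [3, 4]]
copy = Marshal.load(Marshal.dump(original))
copy[0] << 99 # mutating the copy...
original_untouched = (original == [[1, 2], [3, 4]]) # ...leaves the sender's object alone

# Share-by-freezing semantics: the whole object graph is frozen.
shared = Ractor.make_shareable([[1, 2], [3, 4]])
deeply_frozen = shared.frozen? && shared.all?(&:frozen?)
shareable = Ractor.shareable?(shared)

# After make_shareable, nobody (including the sender) can mutate the object.
mutation_error =
  begin
    shared[0] << 99
    nil
  rescue FrozenError => e
    e.class
  end

puts "original untouched after mutating the copy: #{original_untouched}"
puts "deeply frozen: #{deeply_frozen}, shareable: #{shareable}"
puts "mutation after make_shareable raised: #{mutation_error}"
```

With a copy, the sender can keep using and mutating its own object. With `make_shareable`, no copy is needed, but the sender gives up mutation as well, which is the side effect being debated in this thread.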
----
Eregon (Benoit Daloze) wrote in #note-5:

> From the benchmark above, USE_BASKET=false takes for me `5.636s`.
> Adding `Ractor.make_shareable(ary)`, it takes `0.192s`.
> So that's ~10x faster than with `w/ basket API` on this benchmark, and does not require new APIs and concepts.

Of course, it is faster.

> So I don't see the need for this API, better to freeze or to move, and both are more efficient.
> Copying is confusing anyway if mutated.

As I wrote, I believe copying should be the first option.

> Maybe all messages should be made `Ractor.make_shareable` by `Ractor#send`/`Ractor.yield`, unless `move: true` is used?
> Then there would be no confusion about mutations, and there would be no hidden cost to sending a message (copying a big data structure can take a long time as we see here).
> Moving still has a cost proportional to the #objects in the object graph, which seems unavoidable, but at least it does not need to copy e.g. the array storage.

It can be one option, but I think it introduces a huge side effect.

> Do other languages have something similar to the basket API?

I don't know of any. As far as I know:

* copying is used for IPC
* immutable data is used for traditional actor models
* only references (pointers) are used for the shared-everything model

I don't know of another model that mixes them as ours does.

----
Eregon (Benoit Daloze) wrote in #note-6:

> In the example above, using `, move: true` for Ractor.yield and Ractor#send, instead of the `_basket` calls seems to give the same or better performance.
> (code at https://gist.github.com/eregon/092ea76534b46e227d9cbf5fd107de66)
> It runs in 1.578s for me.

```ruby
AN = 1_000
LN = 100
ary = Array.new(AN){Array.new(AN)}
```

With the parameters changed as above, it takes:

```
use basket: 0m6.047s
use move  : 0m8.018s
```

There is a small difference.

> And it requires less modifications than the `_basket` APIs which need both sides to know about it, not just the sending side.

In general, the proposed APIs should be hidden inside frameworks, I think.
----------------------------------------
Feature #17298: Ractor's basket communication APIs
https://bugs.ruby-lang.org/issues/17298#change-88322

* Author: ko1 (Koichi Sasada)
* Status: Open
* Priority: Normal
----------------------------------------
This ticket proposes the send_basket/receive_basket and yield_basket/take_basket APIs to make efficient and flexible bridge ractors.

## Background

When we want to send an object as a message, we usually need to copy it. Copying is done with the marshal protocol, and the receiver loads the message immediately.

If we want to make a bridge ractor that receives a message and sends it to another ractor, the immediate loading is inefficient.

```ruby
bridge = Ractor.new do
  Ractor.yield Ractor.receive
end

consumer = Ractor.new bridge do |from|
  obj = from.take
  do_task(obj)
end

msg = [1, 2, 3]
bridge.send msg
```

In this case, the array (`[1, 2, 3]`) is

* (1) dumped at the first `bridge.send msg`
* (2) loaded at `Ractor.receive`
* (3) dumped again at `Ractor.yield`
* (4) loaded at `from.take`

Essentially we need only one dump/load pair, but currently two pairs are needed.

Mixing in "moving" is more complex. Currently there is no way to pass the "moving" status to bridge ractors, so we cannot make a moving bridge.

## Proposal

To make more efficient and flexible bridge ractors, we propose new basket APIs:

* `Ractor.receive_basket`
* `Ractor#send_basket`
* `Ractor#take_basket`
* `Ractor.yield_basket`

They receive a message but keep it in its dumped state, and send it without dumping it again.

We can rewrite the above example with these APIs.

```ruby
bridge = Ractor.new do
  Ractor.yield_basket Ractor.receive_basket
end

consumer = Ractor.new bridge do |from|
  obj = from.take
  do_task(obj)
end

msg = [1, 2, 3]
bridge.send msg
```

In this case, the array is

* (1) dumped at the first `bridge.send msg`
* (2) loaded at `from.take`

so we need only one dump/load pair.
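The dump/load counting above can be reproduced without ractors in a minimal sketch. It models the message copy with plain `Marshal` calls, which is an analogy and not the actual Ractor implementation. `CountingCodec`, `PLAIN_BRIDGE`, `BASKET_BRIDGE`, and `deliver` are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical helper: wraps Marshal and counts how often each side runs.
class CountingCodec
  attr_reader :dumps, :loads

  def initialize
    @dumps = 0
    @loads = 0
  end

  def dump(obj)
    @dumps += 1
    Marshal.dump(obj)
  end

  def load(bytes)
    @loads += 1
    Marshal.load(bytes)
  end
end

# Models `Ractor.yield Ractor.receive`: the bridge opens the message and re-packs it.
PLAIN_BRIDGE = ->(codec, bytes) { codec.dump(codec.load(bytes)) }

# Models the basket idea: the bridge forwards the still-dumped payload untouched.
BASKET_BRIDGE = ->(_codec, bytes) { bytes }

# Sender -> bridge -> consumer; returns the received object and the counters.
def deliver(msg, bridge)
  codec = CountingCodec.new
  bytes = codec.dump(msg)           # sender side (like `bridge.send msg`)
  bytes = bridge.call(codec, bytes) # bridge side
  received = codec.load(bytes)      # consumer side (like `from.take`)
  [received, codec.dumps, codec.loads]
end

plain_result  = deliver([1, 2, 3], PLAIN_BRIDGE)
basket_result = deliver([1, 2, 3], BASKET_BRIDGE)

p plain_result  # => [[1, 2, 3], 2, 2]
p basket_result # => [[1, 2, 3], 1, 1]
```

Each additional plain bridge in a chain adds one more dump/load pair, while a chain of basket-style bridges stays at one pair in total.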
## Implementation

https://github.com/ruby/ruby/pull/3725

## Evaluation

The following program makes 4 types of bridges and passes an array as a message through them.

```ruby
USE_BASKET = false

receive2yield = Ractor.new do
  loop do
    if USE_BASKET
      Ractor.yield_basket Ractor.receive_basket
    else
      Ractor.yield Ractor.receive
    end
  end
end

receive2send = Ractor.new receive2yield do |r|
  loop do
    if USE_BASKET
      r.send_basket Ractor.receive_basket
    else
      r.send Ractor.receive
    end
  end
end

take2yield = Ractor.new receive2yield do |from|
  loop do
    if USE_BASKET
      Ractor.yield_basket from.take_basket
    else
      Ractor.yield from.take
    end
  end
end

take2send = Ractor.new take2yield, Ractor.current do |from, to|
  loop do
    if USE_BASKET
      to.send_basket from.take_basket
    else
      to.send from.take
    end
  end
end

AN = 1_000
LN = 10_000

ary = Array.new(AN) # 1000
LN.times{
  receive2send << ary
  Ractor.receive
}

# This program passes the message as:
#   main ->
#   receive2send ->
#   receive2yield ->
#   take2yield ->
#   take2send ->
#   main
```

The result is:

```
w/ basket API   0m2.056s
w/o basket API  0m5.974s
```

on my machine (=~ x3 faster).

(BTW, if we have a TVar, we can change the value of `USE_BASKET` dynamically.)

## Discussion

### naming

Of course, naming is an issue. For now I named it "_basket" because the source code uses this terminology. There are other candidates:

* container metaphor
  * package
  * parcel
  * box
  * envelope
  * packet (maybe a bad idea because of confusion with networking)
  * bundle (maybe a bad idea because of confusion with bin/bundle)
* "don't touch the content" metaphor
  * raw
  * sealed
  * unopened

I like "basket" because I like picnics.

### feature

Currently, a basket is represented by "Ractor::Basket", which has no methods. We can add the following features:

* `Ractor::Basket#sender` returns the sending ractor.
* `Ractor::Basket#sender = a_ractor` changes the sending ractor.
* `Ractor::Basket#value` returns the content.

There was another proposal, `Ractor.recvfrom`, but we only need these APIs.