From: eregontp@...
Date: 2020-11-26T15:29:54+00:00
Subject: [ruby-core:101099] [Ruby master Feature#17342] Hash#fetch_set

Issue #17342 has been updated by Eregon (Benoit Daloze).


MaxLap (Maxime Lapointe) wrote in #note-9:
> The Hash that we are using is not always under our control. In my second example, I'm using the `request_store` gem, which exposes as `Hash` (`RequestStore.store`), as cache.

I see, https://github.com/steveklabnik/request_store, so you might want a different block per key essentially, and so a single default block isn't enough.

> It would be quite dirty to monkey patch this.

There is `Hash#default_proc=` but it would be dirty in this case indeed.

> I don't understand what you mean, how is it breaking that? You mean in the case of threading, where there could be 2 assignments if 2 threads go in at the same time? I don't think it's the job of the Hash to deal with this.

Yes, exactly.
Many gems assume Hash is thread-safe, so in my opinion it needs to be thread-safe, and it is at least in CRuby.
Even without threads you need to consider cases like the block itself would already set the key, and you shouldn't set it again.

> I don't consider Hash to be a concurrency primitive in Ruby. So I wouldn't put any synchronization here. If synchronization is needed, it can be done from inside the block (or the block of the classic `fetch` pattern).

This method fits very clearly for caches, and caches are typically used for multiple thread, so I think we should consider the concurrency semantics.
Concurrency can also happen for Hash when mutating while iterating, or some methods can be called recursively, so even without threads it needs to be considered.
Agreed no synchronization when calling the block is better, and the block can make its own synchronization to avoid running twice per key if really needed.

On CRuby, the GIL will be enough to do a check if the key is already set just before setting it from the result of the block, but the code needs to be careful there is no Ruby call between that check and assignment, or it would break with multiple threads.

----------------------------------------
Feature #17342: Hash#fetch_set
https://bugs.ruby-lang.org/issues/17342#change-88776

* Author: MaxLap (Maxime Lapointe)
* Status: Open
* Priority: Normal
----------------------------------------
I would like to propose adding the `fetch_set` method to `Hash`. It behaves just like `fetch`, but when using the default value (2nd argument or the block), it also sets the value in the Hash for the given key.

We often use the pattern `cache[key] ||= calculation`. This pattern however has a problem when the calculation could return false or nil, as in those case, the calculation is repeated each time.

I believe the best practice in that case is:

```ruby
cache.fetch(key) { cache[key] = calculation }
```

With my suggestion, it would be:

```ruby
cache.fetch_set(key) { calculation }
```

In these examples, each part is very short, so the `fetch` case is still clean. But as each part gets longer, the need to repeat cache[key] becomes more friction.

Here is a more realistic example:

```ruby
# Also using the key argument to the block to avoid repeating the
# long symbol, adding some indirection
RequestStore.store.fetch(:monitor_value_is_delayed?) do |key|
  RequestStore.store[key] = !MonitorValue.where('date >= ?', Time.now - 5.minutes).exists?
end

RequestStore.store.fetch_set(:monitor_value_is_delayed?) do
  !MonitorValue.where('date >= ?', Time.now - 5.minutes).exists?
end
```

There is a precedent for such a method: Python has it, but with a quite confusing name: `setdefault(key, default_value)`. This does not set a default for the whole dictionary as the name would make you think, it really just does what is proposed here. https://docs.python.org/3/library/stdtypes.html#dict.setdefault


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>