From: hunter_spawn@...
Date: 2020-11-26T14:53:07+00:00
Subject: [ruby-core:101096] [Ruby master Feature#17342] Hash#fetch_set

Issue #17342 has been updated by MaxLap (Maxime Lapointe).


I forgot to mention, but this pattern can also be more performant, as the key only needs to be hashed once for both the fetching and the setting part. It's minor, but I do think of it every time I write the `fetch` pattern, and when the key is complex, this could be meaningful. (Dang, Eregon beat me to this by 2 minutes!)

---

nobu (Nobuyoshi Nakada) wrote in #note-6:
> Why not a separate class?
> ```ruby
> class Cache < Hash
>   def fetch(key, &block)
>     super(key) {self[key] = yield(key)}
>   end
> end
> ```

The Hash that we are using is not always under our control. In my second example, I'm using the `request_store` gem, which exposes a `Hash` (`RequestStore.store`), as a cache. It would be quite dirty to monkey patch this.

I could add that custom Hash in the `store`, but that would lose the cleaner and shorter code benefits that I'm trying to achieve with this feature request.

Using such a class is also less performant rather than more. (This is a very minor point.)

Also: If I see `cache.fetch(key) { calculation }` somewhere, I will instantly be worried:
* Did the person forget to set the key?
* Is it only set elsewhere? => Oh wait, it's a custom class that does something different.
* Will one of my colleagues copy this pattern without remembering that this is a different class?

So if I were to make a different class, I would still use a different name just to avoid the friction it would cause.

---

jbeschi (jacopo beschi) wrote in #note-7:
> `fetch_set` mixes the concept of query with the concept of command and I think it's not a good approach.

Maybe in name it appears so, but not in spirit. All this does is set a value if one isn't already there, and in common Ruby spirit, it returns something that can be useful, which is the value at the key.
This is where the Python name for the function comes from: `setdefault` sets a value "by default" for a single key, and is often used to replace `cache[key] ||= []` since Python doesn't support this syntax.

---

Eregon (Benoit Daloze) wrote in #note-8:
> Another name for this is `compute_if_absent`.

True, however my thinking is that this could also be used with a 2nd argument, like `fetch`, in which case there is no "computing" to speak of.

> Another way to do this pattern is to use the default block:
> ```ruby
> RequestStore.store = Hash.new do |h, k|
>   h[k] = !MonitorValue.where('date >= ?', Time.now - 5.minutes).exists?
> end
>
> RequestStore.store[:monitor_value_is_delayed?]
> ```
>
> Which already works fine.
> And it has the advantage that if multiple places want to read from the Hash they don't have to repeat the code.
> Is there a case this pattern wouldn't work and where `Hash#fetch_set` would work?

RequestStore is used for lots of different things. Setting a default like that means that if I use it for `RequestStore.store[:last_count]` and that wasn't set, I would instead be doing this MonitorValue check. And I can only use your pattern once per Hash.

> This pattern can be made to work with parallelism too, see [Idiomatic Concurrent Hash Operations](https://eregon.me/blog/assets/research/thesis-thread-safe-data-representations-in-dynamic-languages.pdf), page 83.
>
> Regarding concurrency and parallelism, we need to define the semantics if we add this method.
>
> Of course, the assignment should not be performed if there is already a key, it must be "put if absent" semantics
> (`cache.fetch(key) { cache[key] = calculation }` is actually breaking that).

I don't understand what you mean, how is it breaking that? Do you mean in the case of threading, where there could be 2 assignments if 2 threads go in at the same time? I don't think it's the job of the Hash to deal with this.
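
To make the single-threaded semantics concrete, here is a minimal plain-Ruby sketch. `fetch_set_sketch` is a hypothetical helper standing in for the proposed method (it is not part of Ruby), it only covers the block form, and it does no synchronization.

```ruby
# Hypothetical helper, NOT part of Ruby: stands in for the proposed
# Hash#fetch_set (block form only) without monkey patching Hash.
def fetch_set_sketch(hash, key)
  # Hash#fetch only runs its block when the key is absent, so a stored
  # nil or false is returned as-is and never overwritten.
  hash.fetch(key) { hash[key] = yield(key) }
end

cache = {}

# With ||=, a nil (or false) result is recomputed on every call.
or_equals_calls = 0
2.times { cache[:a] ||= (or_equals_calls += 1; nil) }

# With the sketch, the block runs once, even though it returned nil.
fetch_set_calls = 0
2.times { fetch_set_sketch(cache, :b) { fetch_set_calls += 1; nil } }

# An existing value is kept; the second block is never called.
first  = fetch_set_sketch(cache, :c) { 1 }
second = fetch_set_sketch(cache, :c) { 2 }
```

If two threads raced on the same missing key, this sketch could run the block twice, which is exactly the trade-off being discussed here.
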
> The question is whether the given block can be executed multiple times for a given key.
> If not, it requires synchronization while calling the block, which can lead to deadlocks.
> If yes, it doesn't require synchronization while calling the block which seems safer, but it means the block can be called multiple times.

I don't consider Hash to be a concurrency primitive in Ruby, so I wouldn't put any synchronization here. If synchronization is needed, it can be done from inside the block.

----------------------------------------
Feature #17342: Hash#fetch_set
https://bugs.ruby-lang.org/issues/17342#change-88773

* Author: MaxLap (Maxime Lapointe)
* Status: Open
* Priority: Normal
----------------------------------------
I would like to propose adding the `fetch_set` method to `Hash`. It behaves just like `fetch`, but when using the default value (2nd argument or the block), it also sets the value in the Hash for the given key.

We often use the pattern `cache[key] ||= calculation`. This pattern however has a problem when the calculation could return false or nil, as in those cases, the calculation is repeated each time.

I believe the best practice in that case is:
```ruby
cache.fetch(key) { cache[key] = calculation }
```

With my suggestion, it would be:
```ruby
cache.fetch_set(key) { calculation }
```

In these examples, each part is very short, so the `fetch` case is still clean. But as each part gets longer, the need to repeat `cache[key]` adds more friction.

Here is a more realistic example:
```ruby
# Also using the key argument to the block to avoid repeating the
# long symbol, adding some indirection
RequestStore.store.fetch(:monitor_value_is_delayed?) do |key|
  RequestStore.store[key] = !MonitorValue.where('date >= ?', Time.now - 5.minutes).exists?
end

RequestStore.store.fetch_set(:monitor_value_is_delayed?) do
  !MonitorValue.where('date >= ?', Time.now - 5.minutes).exists?
end
```

There is a precedent for such a method: Python has it, but with a quite confusing name: `setdefault(key, default_value)`. This does not set a default for the whole dictionary as the name would make you think; it really just does what is proposed here.
https://docs.python.org/3/library/stdtypes.html#dict.setdefault



--
https://bugs.ruby-lang.org/

Unsubscribe: