From: jean.boussier@...
Date: 2021-05-20T06:46:09+00:00
Subject: [ruby-core:103905] [Ruby master Feature#16038] Provide a public WeakMap that compares by equality rather than by identity

Issue #16038 has been updated by byroot (Jean Boussier).

> Your example always allocates a new instance first, then deduplicates it.

Yes, for more complex cases that's kind of the only way. What I'm after here is mostly memory retention, not so much allocations.

> Why not:

Yes, I used this in some cases when the number of properties was low enough. But:

- The object needs to hold that string.
- That just doubled the number of retained objects, just like `WeakRef`.
- Not everything is well suited to be concatenated as a string like this.

And even for your approach, an equality-based WeakMap would offer a much cleaner interface:

```ruby
def new(x, y, z)
  REGISTRY[[x, y, z]] ||= super
end
```

----------------------------------------
Feature #16038: Provide a public WeakMap that compares by equality rather than by identity
https://bugs.ruby-lang.org/issues/16038#change-92033

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
I know `ObjectSpace::WeakMap` isn't really supposed to be used, and that the blessed interface is `WeakRef`. However, I'd like to make a case for a better public WeakMap.

### Usage

As described in [Feature #16035], `WeakMap` is useful for deduplicating "value objects".
A typical use case is as follows:

```ruby
class Position
  REGISTRY = {}
  private_constant :REGISTRY

  class << self
    def new(*)
      instance = super
      # Return the already-registered equal instance, if there is one.
      REGISTRY[instance] ||= instance
    end
  end

  attr_reader :x, :y, :z

  def initialize(x, y, z)
    @x = x
    @y = y
    @z = z
    freeze
  end

  def hash
    self.class.hash ^ x.hash >> 1 ^ y.hash >> 2 ^ z.hash >> 3
  end

  def ==(other)
    other.is_a?(Position) &&
      other.x == x &&
      other.y == y &&
      other.z == z
  end
  alias_method :eql?, :==
end

p Position.new(1, 2, 3).equal?(Position.new(1, 2, 3))
```

That's pretty much the pattern [I used in Rails to deduplicate database metadata and save lots of memory](https://github.com/rails/rails/blob/f3c68c59ed57302ca54f4dfad0e91dbff426962d/activerecord/lib/active_record/connection_adapters/deduplicable.rb).

The big downside here is that these value objects can't be GCed anymore, so this pattern is not viable in many cases.

### Why not use WeakRef

A couple of reasons. First, when using this pattern, the goal is to reduce memory usage, so having one extra `WeakRef` for every single value object is a bit counterproductive.

Second, it's a bit annoying to work with, as you have to constantly check whether the reference is still alive, and/or rescue `WeakRef::RefError`.

Often, these two complications make the tradeoff not worth it.

### Ruby 2.7

Since [Feature #13498], `WeakMap` is a bit more usable, as you can now use an interned string as the unique key, e.g.
```ruby
class Position
  REGISTRY = ObjectSpace::WeakMap.new
  private_constant :REGISTRY

  class << self
    def new(*)
      instance = super
      # The interned string is the identity-compared key.
      REGISTRY[instance.unique_id] ||= instance
    end
  end

  attr_reader :x, :y, :z, :unique_id

  def initialize(x, y, z)
    @x = x
    @y = y
    @z = z
    @unique_id = -"#{self.class}-#{x},#{y},#{z}"
    freeze
  end

  def hash
    self.class.hash ^ x.hash >> 1 ^ y.hash >> 2 ^ z.hash >> 3
  end

  def ==(other)
    other.is_a?(Position) &&
      other.x == x &&
      other.y == y &&
      other.z == z
  end
  alias_method :eql?, :==
end

p Position.new(1, 2, 3).equal?(Position.new(1, 2, 3))
```

That makes the pattern much easier to work with than dealing with `WeakRef`, but there is still that extra instance.

### Proposal

What would be ideal is a `WeakMap` that works by equality, so that the first snippet could simply replace `{}` with `WeakMap.new`.

Changing `ObjectSpace::WeakMap`'s behavior would cause issues, so I see two possibilities:

- The best IMO would be to have a new top-level `::WeakMap` be the equality-based map, and have `ObjectSpace::WeakMap` remain as a semi-private interface backing `WeakRef`.
- Or alternatively, `ObjectSpace::WeakMap` could have a `compare_by_equality` method (inverse of `Hash#compare_by_identity`) to change its behavior post instantiation.

I personally prefer the first one.
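
For comparison, here is a rough sketch of what emulating an equality-keyed weak registry with `WeakRef` looks like today, to illustrate the overhead described under "Why not use WeakRef". `WeakRefRegistry` and `fetch_or_store` are hypothetical names made up for this example; they are not part of Ruby or of this proposal.

```ruby
require "weakref"

# Hypothetical helper (illustration only): an equality-keyed registry
# whose values are held through WeakRef wrappers.
class WeakRefRegistry
  def initialize
    @refs = {} # key (compared with #eql? / #hash) => WeakRef
  end

  # Returns the live object registered under +key+, or registers the
  # object returned by the block.
  def fetch_or_store(key)
    if (ref = @refs[key])
      begin
        return ref.__getobj__
      rescue WeakRef::RefError
        # The referent was collected; fall through and register a new one.
      end
    end

    value = yield
    @refs[key] = WeakRef.new(value) # one extra retained object per entry
    value
  end
end

registry = WeakRefRegistry.new
a = registry.fetch_or_store([1, 2, 3]) { Object.new }
b = registry.fetch_or_store([1, 2, 3]) { Object.new }
p a.equal?(b)
```

Note that in this sketch the keys are still strongly retained and dead entries are never purged, so it keeps both drawbacks listed above: an extra `WeakRef` per value and explicit `WeakRef::RefError` handling.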