From: "mame (Yusuke Endoh)"
Date: 2021-12-09T09:33:08+00:00
Subject: [ruby-core:106585] [Ruby master Feature#16038] Provide a public WeakMap that compares by equality rather than by identity

Issue #16038 has been updated by mame (Yusuke Endoh).


Eregon (Benoit Daloze) wrote in #note-22:
> For the deduplication use-case as in the description, `WeakValuesMap` is possible too: ...
> Then that mapping stays in the map as long as the instance is referenced somewhere.
> We can't use the instance itself as the key, otherwise no mapping would ever GC, since the key is referenced strongly by the map.

As far as I understand from the description, what byroot wanted is to use the instance itself as the key.

BTW, is it really impossible for the JVM to implement CRuby's current WeakMap, in which both keys and values are weak? If the code written by byroot (WeakKeysMap with WeakRef values) works on the JVM, the same hack can also be used to implement WeakMap.

> As examples from TruffleRuby, the internal map for Symbols or for frozen literal/interned strings use a `WeakValuesMap`. They need to guarantee uniqueness, so as long as the value is alive it makes sure to avoid any duplicate value.
> `WeakKeysMap` is used to keep some extra data about an object outside of it, as long as the object is alive (e.g., the `eval` line offset for a given Source file object, the `excluded_descendants` in that rails PR).
> One can also build a weak set based on `WeakKeysMap`, by just using e.g. `true` as the value. This is used e.g. to track subclasses of a class in a weak manner.

Thanks, this summary is very clear to me.
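
To make the quoted weak-set idea concrete, here is a minimal sketch. It assumes CRuby 2.7 or later and uses the existing `ObjectSpace::WeakMap` (identity-based, weak keys and values) as a stand-in for TruffleRuby's internal `WeakKeysMap`. The names `IdentityWeakSet`, `Base`, `Child` and `SUBCLASSES` are made up for illustration.

```ruby
# Hypothetical helper, not a core class: a set that does not keep its
# members alive. Membership is by object identity, because
# ObjectSpace::WeakMap compares keys by identity.
class IdentityWeakSet
  def initialize
    @map = ObjectSpace::WeakMap.new
  end

  # `true` is an immediate value, so only the key's lifetime matters.
  # Storing `true` as a WeakMap value assumes Ruby 2.7+.
  def add(obj)
    @map[obj] = true
    self
  end
  alias_method :<<, :add

  def include?(obj)
    @map.key?(obj)
  end

  # Members that are still alive at the time of the call.
  def to_a
    @map.keys
  end
end

# Tracking subclasses weakly, as in the quoted example.
class Base
  SUBCLASSES = IdentityWeakSet.new

  def self.inherited(subclass)
    super
    SUBCLASSES << subclass
  end
end

class Child < Base; end

p Base::SUBCLASSES.include?(Child) # true while Child is still referenced
```

An anonymous subclass created with `Class.new(Base)` and then dropped could eventually disappear from the set, but when that happens depends on the GC, so the sketch does not try to demonstrate it.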
----------------------------------------
Feature #16038: Provide a public WeakMap that compares by equality rather than by identity
https://bugs.ruby-lang.org/issues/16038#change-95248

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
I know `ObjectSpace::WeakMap` isn't really supposed to be used, and that the blessed interface is `WeakRef`. However, I'd like to make a case for a better public WeakMap.

### Usage

As described in [Feature #16035], `WeakMap` is useful for deduplicating "value objects". A typical use case is as follows:

```ruby
class Position
  # Strong registry: every distinct value ever created stays reachable.
  REGISTRY = {}
  private_constant :REGISTRY

  class << self
    # Return the already-registered equal instance, if there is one.
    def new(*)
      instance = super
      REGISTRY[instance] ||= instance
    end
  end

  attr_reader :x, :y, :z

  def initialize(x, y, z)
    @x = x
    @y = y
    @z = z
    freeze
  end

  # Combine the class and all three coordinates.
  def hash
    self.class.hash ^ x.hash >> 1 ^ y.hash >> 2 ^ z.hash >> 3
  end

  def ==(other)
    other.is_a?(Position) &&
      other.x == x &&
      other.y == y &&
      other.z == z
  end
  alias_method :eql?, :==
end

p Position.new(1, 2, 3).equal?(Position.new(1, 2, 3)) # => true
```

That's pretty much the pattern [I used in Rails to deduplicate database metadata and save lots of memory](https://github.com/rails/rails/blob/f3c68c59ed57302ca54f4dfad0e91dbff426962d/activerecord/lib/active_record/connection_adapters/deduplicable.rb).

The big downside here is that these value objects can't be GCed anymore, so this pattern is not viable in many cases.

### Why not use WeakRef

A couple of reasons. First, when using this pattern, the goal is to reduce memory usage, so having one extra `WeakRef` for every single value object is a bit counterproductive.

Second, it's a bit annoying to work with, as you have to constantly check whether the reference is still alive, and/or rescue `WeakRef::RefError`.

Often, these two complications make the tradeoff not worth it.

### Ruby 2.7

Since [Feature #13498], `WeakMap` is a bit more usable, as you can now use an interned string as the unique key, e.g.:

```ruby
class Position
  # Weak registry, keyed by an interned string (compared by identity).
  REGISTRY = ObjectSpace::WeakMap.new
  private_constant :REGISTRY

  class << self
    def new(*)
      instance = super
      REGISTRY[instance.unique_id] ||= instance
    end
  end

  attr_reader :x, :y, :z, :unique_id

  def initialize(x, y, z)
    @x = x
    @y = y
    @z = z
    # String#-@ interns the string, so equal ids are the same object.
    @unique_id = -"#{self.class}-#{x},#{y},#{z}"
    freeze
  end

  # Combine the class and all three coordinates.
  def hash
    self.class.hash ^ x.hash >> 1 ^ y.hash >> 2 ^ z.hash >> 3
  end

  def ==(other)
    other.is_a?(Position) &&
      other.x == x &&
      other.y == y &&
      other.z == z
  end
  alias_method :eql?, :==
end

p Position.new(1, 2, 3).equal?(Position.new(1, 2, 3))
```

That makes the pattern much easier to work with than dealing with `WeakRef`, but there is still an extra instance per value object.

### Proposal

The ideal would be a `WeakMap` that works by equality, so that the first snippet could simply replace `{}` with `WeakMap.new`.

Changing `ObjectSpace::WeakMap`'s behavior would cause issues, so I see two possibilities:

  - The best IMO would be to have a new top-level `::WeakMap` be the equality-based map, and have `ObjectSpace::WeakMap` remain as a semi-private interface for backing `WeakRef`.
  - Alternatively, `ObjectSpace::WeakMap` could have a `compare_by_equality` method (the inverse of `Hash#compare_by_identity`) to change its behavior after instantiation.

I personally prefer the first one.
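
For comparison, the `WeakRef`-based workaround discussed above might look roughly like the sketch below. It is only an illustration of the trade-off, not a proposed implementation: `EqualityWeakRegistry`, `intern` and `Point` are hypothetical names, and the bucket layout is just one way to do it.

```ruby
require "weakref"

# Hypothetical helper, not part of Ruby: deduplicates objects by
# eql?/hash while holding them only through WeakRef wrappers.
class EqualityWeakRegistry
  def initialize
    # obj.hash => [WeakRef, ...]; one extra WeakRef per registered object.
    @buckets = Hash.new { |hash, key| hash[key] = [] }
  end

  # Returns the live registered object that is eql? to obj,
  # registering obj itself if there is none.
  def intern(obj)
    bucket = @buckets[obj.hash]
    bucket.select!(&:weakref_alive?) # drop entries whose target was collected
    bucket.each do |ref|
      begin
        candidate = ref.__getobj__
        return candidate if candidate.eql?(obj)
      rescue WeakRef::RefError
        next # collected between the liveness check and the lookup
      end
    end
    bucket << WeakRef.new(obj)
    obj
  end
end

Point = Struct.new(:x, :y) # Struct provides value-based eql? and hash

registry = EqualityWeakRegistry.new
a = registry.intern(Point.new(1, 2))
b = registry.intern(Point.new(1, 2))
p a.equal?(b) # true: the second call returns the first instance
```

Both costs mentioned earlier show up here: every registered object carries its own `WeakRef`, and the lookup has to check liveness and rescue `WeakRef::RefError`.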