From: "Dan0042 (Daniel DeLorme) via ruby-core" Date: 2025-05-20T20:11:37+00:00 Subject: [ruby-core:122210] [Ruby Feature#21353] Add shape_id to RBasic under 32 bit Issue #21353 has been updated by Dan0042 (Daniel DeLorme). In general this sounds like a good idea, but I think it would be better to have a struct that works the same way for both 32bit and 64bit systems, and also avoids reserving an entire 32 bits, which is overkill for shape ids. What about this? struct RBasic { uint32 flags; uint32 flags2; const VALUE klass; } #define shape_id(ptr) (ptr->flags2) Not relying on `#if RBASIC_SHAPE_ID_FIELD` would imho reduce overall complexity and simplify maintenance. Future-proofing: if we run out of bits in `flags`, we can redefine the `shape_id` macro and use some bits from `flags2` ---------------------------------------- Feature #21353: Add shape_id to RBasic under 32 bit https://bugs.ruby-lang.org/issues/21353#change-113358 * Author: byroot (Jean Boussier) * Status: Open ---------------------------------------- Currently on 64bit systems, for every types, the `shape_id` is stored inside the `RBasic.flags` field, and is 32bit long. However, on 32bit systems like i686 and WASM, it is much more complicated. For `T_OBJECT`, `T_CLASS` and `T_MODULE`, the `shape_id` is stored as part of "user flags" in `FL_USER4-19`, and for all other types it's stored alongside the instance variable in the `generic_fields_tbl`, which means a hash lookup is required to access it. This situation makes a lot of routine noticeably more complicated, with numerous codepath taken only by 32bit systems, because to avoid doing two hash-lookup per ivar access, the code need a lot of contortions. You can look for `SHAPE_IN_BASIC_FLAGS` to have an idea of the added complexity. In addition, it forces us to duplicate some bits of information. For instance `RUBY_FL_FREEZE` is redundant with the `shape_id`. The shape already record that the object is frozen, and the only reason `RUBY_FL_FREEZE` hasn't been eliminated is because on 32bits the `shape_id` isn't store inline for some objects. Similarly, `RUBY_FL_EXIVAR` is redundant with the `shape_id`, because to know whether an object has ivars, you can simply check if `shape_id == 0`. Reclaiming these bits would be very useful for Ractors, as we'd need two bits in objects to be able to implement [lightwieght locks](https://webkit.org/blog/6161/locking-in-webkit/). Yet another complication, is that on 32bit systems, the `shape_id` is only 16bits long. I have the project to use the upper bits of the `shape_id` to store metadata, such as the `frozen` and `too_complex` status, allowing to test for this without chasing a pointer: https://github.com/ruby/ruby/pull/13289. But this currently can't be done on 32bit sytems, both because accessing the `shape_id` might require a hash-lookup, and also because it's only 16bits long, so every single bit used for tagging severely restrict the maximum number of shapes. ### Proposal To simplify all this, we propose that on 32bit systems, we add a `VALUE shape_id` in `RBasic`: ```c struct RBasic { VALUE flags; const VALUE klass; #if RBASIC_SHAPE_ID_FIELD VALUE shape_id; #endif } ``` This ensure that on 32bits, all objects have their `shape_id` always at the same predictable offset, and 32bits long. As you can see on the pull request, it simplify the code quite significantly: https://github.com/ruby/ruby/pull/13341, and there's more cleanup that can be done. The downside obviously is that on 32bit, objects would grow from `20B` to `24B`. ### Pull Request You can find the proposed patch at: https://github.com/ruby/ruby/pull/13341 cc @tenderlovemaking and @jhawthorn Also FYI @katei because as the maintainer of WASM I assume this impact you the most. -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/