From: "Dan0042 (Daniel DeLorme) via ruby-core"
Date: 2025-05-22T03:20:25+00:00
Subject: [ruby-core:122224] [Ruby Feature#21353] Add shape_id to RBasic under 32 bit

Issue #21353 has been updated by Dan0042 (Daniel DeLorme).

jhawthorn (John Hawthorn) wrote in #note-2:

> it is hard for us to follow that path when shapes are so different under 32-bit

So we're in agreement here, and in fact I think shapes should be **exactly the same** under both 32bit and 64bit.

> Declaring RBasic that way is an interesting idea, but I think creates a lot more complexity. A lot of things currently expect two `VALUE` sized objects at the start of each object to be `flags` and `klass` so that would be quite a large change. As well we'd need to redefine what flags are everywhere they're used to be a uint64_t rather than a VALUE (or make it a uint32_t?? either way a large change). So I think the approach we took with our PR is the least invasive way to get there, there aren't that many references to `RBASIC_SHAPE_ID_FIELD` and to me they all read as being in a place they make sense.

I can only see it as simplification, certainly not complexity. And I don't think it's nearly as big a change as you imagine. On 64 bit the memory layout doesn't even change. On 32 bit I can't think of anything that requires the `klass` to be at exactly 4 bytes from the struct start. On 64 bit `flags` is defined to be a 64-bit VALUE, but only the first 32 bits are ever used (apart from the shapes stuff), so changing the type to `uint32_t` would not change any behavior, and would reflect the actual number of flag bits; imho using 64 bits for the flags was kind of a hack in the first place, and this is the opportunity for a long-overdue cleanup.

I actually tried changing Ruby 3.1 (before the introduction of shapes) to use 2 `uint32_t` fields as above, and everything compiles and works! Much less of a large change than you'd think.

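
For readers following the layout argument, here is a minimal sketch of what a header with two `uint32_t` fields could look like. It is an illustration only: the struct names `RBasicSketch` and `RBasicTwoWords`, the field order, and the `VALUE` typedef are assumptions made for this example, not the declaration from Ruby or from either pull request. The compile-time checks only show that `klass` lands at byte offset 8 on both 32-bit and 64-bit targets and that the 64-bit size matches a two-word header. Whether the 64-bit bit layout is truly unchanged also depends on byte order (this sketch assumes little-endian, with the shape ID in the upper half of `flags`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Ruby's VALUE: an unsigned integer as wide as a pointer. */
typedef uintptr_t VALUE;

/* Hypothetical header: two 32-bit fields followed by the class pointer. */
struct RBasicSketch {
    uint32_t flags;     /* the flag bits */
    uint32_t shape_id;  /* 32-bit shape ID on every platform */
    VALUE klass;
};

/* Two-word header, for size comparison on 64-bit targets. */
struct RBasicTwoWords {
    VALUE flags;
    VALUE klass;
};

/* These offsets hold on both 32-bit and 64-bit targets. */
static_assert(offsetof(struct RBasicSketch, shape_id) == 4, "shape_id follows flags");
static_assert(offsetof(struct RBasicSketch, klass) == 8, "klass starts at byte 8");
static_assert(sizeof(struct RBasicSketch) == 8 + sizeof(VALUE), "no padding expected");

/* On 64-bit targets the sketch is the same size as the two-word header. */
static_assert(sizeof(VALUE) != 8 ||
              sizeof(struct RBasicSketch) == sizeof(struct RBasicTwoWords),
              "64-bit size unchanged");

/* Helper so the offset can also be inspected at run time. */
size_t rbasic_sketch_klass_offset(void) {
    return offsetof(struct RBasicSketch, klass);
}
```

On a 32-bit target this sketch moves `klass` from offset 4 to offset 8, which is the one layout change the message above argues nothing depends on.
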
----------------------------------------
Feature #21353: Add shape_id to RBasic under 32 bit
https://bugs.ruby-lang.org/issues/21353#change-113371

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
Currently on 64-bit systems, for every type, the `shape_id` is stored inside the `RBasic.flags` field, and is 32 bits long.

However, on 32-bit systems like i686 and WASM, it is much more complicated. For `T_OBJECT`, `T_CLASS` and `T_MODULE`, the `shape_id` is stored as part of the "user flags" in `FL_USER4-19`, and for all other types it's stored alongside the instance variables in the `generic_fields_tbl`, which means a hash lookup is required to access it.

This situation makes a lot of routines noticeably more complicated, with numerous code paths taken only by 32-bit systems, because the code needs a lot of contortions to avoid doing two hash lookups per ivar access. You can search for `SHAPE_IN_BASIC_FLAGS` to get an idea of the added complexity.

In addition, it forces us to duplicate some bits of information. For instance, `RUBY_FL_FREEZE` is redundant with the `shape_id`. The shape already records that the object is frozen, and the only reason `RUBY_FL_FREEZE` hasn't been eliminated is that on 32-bit systems the `shape_id` isn't stored inline for some objects. Similarly, `RUBY_FL_EXIVAR` is redundant with the `shape_id`, because to know whether an object has ivars, you can simply check whether `shape_id == 0` (in which case it has none).

Reclaiming these bits would be very useful for Ractors, as we'd need two bits in objects to be able to implement [lightweight locks](https://webkit.org/blog/6161/locking-in-webkit/).

Yet another complication is that on 32-bit systems, the `shape_id` is only 16 bits long. I plan to use the upper bits of the `shape_id` to store metadata, such as the `frozen` and `too_complex` status, which allows testing for these without chasing a pointer: https://github.com/ruby/ruby/pull/13289.
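
To make the tagging idea concrete, here is a minimal sketch of storing status bits in the upper bits of a 32-bit shape ID. Every name and bit position here (`SKETCH_FL_FROZEN`, `SKETCH_FL_TOO_COMPLEX`, `sketch_index`, `sketch_max_shapes`) is hypothetical and chosen for illustration; the actual encoding is whatever the linked pull request defines. The last helper shows why tag bits are expensive when the ID is only 16 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t shape_id_t;

/* Hypothetical tag bits, placed in the two highest bits of the ID. */
#define SKETCH_FL_FROZEN      ((shape_id_t)1 << 31)
#define SKETCH_FL_TOO_COMPLEX ((shape_id_t)1 << 30)
#define SKETCH_FLAG_MASK      (SKETCH_FL_FROZEN | SKETCH_FL_TOO_COMPLEX)
#define SKETCH_INDEX_MASK     ((shape_id_t)~SKETCH_FLAG_MASK)

/* Status checks are plain bit tests on the ID: no pointer is followed. */
static inline bool sketch_frozen_p(shape_id_t id) {
    return (id & SKETCH_FL_FROZEN) != 0;
}

static inline bool sketch_too_complex_p(shape_id_t id) {
    return (id & SKETCH_FL_TOO_COMPLEX) != 0;
}

/* The remaining low bits identify the shape itself. */
static inline shape_id_t sketch_index(shape_id_t id) {
    return id & SKETCH_INDEX_MASK;
}

/* Number of distinct shapes left once tag_bits of an id_bits-wide ID are
 * reserved for tags (requires tag_bits <= id_bits <= 32). */
static inline uint64_t sketch_max_shapes(unsigned id_bits, unsigned tag_bits) {
    return (uint64_t)1 << (id_bits - tag_bits);
}
```

With a 16-bit ID, reserving two tag bits leaves `sketch_max_shapes(16, 2)` = 16384 shapes, whereas a 32-bit ID with the same two tags still leaves over a billion.
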
But this currently can't be done on 32-bit systems, both because accessing the `shape_id` might require a hash lookup, and because it's only 16 bits long, so every single bit used for tagging severely restricts the maximum number of shapes.

### Proposal

To simplify all this, we propose that on 32-bit systems, we add a `VALUE shape_id` in `RBasic`:

```c
struct RBasic {
    VALUE flags;
    const VALUE klass;

#if RBASIC_SHAPE_ID_FIELD
    VALUE shape_id;
#endif
};
```

This ensures that on 32-bit systems, all objects have their `shape_id` at the same predictable offset, and that it is 32 bits long.

As you can see in the pull request, it simplifies the code quite significantly: https://github.com/ruby/ruby/pull/13341, and there's more cleanup that can be done.

The downside, obviously, is that on 32-bit systems, objects would grow from `20B` to `24B`.

### Pull Request

You can find the proposed patch at: https://github.com/ruby/ruby/pull/13341

cc @tenderlovemaking and @jhawthorn

Also FYI @katei because, as the maintainer of WASM, I assume this impacts you the most.

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/