From: "jhawthorn (John Hawthorn)" <noreply@...>
Date: 2022-07-28T00:15:55+00:00
Subject: [ruby-core:109345] [Ruby master Feature#18943] New constant caching instruction: opt_getconstant_path

Issue #18943 has been reported by jhawthorn (John Hawthorn).

----------------------------------------
Feature #18943: New constant caching instruction: opt_getconstant_path
https://bugs.ruby-lang.org/issues/18943

* Author: jhawthorn (John Hawthorn)
* Status: Open
* Priority: Normal
----------------------------------------
I'd like to propose a change to the bytecode used for constant caching.
I've submitted this improvement as a pull request at https://github.com/ruby/ruby/pull/6187 and have also attached a patch to this issue.

Previously, YARV bytecode implemented constant caching with a pair of instructions, `opt_getinlinecache` and `opt_setinlinecache`, wrapping a series of `getconstant` instructions (with `putobject` providing supporting arguments).

```
# old
$ ruby --dump=insns -e 'Foo::Bar::Baz'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 opt_getinlinecache                     17, <is:0>                (   1)[Li]
0003 putobject                              true
0005 getconstant                            :Foo
0007 putobject                              false
0009 getconstant                            :Bar
0011 putobject                              false
0013 getconstant                            :Baz
0015 opt_setinlinecache                     <is:0>
0017 leave
```
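
A listing like the one above can also be produced from inside a running interpreter. The sketch below uses `RubyVM::InstructionSequence` (MRI only) and keeps just the instruction names related to constant lookup. `constant_insns` is a hypothetical helper name, and the result depends on the Ruby version that runs it.

```ruby
# Return the names of constant-related instructions compiled for `source`.
# Assumes MRI: RubyVM is not available on other Ruby implementations.
def constant_insns(source)
  RubyVM::InstructionSequence.compile(source).disasm
    .lines
    .map { |line| line[/\A\d{4} (\w+)/, 1] } # instruction name after the offset
    .compact
    .select { |name| name.match?(/const|inlinecache/) }
end

insns = constant_insns("Foo::Bar::Baz")
puts insns
```

On an interpreter that predates this change the list should contain the `opt_getinlinecache` / `getconstant` / `opt_setinlinecache` sequence shown above; on one that includes it, a single `opt_getconstant_path` is expected instead.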

This commit replaces that pattern with a single new instruction, `opt_getconstant_path`, which handles both reading and populating the inline cache and fetches the constant on a cache miss.

This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. `idNULL` is used to signal an absolute constant reference.

```
# new
$ ./miniruby --dump=insns -e '::Foo::Bar::Baz'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 opt_getconstant_path                   <ic:0 ::Foo::Bar::Baz>      (   1)[Li]
0002 leave
```
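
As a rough Ruby-level model of this scheme (not the C implementation), the cache can be pictured as an object that carries its own path and is filled on the first miss. Everything named here is hypothetical: `ModelIC`, `ABSOLUTE` (a stand-in for `idNULL`, assumed to sit at the front of the path), `resolve_path`, and `fetch_constant_path`. Lexical scoping and cache invalidation are deliberately left out.

```ruby
# Hypothetical stand-in for idNULL: marks a path that starts at the top level.
ABSOLUTE = :"<absolute>"

# Toy inline cache: the path it caches, the cached value, and a "filled" flag.
ModelIC = Struct.new(:segments, :value, :filled)

# Walk the path one segment at a time, as the slow path would on a cache miss.
# Simplification: lookup always starts from Object (no lexical scope).
def resolve_path(segments)
  segments.inject(Object) do |scope, name|
    name == ABSOLUTE ? Object : scope.const_get(name)
  end
end

# One call covers what the instruction pair used to do: hit, or miss-and-fill.
def fetch_constant_path(ic)
  return ic.value if ic.filled

  ic.value = resolve_path(ic.segments)
  ic.filled = true
  ic.value
end

module Foo; module Bar; Baz = :baz; end; end

ic = ModelIC.new([ABSOLUTE, :Foo, :Bar, :Baz], nil, false)
first  = fetch_constant_path(ic) # miss: walks Foo, Bar, Baz and fills the cache
second = fetch_constant_path(ic) # hit: returns the cached value without a walk
```

The point of the model is that the path lives in the cache itself, so nothing outside the cache needs to inspect surrounding instructions to learn which constant is being fetched.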

The motivation for this is that we had increasingly found the need to disassemble the instructions between the `opt_getinlinecache` and `opt_setinlinecache` in order to determine the constant we are fetching, or otherwise store metadata.

This disassembly was previously done:
* In `opt_setinlinecache`, to register the `IC` against the constant names it is using for granular invalidation.
* In `rb_iseq_free`, to unregister the IC from the invalidation table.
* In YJIT, to find the position of an `opt_getinlinecache` instruction so that it can be invalidated when the cache is populated.
* In YJIT, to register the constant names being used for invalidation.

With this change we no longer need disassembly for any of these (in fact `rb_iseq_each` is now unused and is removed in the PR), because the list of constant names being referenced is held in the `IC`. This should also make further optimizations possible in the future.
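
To illustrate why holding the names in the `IC` removes the need for disassembly, here is a toy Ruby model in which registering and unregistering are plain iterations over the cache's own segment list. `TrackedIC` and `InvalidationTable` are hypothetical names for this sketch and do not correspond to the VM's actual data structures.

```ruby
require "set"

# Toy inline cache that carries the constant names it depends on.
class TrackedIC
  attr_reader :segments
  attr_accessor :value, :filled

  def initialize(segments)
    @segments = segments
    @value = nil
    @filled = false
  end
end

# Hypothetical invalidation table: constant name => set of ICs mentioning it.
class InvalidationTable
  def initialize
    @ics_by_name = {}
  end

  # Register the IC under every name in its path.
  def register(ic)
    ic.segments.each { |name| (@ics_by_name[name] ||= Set.new) << ic }
  end

  # Remove the IC again, dropping names that no longer have any users.
  def unregister(ic)
    ic.segments.each do |name|
      next unless (ics = @ics_by_name[name])

      ics.delete(ic)
      @ics_by_name.delete(name) if ics.empty?
    end
  end

  # Empty every cache whose path mentions the given name.
  def invalidate(name)
    @ics_by_name.fetch(name, []).each do |ic|
      ic.value = nil
      ic.filled = false
    end
  end

  def names
    @ics_by_name.keys
  end
end

table = InvalidationTable.new
ic = TrackedIC.new([:Foo, :Bar, :Baz])
table.register(ic)
ic.value = :cached
ic.filled = true
table.invalidate(:Bar) # redefining any segment empties the cache
table.unregister(ic)   # e.g. when the owning iseq is freed
```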

This may also reduce the size of iseqs: previously each constant segment required 32 bytes of bytecode (assuming a 64-bit platform), whereas this implementation stores only one 8-byte `ID` per segment.
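
The arithmetic can be checked against the listings above. The sketch below counts instruction slots from the disassembly offsets (3 for `opt_getinlinecache`, 2 each for `putobject`, `getconstant`, `opt_setinlinecache`, and `opt_getconstant_path`) and assumes 8 bytes per slot and per `ID`, plus one terminating entry in the segment array. The helper names are hypothetical, and the size of the `IC` structure itself is not counted.

```ruby
SLOT_BYTES = 8 # assumption: 64-bit platform, one machine word per slot or ID

# Old scheme: get/set wrapper plus a putobject/getconstant pair per segment.
def old_scheme_bytes(segment_count)
  wrapper_slots = 3 + 2     # opt_getinlinecache + opt_setinlinecache
  per_segment_slots = 2 + 2 # putobject + getconstant, one operand each
  (wrapper_slots + per_segment_slots * segment_count) * SLOT_BYTES
end

# New scheme: one instruction plus a null-terminated array of IDs.
def new_scheme_bytes(segment_count)
  insn_slots = 2               # opt_getconstant_path + its IC operand
  id_count = segment_count + 1 # one ID per segment + terminator
  (insn_slots + id_count) * SLOT_BYTES
end

old_size = old_scheme_bytes(3) # Foo::Bar::Baz
new_size = new_scheme_bytes(3)
puts "old: #{old_size} bytes, new: #{new_size} bytes"
```

For the three-segment path `Foo::Bar::Baz` this gives 136 bytes under the old scheme (matching the `leave` at offset 17 in the old listing) against 48 bytes under the new one, under the assumptions stated.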

There should be no significant performance difference between this and the previous implementation. Previously `opt_getinlinecache` was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now `opt_getconstant_path` is a non-leaf instruction (it may raise, autoload, or call `const_missing`), but it does not jump. These seem to even out. This change also removes a field from the IC structure that was needed by YJIT but adds the `ID *segments` field, so the size of the structure remains the same.

---Files--------------------------------
0001-New-constant-caching-insn-opt_getconstant_path.patch (61.3 KB)

