[#117392] [Ruby master Feature#20405] Inline comments — "nobu (Nobuyoshi Nakada) via ruby-core" <ruby-core@...>

Issue #20405 has been reported by nobu (Nobuyoshi Nakada).

11 messages 2024/04/01

[#117434] [Ruby master Bug#20409] Missing reporting some invalid breaks — "kddnewton (Kevin Newton) via ruby-core" <ruby-core@...>

Issue #20409 has been reported by kddnewton (Kevin Newton).

8 messages 2024/04/03

[#117458] [Ruby master Bug#20414] `Fiber#raise` should recurse to `resumed_fiber` rather than failing. — "ioquatix (Samuel Williams) via ruby-core" <ruby-core@...>

Issue #20414 has been reported by ioquatix (Samuel Williams).

10 messages 2024/04/07

[#117469] [Ruby master Feature#20415] Precompute literal String hash code during compilation — "byroot (Jean Boussier) via ruby-core" <ruby-core@...>

Issue #20415 has been reported by byroot (Jean Boussier).

10 messages 2024/04/09

[#117494] [Ruby master Bug#20421] String#index and String#byteindex don't clear `$~` when offset > size (or bytesize) — "andrykonchin (Andrew Konchin) via ruby-core" <ruby-core@...>

Issue #20421 has been reported by andrykonchin (Andrew Konchin).

7 messages 2024/04/11

[#117498] [Ruby master Feature#20425] Optimize forwarding callers and callees — "tenderlovemaking (Aaron Patterson) via ruby-core" <ruby-core@...>

Issue #20425 has been reported by tenderlovemaking (Aaron Patterson).

14 messages 2024/04/11

[#117531] [Ruby master Bug#20431] Ruby 3.3.0 build fail with make: *** [io_buffer.o] Error 1 — "shubham_yadav (Shubham Yadav) via ruby-core" <ruby-core@...>

Issue #20431 has been reported by shubham_yadav (Shubham Yadav).

11 messages 2024/04/16

[#117564] [Ruby master Bug#20433] Hash.inspect for some hash returns syntax invalid representation — "tompng (tomoya ishida) via ruby-core" <ruby-core@...>

Issue #20433 has been reported by tompng (tomoya ishida).

15 messages 2024/04/17

[#117572] [Ruby master Misc#20435] DevMeeting-2024-06-13 — "mame (Yusuke Endoh) via ruby-core" <ruby-core@...>

Issue #20435 has been reported by mame (Yusuke Endoh).

12 messages 2024/04/17

[#117588] [Ruby master Misc#20436] DevMeeting at RubyKaigi 2024 — "ko1 (Koichi Sasada) via ruby-core" <ruby-core@...>

Issue #20436 has been reported by ko1 (Koichi Sasada).

14 messages 2024/04/18

[#117624] [Ruby master Bug#20440] `super` from child class passing keyword arg as Hash if in a method with passthrough args called from base class — "ozydingo (Andrew Schwartz) via ruby-core" <ruby-core@...>

Issue #20440 has been reported by ozydingo (Andrew Schwartz).

7 messages 2024/04/20

[#117644] [Ruby master Feature#20443] Allow Major GC's to be disabled — "eightbitraptor (Matthew Valentine-House) via ruby-core" <ruby-core@...>

Issue #20443 has been reported by eightbitraptor (Matthew Valentine-House).

25 messages 2024/04/22

[#117646] [Ruby master Bug#20444] Kernel#loop: returning the "result" value of StopIteration doesn't work when raised directly — "esad (Esad Hajdarevic) via ruby-core" <ruby-core@...>

Issue #20444 has been reported by esad (Esad Hajdarevic).

9 messages 2024/04/22

[#117653] [Ruby master Bug#20446] OUtdated https://cache.ruby-lang.org/pub/ruby/index.txt — "vo.x (Vit Ondruch) via ruby-core" <ruby-core@...>

Issue #20446 has been reported by vo.x (Vit Ondruch).

7 messages 2024/04/23

[#117657] [Ruby master Bug#20447] Ruby 3.3.1 broken on i686 — "vo.x (Vit Ondruch) via ruby-core" <ruby-core@...>

Issue #20447 has been reported by vo.x (Vit Ondruch).

15 messages 2024/04/23

[#117658] [Ruby master Feature#20448] Make coverage event hooking C API public — "ms-tob (Matt S) via ruby-core" <ruby-core@...>

Issue #20448 has been reported by ms-tob (Matt S).

9 messages 2024/04/23

[#117674] [Ruby master Bug#20450] Ruby 3.1.1 broken with bootsnap — "philippe.bs.noel@... (Philippe Noel) via ruby-core" <ruby-core@...>

Issue #20450 has been reported by philippe.bs.noel@tutanota.com (Philippe Noel).

11 messages 2024/04/24

[#117684] [Ruby master Bug#20452] Ruby 3.3 on Alpine Linux results in a relatively shallow SystemStackError exception — "Earlopain (A S) via ruby-core" <ruby-core@...>

Issue #20452 has been reported by Earlopain (A S).

12 messages 2024/04/24

[#117711] [Ruby master Bug#20456] Hash can get stuck marked as iterating through process forking — "blowfishpro (Talia Wong) via ruby-core" <ruby-core@...>

Issue #20456 has been reported by blowfishpro (Talia Wong).

7 messages 2024/04/25

[ruby-core:117469] [Ruby master Feature#20415] Precompute literal String hash code during compilation

From: "byroot (Jean Boussier) via ruby-core" <ruby-core@...>
Date: 2024-04-09 07:43:43 UTC
List: ruby-core #117469
Issue #20415 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #20415: Precompute literal String hash code during compilation
https://bugs.ruby-lang.org/issues/20415

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
I worked on a proof of concept with @etienne which I think has some potential, but I'm looking for feedback on what would be the best implementation.


The proof of concept is here: https://github.com/Shopify/ruby/pull/553

### Idea

Most string literals are relatively short, hence embedded, and have some wasted bytes at the end of their slot. We could use that wasted space to store the string hash.

The goal is to make **looking up a literal String key in a hash as fast as a Symbol key**. The goal isn't to memoize the hash code of all strings, but to **only selectively precompute the hash code of literal strings in the compiler**. The compiler could even do this only when the literal string is used to look up a hash (`opt_aref`).

Here's the benchmark we used:

```ruby
require "benchmark/ips" # from the benchmark-ips gem

# Start with a few unrelated integer keys.
hash = 10.times.to_h do |i|
  [i, i]
end

dyn_sym = "dynamic_symbol".to_sym
hash[:some_symbol] = 1
hash[dyn_sym] = 1
hash["small"] = 2
hash["frozen_string_literal"] = 2

Benchmark.ips do |x|
  x.report("symbol") { hash[:some_symbol] }
  # Look up the dynamically created symbol, not the literal one.
  x.report("dyn_symbol") { hash[dyn_sym] }
  x.report("small_lit") { hash["small"] }
  x.report("frozen_lit") { hash["frozen_string_literal"] }
  x.compare!(order: :baseline)
end
```

On Ruby 3.3.0, looking up a String key is a bit slower than looking up a Symbol key, and more so as the key gets longer:

```
Calculating -------------------------------------
              symbol     24.175M (± 1.7%) i/s -    122.002M in   5.048306s
          dyn_symbol     24.345M (± 1.6%) i/s -    122.019M in   5.013400s
           small_lit     21.252M (± 2.1%) i/s -    107.744M in   5.072042s
          frozen_lit     20.095M (± 1.3%) i/s -    100.489M in   5.001681s

Comparison:
              symbol: 24174848.1 i/s
          dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error
           small_lit: 21252403.2 i/s - 1.14x  slower
          frozen_lit: 20094766.0 i/s - 1.20x  slower
```

With the proof of concept, performance is pretty much identical:

```
Calculating -------------------------------------
              symbol     23.528M (± 6.9%) i/s -    117.584M in   5.033231s
          dyn_symbol     23.777M (± 4.7%) i/s -    120.231M in   5.071734s
           small_lit     23.066M (± 2.9%) i/s -    115.376M in   5.006947s
          frozen_lit     22.729M (± 1.1%) i/s -    115.693M in   5.090700s

Comparison:
              symbol: 23527823.6 i/s
          dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error
           small_lit: 23065535.3 i/s - same-ish: difference falls within error
          frozen_lit: 22729351.6 i/s - same-ish: difference falls within error
```

### Possible implementation

The reason I'm opening this issue early is to get feedback on which would be the best implementation.

#### Store hashcode after the string terminator

Right now the proof of concept simply stores the `st_index_t` after the string null terminator, and only when the string is embedded and has enough leftover space.
Strings with a precomputed hash are marked with a user flag.

Pros:

  - Very simple implementation, no need to change a lot of code, and very easy to strip out if we want to.
  - Doesn't use any extra memory. If the string doesn't have enough leftover bytes, the optimization simply isn't applied.
  - The worst case overhead is a single `FL_TEST_RAW` in `rb_str_hash`.

Cons:

  - The optimization won't apply to certain string sizes, e.g. strings between `17` and `23` bytes won't have a precomputed hash code.
  - Extracting the hash code requires some not so nice pointer arithmetic.
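
To make the trade-off concrete, here is a minimal, self-contained sketch of the "hash after the terminator" idea. It is not the code from the proof of concept: `embedded_str`, `SLOT_CAPACITY`, `hash_code_t` and the FNV-1a `placeholder_hash` are hypothetical stand-ins for `RString`, the real slot sizes, `st_index_t` and Ruby's string hash, and the `has_hash` field plays the role of the user flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: real RString slots and st_index_t differ. */
typedef uint64_t hash_code_t;
enum { SLOT_CAPACITY = 32 }; /* bytes available for embedded content */

typedef struct {
    bool has_hash;           /* plays the role of the user flag */
    size_t len;              /* string length, terminator excluded */
    char buf[SLOT_CAPACITY]; /* chars, NUL, then maybe the hash */
} embedded_str;

/* FNV-1a, used only as a placeholder for the real string hash. */
static hash_code_t placeholder_hash(const char *s, size_t len)
{
    hash_code_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* "Compile time": copy the literal and, if the leftover bytes allow it,
 * store the hash right after the NUL terminator. */
static void embedded_str_init(embedded_str *s, const char *src)
{
    size_t len = strlen(src);
    assert(len < SLOT_CAPACITY); /* must be embeddable at all */
    s->len = len;
    memcpy(s->buf, src, len + 1);
    s->has_hash = false;
    if (len + 1 + sizeof(hash_code_t) <= SLOT_CAPACITY) {
        hash_code_t h = placeholder_hash(src, len);
        /* memcpy because this offset is not necessarily aligned */
        memcpy(s->buf + len + 1, &h, sizeof h);
        s->has_hash = true;
    }
}

/* "Run time": one flag check, then either read the stored hash back
 * or fall back to computing it. */
static hash_code_t embedded_str_hash(const embedded_str *s)
{
    if (s->has_hash) {
        hash_code_t h;
        memcpy(&h, s->buf + s->len + 1, sizeof h);
        return h;
    }
    return placeholder_hash(s->buf, s->len);
}
```

With these toy numbers, a literal such as `"small"` gets a stored hash, while a 25-byte literal still fits in the slot but leaves no room for the hash, so it falls back to the slow path. That is the same kind of size gap as the first con above, and the `buf + len + 1` offset is the pointer arithmetic mentioned in the second.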


#### Create another RString union

Another possibility would be to add another entry in the `RString` struct union, such as we'd have:

```c
struct RString {
    struct RBasic basic;
    long len;
    union {
        // ... existing members
        struct {
            st_index_t hash;
            char ary[1];
        } embded_literal;
    } as;
};
```

Pros:

  - The optimization can now be applied to all string sizes.
  - The hashcode is always at the same offset and properly aligned.

Cons:

  - Some strings would be bumped by one slot size, so would use marginally more memory.
  - Adds complexity to the code base: a lot more string-related code needs to be modified (e.g. `RSTRING_PTR` and many others).
  - When compiling such a string, if an equal string already exists in the `fstring` table, we'd need to replace it; we can't just mutate it in place to add the hashcode.
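
The layout consequences of the union variant can be checked with `offsetof`. The sketch below is an assumption-laden toy, not the real header: `toy_rstring` stands in for `RString` (two 64-bit words in place of `struct RBasic`), `hash_code_t` for `st_index_t`, and `SLOT_SIZE = 40` is just an illustrative slot size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for st_index_t and a GC slot size. */
typedef uint64_t hash_code_t;
enum { SLOT_SIZE = 40 };

typedef struct {
    uint64_t flags; /* the two words stand in for struct RBasic */
    uint64_t klass;
    long len;
    union {
        struct {
            char ary[1];
        } embed;            /* plain embedded string */
        struct {
            hash_code_t hash;
            char ary[1];
        } embedded_literal; /* embedded literal with a hash up front */
    } as;
} toy_rstring;

/* Bytes left for characters plus the NUL terminator in one slot. */
static size_t plain_capacity(void)
{
    return SLOT_SIZE - offsetof(toy_rstring, as.embed.ary);
}

static size_t literal_capacity(void)
{
    return SLOT_SIZE - offsetof(toy_rstring, as.embedded_literal.ary);
}

/* True when a string of `len` bytes fits the slot as a plain embedded
 * string but no longer fits once the hash is placed in front of it. */
static int needs_bigger_slot(size_t len)
{
    return len + 1 <= plain_capacity() && len + 1 > literal_capacity();
}

/* Sanity check: the hash sits at a fixed, naturally aligned offset. */
static void check_layout(void)
{
    assert(offsetof(toy_rstring, as.embedded_literal.hash)
           % sizeof(hash_code_t) == 0);
}
```

In this toy layout the hash always lives at the same aligned offset, which is the second pro, and the character capacity shrinks by exactly `sizeof(hash_code_t)`, so some lengths that used to fit in the slot no longer do, which is the first con.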


### Prior art

[Feature #15331] is somewhat similar in its idea, but it does it lazily for all strings. Here it's much simpler because it's limited to string literals, which are the ones likely to be used as Hash keys, and the overhead is on compilation, not runtime (aside from a single flag check). So I think most of the caveats of that original implementation don't apply here.




--
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
