[ruby-core:70401] [Ruby trunk - Bug #11396] Bad performance in ruby >= 2.2 for Hash with many symbol keys

From: nagachika00@...
Date: 2015-08-15 18:11:44 UTC
List: ruby-core #70401
Issue #11396 has been updated by Tomoyuki Chikanaga.

Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: REQUIRED to
2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: DONE

----------------------------------------
Bug #11396: Bad performance in ruby >= 2.2 for Hash with many symbol keys
https://bugs.ruby-lang.org/issues/11396#change-53804

* Author: Bruno Escherl
* Status: Closed
* Priority: Normal
* Assignee:
* ruby -v:
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: DONE
----------------------------------------
This started out as an issue on stackoverflow, where I found strange
performance anomalies when comparing Set.include? and Array.include? in
different ruby versions:
http://stackoverflow.com/questions/31631284/performance-anomaly-in-ruby-set-include-with-symbols-2-2-2-vs-2-1-6

In the end it came down to problems with lookup of Hash keys. While for
smaller Hashes the performance issues went away using ruby_2_2 branch, they
staid for bigger Hashes. I'll attach a benchmark script (hash_bench_3.rb) I
used that creates a Hash with 200000 keys and does a lookup of 10000 of them.

Here are my results:

ruby 2.1.6p336 (2015-04-13 revision 50298) [x86_64-darwin14.0]

              string    142.818  (± 2.8%) i/s -    714.000
              symbol    505.831  (± 3.0%) i/s -      2.550k

ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin14]

              string    143.404  (± 3.5%) i/s -    728.000
              symbol     76.945  (± 6.5%) i/s -    385.000

ruby 2.2.3p147 (2015-07-04 revision 51143) [x86_64-darwin14] self-compiled

              string    138.349  (± 2.2%) i/s -    702.000
              symbol     77.495  (± 3.9%) i/s -    392.000

As you can see, 2.2 is much slower than 2.1.6 for symbol keys. I was
recommended to disable Garbage Collection for Symbols for testing and did so
on the ruby_2_2 branch:

ruby 2.2.3p147 (2015-07-04 revision 51143) [x86_64-darwin14] self-compiled, USE_SYMBOL_GC=0

              string    145.179  (± 3.4%) i/s -    728.000
              symbol    602.008  (± 7.6%) i/s -      3.050k

I would have expected that symbol GC may have some performance impact, but
this looks like it is too big. I can't say exactly at which point Garbage
Collection really hurts, but the bigger the Hash and the bigger the number of
include? calls, the slower it gets.

---Files--------------------------------
hash_bench_3.rb (605 Bytes)
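
The attached script is not reproduced in this message. As a rough
illustration only, a comparable benchmark might look like the sketch below.
Everything in it is an assumption: the helper names (build_hash, sample_keys,
lookup_all), the "key_N" naming scheme, the fixed random seed, and the use of
the stdlib Benchmark module. The i/s figures above look like benchmark-ips
output, which this sketch does not use. Keys are created with to_sym, so they
are dynamic symbols, the kind that symbol GC in 2.2 can collect.

```ruby
require 'benchmark'

# Hypothetical helper: build a Hash with `size` keys, either Strings or
# dynamically created Symbols (String#to_sym), each mapped to true.
def build_hash(size, kind)
  size.times.each_with_object({}) do |i, h|
    key = "key_#{i}"
    h[kind == :symbol ? key.to_sym : key] = true
  end
end

# Hypothetical helper: pick `count` existing keys to look up. The seed is
# fixed so that repeated runs probe the same keys.
def sample_keys(hash, count)
  hash.keys.sample(count, random: Random.new(42))
end

# Look up every key and return the number of hits.
def lookup_all(hash, keys)
  keys.count { |k| hash.include?(k) }
end

if __FILE__ == $PROGRAM_NAME
  Benchmark.bm(8) do |x|
    %i[string symbol].each do |kind|
      hash = build_hash(200_000, kind) # 200000 keys, as in the report
      keys = sample_keys(hash, 10_000) # 10000 lookups, as in the report
      x.report(kind.to_s) { 100.times { lookup_all(hash, keys) } }
    end
  end
end
```

Running the same file under each ruby build and comparing the "string" and
"symbol" rows should show whether the symbol-key lookups fall behind on that
build. Absolute times will not match the i/s numbers above.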


-- 
https://bugs.ruby-lang.org/
