From: Yusuke Endoh Date: 2015-12-12T01:41:19+09:00 Subject: [ruby-core:72057] Re: [Ruby trunk - Feature #11405] [Assigned] [PATCH] hash.c: minor speedups to int/fixnum keys > I'm curious, how did you notice this? In short, I was studying hash conflicts for another reason. The detailed context is off topic, but you might be interested. I encountered an interesting behavior of case statement: 100000000.times do case 0 when 0 do_something when 1, 2, 3, ..., 10000 raise end end was always slower than 100000000.times do case 0 when 1, 2, 3, ..., 10000 raise when 0 do_something end end , even though the only difference is the order of when clauses. The cause was a dense cdhash in opt_case_dispatch. Both programs create a cdhash that has 10001 elements, which are slightly less than 10240 (= bin size 2048 * rehashing threshold 5). So, each bin has a very long linked list (average length is 5). The former program inserts 0 to the cdhash at first. Because the final cdhash has been rehashed after 0 was inserted, we cannot expect where the element is placed in a linked list. The latter, on the other hand, inserts 0 at last. In this case we can expect 0 is placed at the head of the linked list. Because of no need to follow the linked list, it is faster to find. Though I'm not sure if this IS a problem, it was anyway fixed by nobu at r53031 by explicitly rehashing a cdhash. But this behavior of st_table is not only for case statement. I could reproduce it by normal Hash: h = {} 10000.times do |n| h[n] = true end 1_000_000_000.times { h[9999] } # 57.2 sec 1_000_000_000.times { h[0] } # 89.1 sec I'm not sure if (and how) we should fix this. Reducing rehash threshold would work, but it will cause frequent rehashing, which may lead to another overhead and memory waste. 2015-12-11 4:23 GMT+09:00 Eric Wong : > mame@ruby-lang.org wrote: >> This caused a lot of trivial hash conflicts of Fixnums that is >= 16384. > > Thanks for noticing, I will investigate. > > I'm curious, how did you notice this? > > I'll probably integrate some well-known hashing tests into the > test suite to prevent this in the future...