[#61171] Re: [ruby-changes:33145] normal:r45224 (trunk): gc.c: fix build for testing w/o RGenGC — SASADA Koichi <ko1@...>
(2014/03/01 16:15), normal wrote:
[#61243] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops — normalperson@...
Issue #9425 has been updated by Eric Wong.
[#61359] [ruby-trunk - Bug #9609] [Open] [PATCH] vm_eval.c: fix misplaced RB_GC_GUARDs — normalperson@...
Issue #9609 has been reported by Eric Wong.
(2014/03/07 19:09), normalperson@yhbt.net wrote:
SASADA Koichi <ko1@atdot.net> wrote:
[#61424] [REJECT?] xmalloc/xfree: reduce atomic ops w/ thread-locals — Eric Wong <normalperson@...>
I'm unsure about this. I _hate_ the extra branches this adds;
Hi Eric,
SASADA Koichi <ko1@atdot.net> wrote:
(2014/03/14 2:12), Eric Wong wrote:
SASADA Koichi <ko1@atdot.net> wrote:
[#61452] [ruby-trunk - Feature #9632] [Open] [PATCH 0/2] speedup IO#close with linked-list from ccan — normalperson@...
Issue #9632 has been reported by Eric Wong.
[#61496] [ruby-trunk - Feature #9638] [Open] [PATCH] limit IDs to 32-bits on 64-bit systems — normalperson@...
Issue #9638 has been reported by Eric Wong.
[#61568] hash function for global method cache — Eric Wong <normalperson@...>
I came upon this because I noticed existing st numtable worked poorly
(2014/03/18 8:03), Eric Wong wrote:
SASADA Koichi <ko1@atdot.net> wrote:
what's the profit from using binary tree in place of hash?
Юрий Соколов <funny.falcon@gmail.com> wrote:
[#61687] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD — normalperson@...
Issue #9606 has been updated by Eric Wong.
[#61760] [ruby-trunk - Feature #9632] [PATCH 0/2] speedup IO#close with linked-list from ccan — normalperson@...
Issue #9632 has been updated by Eric Wong.
[ruby-core:61575] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
Issue #9425 has been updated by Eric Wong.
normalperson@yhbt.net wrote:
> hash_aref_sym 1.000
Lack of improvement here was disappointing since symbol keys are common,
and this showed a regression on my x86 (32-bit) VMs.
I tweaked rb_any_hash to be symbol-aware:
http://bogomips.org/ruby.git/patch?id=497ed6355
12-30% improvement on this test from trunk depending on CPU so far \o/
(Phenom II X4 got 30%, newer/faster x86-64 CPUs show less speedup).
I'm comfortable with improvements of this series on x86 VMs running on
x86-64 (and of course native x86-64).
Can anybody with real 32-bit hardware verify this series? Not sure I
can trust VM results; my remaining x86 hardware is on its last legs
and showing occasional HW errors.
git://80x24.org/ruby.git st-noprime-v4
st: use power-of-two sizes to avoid slow modulo ops
add hash benchmarks
st.c: tweak numhash to match common Ruby use cases
hash.c: improve symbol hash distribution
----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45859
* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee:
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions. Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation. I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways. If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.
This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.
target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"
Benchmarks on a Xeon E3-1230 v3 CPU:
minimum results in each 10 measurements.
Execution time (sec)
name trunk st-noprime
hash_flatten 0.500 0.345
hash_keys 0.191 0.192
hash_shift 0.019 0.018
hash_values 0.201 0.200
loop_whileloop2 0.090 0.090
vm2_bighash* 4.457 3.578
Speedup ratio: compare with the result of `trunk' (greater is better)
name st-noprime
hash_flatten 1.451
hash_keys 0.998
hash_shift 1.046
hash_values 1.003
loop_whileloop2 1.000
vm2_bighash* 1.246
Somewhat less impressive on an AMD FX 8320:
minimum results in each 10 measurements.
Execution time (sec)
name trunk st-noprime
hash_flatten 0.633 0.596
hash_keys 0.236 0.232
hash_shift 0.031 0.032
hash_values 0.234 0.238
loop_whileloop2 0.135 0.135
vm2_bighash* 8.198 6.982
Speedup ratio: compare with the result of `trunk' (greater is better)
name st-noprime
hash_flatten 1.063
hash_keys 1.020
hash_shift 0.976
hash_values 0.982
loop_whileloop2 1.000
vm2_bighash* 1.174
----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:
test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)
are available in the git repository at:
git://80x24.org/ruby.git st-noprime
for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:
st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)
----------------------------------------------------------------
Eric Wong (1):
st: use power-of-two sizes to avoid slow modulo ops
st.c | 100 +++++++++++++++++--------------------------------------------------
1 file changed, 25 insertions(+), 75 deletions(-)
---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)
--
https://bugs.ruby-lang.org/