From: Eric Wong
Date: 2014-03-12T00:06:22+00:00
Subject: [ruby-core:61424] [REJECT?] xmalloc/xfree: reduce atomic ops w/ thread-locals

I'm unsure about this. I _hate_ the extra branches this adds; and
most of our benchmarks don't show an improvement. But this seems
like an obvious experiment, so maybe somebody else would've tried it
if I didn't at least publish it here.

Atomic operations are expensive, so use thread-local counters and
only perform atomic operations when the local counters hit a
predefined limit (currently 16K).

This gives a ~12% speedup to the bm_so_count_words.rb benchmark,
which does many small mallocs. This pattern is common in some Ruby
scripts doing text processing, so maybe it is worth doing.

Unfortunately, this adds more branches, increases code size, and
hurts accuracy of GC accounting in multithreaded programs. Some
benchmarks are slower as a result.

Full benchmark results in the full patch:

  http://bogomips.org/ruby.git/patch?id=8271ec7b977

  git://80x24.org/ruby.git gc-lessatomic
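For readers who don't want to open the patch, here is a minimal sketch of the counting scheme described above. It is not the code from the patch: every name in it (LOCAL_FLUSH_LIMIT, count_malloc, flush_local, global_malloc_total) is hypothetical, it assumes C11 atomics and _Thread_local are available, and it assumes the 16K limit is measured in bytes. Only the malloc side is shown; frees would presumably need a matching counter.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: the "16K" limit is treated here as a byte count. */
#define LOCAL_FLUSH_LIMIT ((size_t)16 * 1024)

/* Shared counter the GC would consult; touched only with atomic ops. */
static atomic_size_t global_malloc_increase;

/* Per-thread counter; plain (non-atomic) arithmetic is enough here. */
static _Thread_local size_t local_malloc_increase;

/* Publish this thread's pending bytes with a single atomic add. */
void flush_local(void)
{
    if (local_malloc_increase != 0) {
        atomic_fetch_add_explicit(&global_malloc_increase,
                                  local_malloc_increase,
                                  memory_order_relaxed);
        local_malloc_increase = 0;
    }
}

/* Called once per allocation: one add plus one extra branch. */
void count_malloc(size_t size)
{
    local_malloc_increase += size;
    if (local_malloc_increase >= LOCAL_FLUSH_LIMIT)
        flush_local();
}

/* What the GC sees; this excludes bytes still pending in any thread. */
size_t global_malloc_total(void)
{
    return atomic_load_explicit(&global_malloc_increase,
                                memory_order_relaxed);
}
```

The trade-off mentioned above shows up directly in this sketch: the global total can lag behind by up to just under the limit per thread until that thread flushes, which is where the loss of GC accounting accuracy in multithreaded programs comes from, and the limit check is the extra branch on every allocation.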