[ruby-core:97240] [Ruby master Feature#16648] improve GC performance by 5% with builtin_prefetch

From: bobbypowers@...
Date: 2020-02-22 18:03:42 UTC
List: ruby-core #97240
Issue #16648 has been reported by bpowers (Bobby Powers).

----------------------------------------
Feature #16648: improve GC performance by 5% with builtin_prefetch
https://bugs.ruby-lang.org/issues/16648

* Author: bpowers (Bobby Powers)
* Status: Open
* Priority: Normal
----------------------------------------
The mark phase of non-incremental major GC is (I believe) dominated by pointer chasing. One way we can improve that is by prefetching cache lines from memory before they are accessed, to reduce stalls. I did some experimenting, and the attached patch reduces the time spent on a full GC from ~950 milliseconds to ~900 milliseconds (about 5%), a small but stable improvement. I would love it if other folks had additional benchmarks (or could point me at them) to see whether this holds up beyond the web service I tested, and whether something like this could be considered for inclusion.
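To illustrate the general idea (this is not the attached patch, and none of the names below come from gc.c), here is a minimal sketch of a mark loop that issues a prefetch for each object as it is pushed onto the mark stack. The `node_t` layout, `mark_from`, `MARK_STACK_CAP`, and the `PREFETCH` macro are all hypothetical. It assumes GCC or Clang, where `__builtin_prefetch` is available, and falls back to a no-op elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hint that addr will be read soon; a no-op on compilers without the builtin. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(addr) __builtin_prefetch((addr), 0, 3)
#else
#define PREFETCH(addr) ((void)(addr))
#endif

enum { MAX_CHILDREN = 4, MARK_STACK_CAP = 1024 };

/* Hypothetical heap object: a mark flag plus outgoing references. */
typedef struct node {
    bool marked;
    size_t n_children;
    struct node *children[MAX_CHILDREN];
} node_t;

typedef struct {
    node_t *items[MARK_STACK_CAP];
    size_t len;
} mark_stack_t;

/* Push an object and start fetching its cache line right away, so the
 * load has time to complete before the object is popped and scanned. */
static void push_with_prefetch(mark_stack_t *s, node_t *n)
{
    assert(s->len < MARK_STACK_CAP); /* fixed capacity: sketch only */
    PREFETCH(n);
    s->items[s->len++] = n;
}

/* Mark everything reachable from root; returns the number of objects marked.
 * The mark flag is only read on pop, never on push, so the push path does
 * not touch the child's memory before the prefetch is issued. */
size_t mark_from(node_t *root)
{
    mark_stack_t stack = { .len = 0 };
    size_t marked = 0;

    if (root == NULL) return 0;
    push_with_prefetch(&stack, root);

    while (stack.len > 0) {
        node_t *n = stack.items[--stack.len];
        if (n->marked) continue; /* reached earlier through another edge */
        n->marked = true;
        marked++;
        for (size_t i = 0; i < n->n_children; i++) {
            if (n->children[i] != NULL) {
                push_with_prefetch(&stack, n->children[i]);
            }
        }
    }
    return marked;
}
```

Whether a prefetch at push time helps depends on how much work happens between the push and the pop. With a plain LIFO stack the most recently pushed child is popped almost immediately, so its prefetch may not have completed yet.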

I also attempted a more "principled" approach based on an optimization described in the GC handbook: putting a FIFO queue in front of the mark stack, and prefetching addresses as they enter the queue. However, I wasn't able to see any performance improvement there despite testing a number of queue sizes from 4 to 64. It's possible I implemented this incorrectly, or misjudged the access patterns (if e.g. the memory of a VALUE is accessed before push_mark_stack is called, it would invalidate this approach). The code for that alternative is here: https://github.com/bpowers/ruby/commit/d790d0c15047c36c23850a112093fe0e32fd3262
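For readers unfamiliar with the FIFO technique, the sketch below shows one possible shape of it. It is not the code in the linked commit: `prefetch_marker_t`, `marker_push`, `marker_pop`, `FIFO_CAP`, and `BACKING_STACK_CAP` are hypothetical names, and refilling the queue to capacity on every pop is a simplification I chose for this example. Addresses are prefetched as they move from the stack into a small ring buffer, and the caller only ever receives the oldest queued entry, which gives each prefetch some time to complete.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(addr) __builtin_prefetch((addr), 0, 3)
#else
#define PREFETCH(addr) ((void)(addr))
#endif

enum { FIFO_CAP = 8, BACKING_STACK_CAP = 1024 };

/* A mark stack with a small FIFO ring buffer in front of it. */
typedef struct {
    void *stack[BACKING_STACK_CAP];
    size_t stack_len;
    void *fifo[FIFO_CAP];
    size_t head;  /* index of the oldest queued entry */
    size_t count; /* number of queued entries */
} prefetch_marker_t;

void marker_init(prefetch_marker_t *m)
{
    m->stack_len = 0;
    m->head = 0;
    m->count = 0;
}

/* Pushing only records the address; the object itself is not touched. */
void marker_push(prefetch_marker_t *m, void *obj)
{
    assert(m->stack_len < BACKING_STACK_CAP); /* fixed capacity: sketch only */
    m->stack[m->stack_len++] = obj;
}

/* Top up the FIFO from the stack, prefetching each address as it enters
 * the queue, then hand back the oldest entry. Returns NULL when both the
 * queue and the stack are empty. */
void *marker_pop(prefetch_marker_t *m)
{
    while (m->count < FIFO_CAP && m->stack_len > 0) {
        void *obj = m->stack[--m->stack_len];
        PREFETCH(obj);
        m->fifo[(m->head + m->count) % FIFO_CAP] = obj;
        m->count++;
    }
    if (m->count == 0) return NULL;

    void *oldest = m->fifo[m->head];
    m->head = (m->head + 1) % FIFO_CAP;
    m->count--;
    return oldest;
}
```

As the paragraph above notes, this only pays off if the object's memory is first read after it leaves the queue. If the marker reads the object before pushing it, the cache miss has already happened and the queue adds overhead without hiding any latency.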

---Files--------------------------------
0001-gc-prefech-objects-seems-to-improve-full-GC-performa.patch (2.29 KB)


