[#108771] [Ruby master Bug#18816] Ractor segfaulting MacOS 12.4 (aarch64 / M1 processor) — "brodock (Gabriel Mazetto)" <noreply@...>

Issue #18816 has been reported by brodock (Gabriel Mazetto).

8 messages 2022/06/05

[#108802] [Ruby master Feature#18821] Expose Pattern Matching interfaces in core classes — "baweaver (Brandon Weaver)" <noreply@...>

Issue #18821 has been reported by baweaver (Brandon Weaver).

9 messages 2022/06/08

[#108822] [Ruby master Feature#18822] Ruby lack a proper method to percent-encode strings for URIs (RFC 3986) — "byroot (Jean Boussier)" <noreply@...>

Issue #18822 has been reported by byroot (Jean Boussier).

18 messages 2022/06/09

[#108937] [Ruby master Bug#18832] Suspicious superclass mismatch — "fxn (Xavier Noria)" <noreply@...>

Issue #18832 has been reported by fxn (Xavier Noria).

16 messages 2022/06/15

[#108976] [Ruby master Misc#18836] DevMeeting-2022-07-21 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18836 has been reported by mame (Yusuke Endoh).

12 messages 2022/06/17

[#109043] [Ruby master Bug#18876] OpenSSL is not available with `--with-openssl-dir` — "Gloomy_meng (Gloomy Meng)" <noreply@...>

Issue #18876 has been reported by Gloomy_meng (Gloomy Meng).

18 messages 2022/06/23

[#109052] [Ruby master Bug#18878] parse.y: Foo::Bar {} is inconsistently rejected — "qnighy (Masaki Hara)" <noreply@...>

Issue #18878 has been reported by qnighy (Masaki Hara).

9 messages 2022/06/26

[#109055] [Ruby master Bug#18881] IO#read_nonblock raises IOError when called following buffered character IO — "javanthropus (Jeremy Bopp)" <noreply@...>

Issue #18881 has been reported by javanthropus (Jeremy Bopp).

9 messages 2022/06/26

[#109063] [Ruby master Bug#18882] File.read cuts off a text file with special characters when reading it on MS Windows — magynhard <noreply@...>

Issue #18882 has been reported by magynhard (Matth辰us Johannes Beyrle).

15 messages 2022/06/27

[#109081] [Ruby master Feature#18885] Long lived fork advisory API (potential Copy on Write optimizations) — "byroot (Jean Boussier)" <noreply@...>

Issue #18885 has been reported by byroot (Jean Boussier).

23 messages 2022/06/28

[#109083] [Ruby master Bug#18886] Struct aref and aset don't trigger any tracepoints. — "ioquatix (Samuel Williams)" <noreply@...>

Issue #18886 has been reported by ioquatix (Samuel Williams).

8 messages 2022/06/29

[#109095] [Ruby master Misc#18888] Migrate ruby-lang.org mail services to Google Domains and Google Workspace — "shugo (Shugo Maeda)" <noreply@...>

Issue #18888 has been reported by shugo (Shugo Maeda).

16 messages 2022/06/30

[ruby-core:108767] [Ruby master Feature#18339] GVL instrumentation API

From: "byroot (Jean Boussier)" <noreply@...>
Date: 2022-06-03 13:14:48 UTC
List: ruby-core #108767
Issue #18339 has been updated by byroot (Jean Boussier).

Status changed from Open to Closed

Merged as https://github.com/ruby/ruby/commit/9125374726fbf68c05ee7585d4a374ffc5efc5db after several rounds of review with @ko1 

----------------------------------------
Feature #18339: GVL instrumentation API
https://bugs.ruby-lang.org/issues/18339#change-97832

* Author: byroot (Jean Boussier)
* Status: Closed
* Priority: Normal
----------------------------------------
# GVL instrumentation API

### Context

One of the most common if not the most common way to deploy Ruby application these days is through threaded runners (typically `puma`, `sidekiq`, etc).

While threaded runners can offer more throughput per RAM than forking runners, they do require to carefully set the concurrency level (number of threads).
If you increase concurrency too much, you'll experience GVL contention and the latency will suffer.

The common way today is to start with a relatively low number of threads, and then increase it until CPU usage reach an acceptably high level (generally `~75%`).
But this method really isn't precise, require to saturate one process with fake workload, and doesn't tell how much threads are waiting on the GVLs, just how much the CPU is used.

Because of this, lots of threaded applications are not that well tuned, even more so because the ideal configuration is very dependant on the workload and can vary over time. So a decent setting might not be so good six months later.

Ideally, application owners should be able to continuously see the impact of the GVL contention on their latency metric, so they can more accurately decide what throughput vs latency tradeoff is best for them and regularly adjust it.

### Existing instrumentation methods

Currently, if you want to measure how much GVL contention is happening, you have to use some lower level tracing tools
such as `bpftrace`, or `dtrace`. These are quite advanced tools and require either `root` access or to compile Ruby with different configuration flags etc.

They're also external, so common Application Performance Monitoring (APM) tools can't really report it.

### Proposal

I'd like to have a C-level hook API around the GVL, with 3 events:

  - `RUBY_INTERNAL_EVENT_THREAD_READY`
  - `RUBY_INTERNAL_EVENT_THREAD_RESUME`
  - `RUBY_INTERNAL_EVENT_THREAD_PAUSE`

Such API would allow to implement C extensions that collect various metrics about the GVL impact, such as median / p90 / p99 wait time, or even per thread total wait time.

Aditionaly it would be very useful if the hook would pass some metadata, most importantly, the number of threads currently waiting.
People interested in a lower overhead monitoring method not calling `clock_gettime` could instrument that number of waiting thread instead. It would be less accurate, but enough to tell wether there might be problem.

With such metrics, application owners would be able to much more precisely tune their concurrency setting, and deliberately chose their own tradeoff between throughput and latency.


### Implementation

I submitted a PR for it https://github.com/ruby/ruby/pull/5500 (lacking windows support for now)

The API is as follow:

  - `rb_thread_hook_t * rb_thread_event_new(rb_thread_callback callback, rb_event_flag_t event)`
  - `bool rb_thread_event_delete(rb_thread_hook_t * hook)`

The overhead when no hook is registered is just a single unprotected boolean check, so close to zero.




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next