From: "jhawthorn (John Hawthorn) via ruby-core"
Date: 2025-11-20T01:25:00+00:00
Subject: [ruby-core:123867] [Ruby Bug#21685] Unnecessary context-switching, especially bad on multi-core machines.

Issue #21685 has been updated by jhawthorn (John Hawthorn).

@jpl-coconut Please do! This seems like a really good demonstration of the issue and a good start on addressing it. The change is a lot smaller than I expected it to be.

We would only make performance improvements like this to the `master` branch without a backport to 3.4 or older.

We have an existing "timer thread" which serves some similar functions (thread preemption, waking sleeping threads). Do you think this could be integrated with that? There's some previous discussion in #20816

----------------------------------------
Bug #21685: Unnecessary context-switching, especially bad on multi-core machines.
https://bugs.ruby-lang.org/issues/21685#change-115268

* Author: jpl-coconut (Jacob Lacouture)
* Status: Open
* ruby -v: ruby 3.4.7 (2025-10-08 revision 7a5688e2a2) +PRISM [aarch64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
While debugging a performance issue in a large Rails application, I wrote a minimal microbenchmark that reproduces the issue. [[here]](https://gist.github.com/jpl-coconut/cb3679ce885eb578e1071c4b3a525d5c)

I was surprised to see that the benchmark takes ~3.6sec on a single-core machine, and ~36sec **(10x slower) on a machine with 2 or more cores**.

Initially I thought this was a bug in the implementation of Thread::Queue, but soon realized it relates to how Ruby reschedules threads around system calls.

I prepared a fix in [[this branch]](https://github.com/jpl-coconut/ruby/tree/deferred_thread_wait) which is based off ruby 3.4.7. I can apply the fix to a different branch or to master if that's helpful. The fix simply defers suspending the thread until the syscall has been running for some short interval.
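
As a rough illustration of the deferral idea (a toy model, not code from the branch or from the Ruby VM), the sketch below counts how often a thread would be suspended under two policies. It assumes the pre-fix behaviour is to suspend at the start of every blocking call, which is how the report reads, and that the deferred policy suspends only when a call is still running once the interval has elapsed. The names `DEFER_USEC`, `eager_suspends`, and `deferred_suspends` are hypothetical, and the workload durations are made up.

```ruby
# Toy model only: none of these names exist in the Ruby VM or in the branch.
DEFER_USEC = 100 # deferral interval in microseconds (assumed default)

# Eager policy (assumed pre-fix behaviour): every blocking call
# suspends the calling thread.
def eager_suspends(call_durations_usec)
  call_durations_usec.size
end

# Deferred policy: a call suspends the thread only if it is still
# running once the deferral interval has elapsed.
def deferred_suspends(call_durations_usec, defer_usec: DEFER_USEC)
  call_durations_usec.count { |usec| usec >= defer_usec }
end

# Made-up workload: many very short calls plus a few long ones.
workload = Array.new(1_000, 5) + Array.new(3, 500)

puts "eager:    #{eager_suspends(workload)} suspends"
puts "deferred: #{deferred_suspends(workload)} suspends"
```

In this model the short calls never reach the threshold, so only the long ones cause a suspend. The real change has to make that decision while the call is in flight, which the model does not attempt to show.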
I chose 100usec initially, but this could easily be made configurable.

I pasted raw benchmark results below from a single run (though I did many runs and the results are stable). My CPU is an Apple M4.

After the fix:
- Single-core runtime drops by about 44%, from 3.6sec to 2sec.
- Adding cores causes performance to be flat (at 2sec), rather than getting 10x slower.
- Multi-core context-switch count reduces by 99.995%, from 1.4 million to ~80
- system_time/user_time ratio drops from (1.2 - 1.6) to 0.65

Here are the benchmark results before my change:

```
# time taskset --cpu-list 1 ./ruby qtest_simple.rb
voluntary_ctxt_switches: 1140773
nonvoluntary_ctxt_switches: 9487

real 0m3.619s
user 0m1.653s
sys 0m1.950s

# time taskset --cpu-list 1,2 ./ruby qtest_simple.rb
voluntary_ctxt_switches: 1400110
nonvoluntary_ctxt_switches: 3

real 0m36.223s
user 0m9.380s
sys 0m14.927s
```

And after:

```
# time taskset --cpu-list 1 ./ruby qtest_simple.rb
voluntary_ctxt_switches: 88
nonvoluntary_ctxt_switches: 899

real 0m2.031s
user 0m1.209s
sys 0m0.743s

# time taskset --cpu-list 1,2 ./ruby qtest_simple.rb
voluntary_ctxt_switches: 75
nonvoluntary_ctxt_switches: 8

real 0m2.062s
user 0m1.279s
sys 0m0.783s
```

I was concerned these results might still be reflective of a bug in Thread::Queue, so I also came up with a repro that doesn't rely on it. That one is [[here]](https://gist.github.com/jpl-coconut/aa14e59354abf98f808daaf39baa9a72).

Results summary:
- Single-core performance improves (this time by only 30%)
- Multi-core penalty drops from 4x to 0.
- No change to context-switching rates.
- system_time/user_time ratio drops from (0.5-1) to 0.15

Before fix:

```
# time taskset --cpu-list 1 ./ruby mbenchmark.rb
voluntary_ctxt_switches: 60

real 0m0.336s
user 0m0.211s
sys 0m0.118s

# time taskset --cpu-list 1,2 ./ruby mbenchmark.rb
voluntary_ctxt_switches: 60

real 0m1.424s
user 0m0.468s
sys 0m0.496s
```

After fix:

```
# time taskset --cpu-list 1 ./ruby mbenchmark.rb
voluntary_ctxt_switches: 59

real 0m0.241s
user 0m0.202s
sys 0m0.032s

# time taskset --cpu-list 1,2 ./ruby mbenchmark.rb
voluntary_ctxt_switches: 60

real 0m0.238s
user 0m0.195s
sys 0m0.035s
```
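
For readers who want to re-derive the summary figures, the following sketch recomputes them from the raw `time` output of the Thread::Queue benchmark (`qtest_simple.rb`) quoted above. The numbers are copied from those transcripts. The helper names `percent_reduction` and `sys_user_ratio` are hypothetical, and "reduction" here means the drop in wall-clock time or switch count relative to the pre-fix run.

```ruby
# Timings (seconds) and voluntary context-switch counts copied from the
# Thread::Queue benchmark transcripts quoted in this report.
RUNS = {
  before_1core: { real: 3.619,  user: 1.653, sys: 1.950,  voluntary: 1_140_773 },
  before_2core: { real: 36.223, user: 9.380, sys: 14.927, voluntary: 1_400_110 },
  after_1core:  { real: 2.031,  user: 1.209, sys: 0.743,  voluntary: 88 },
  after_2core:  { real: 2.062,  user: 1.279, sys: 0.783,  voluntary: 75 },
}.freeze

# Percentage by which `after` is smaller than `before`.
def percent_reduction(before, after)
  (before - after) / before.to_f * 100.0
end

# Ratio of kernel (sys) time to user time for one run.
def sys_user_ratio(run)
  run[:sys] / run[:user]
end

RUNS.each do |name, run|
  puts format("%-13s real=%7.3fs sys/user=%.2f", name, run[:real], sys_user_ratio(run))
end

single = percent_reduction(RUNS[:before_1core][:real], RUNS[:after_1core][:real])
switches = percent_reduction(RUNS[:before_2core][:voluntary], RUNS[:after_2core][:voluntary])
penalty_before = RUNS[:before_2core][:real] / RUNS[:before_1core][:real]
penalty_after = RUNS[:after_2core][:real] / RUNS[:after_1core][:real]

puts format("single-core wall-clock reduction: %.1f%%", single)
puts format("2-core voluntary switch reduction: %.3f%%", switches)
puts format("2-core vs 1-core wall-clock: %.1fx before, %.2fx after", penalty_before, penalty_after)
```

The same helpers can be pointed at the `mbenchmark.rb` timings by swapping in those values. Since the transcripts come from a single run, small differences from the rounded figures in the summary lists are to be expected.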