From: "luke-gru (Luke Gruber) via ruby-core"
Date: 2025-02-06T20:48:19+00:00
Subject: [ruby-core:120904] [Ruby master Bug#21119] Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.

Issue #21119 has been updated by luke-gru (Luke Gruber).

Yeah sorry it is the GVL, like you guys are saying. There are many syscalls here, it would be nice to just release it at the top and get it back after all the syscalls, but then there's probably a lot of ruby functions in between the syscalls that need the GVL...

I agree with @byroot that we need a smarter scheduler for these cases. And alternatively to not penalizing threads that release the GVL, we could do like Go and not release the GVL (the `P` in go parlance) on potentially short blocking syscalls and instead register the thread with a monitoring thread (maybe the timer thread?) before the syscall. That monitoring thread checks ruby threads that are in this blocking state for too long and gives the GVL to another waiting thread if it exceeds the limit. If it doesn't exceed this time limit, the ruby thread never yields.

This way we could use the GVL release for calls that we know will block a while and use the optimistic no-release case for calls we think will be fast.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111786

* Author: genya0407 (Yusuke Sangenya)
* Status: Open
* ruby -v: ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
Executing the following code in Ruby 3.4.1 takes a very long time, especially when there are many files (100~) in the current directory.
This delay does not occur in Ruby 3.3.6.

## Reproducible script

```ruby
# hoge.rb

# Launch a thread to execute a CPU-heavy task
Thread.new do
  loop do
    arr = []
    100.times do
      arr << rand(1...100)
    end
  end
end

# Execute a program containing `Dir.glob` in the main thread.
10.times do
  Dir.glob('*')
  puts "aaaa"
end
```

## Execution Results

Executing the above code in Ruby 3.4.1 takes **119.43s**.

```shell
$ ruby -v
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]

$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb  119.43s user 0.30s system 99% cpu 1:59.89 total
```

Executing it in Ruby master also takes **118.87s**.

```shell
$ ~/opt-ruby/bin/ruby -v
ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]

$ time ~/opt-ruby/bin/ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
~/opt-ruby/bin/ruby hoge.rb  118.87s user 0.46s system 99% cpu 2:00.45 total
```

Executing it in Ruby 3.3.6 takes only **2.22s**.

```shell
$ ruby -v
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [arm64-darwin24]

$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb  2.22s user 0.03s system 98% cpu 2.286 total
```

So there is a roughly **50x** slowdown.

## Possible Cause

From Ruby 3.4.0, `Dir.glob` releases the GVL frequently.

* https://bugs.ruby-lang.org/issues/20587
* https://github.com/ruby/ruby/pull/11147

Due to this change, whenever the CPU-heavy thread releases the GVL, `Dir.glob` in turn releases the GVL again almost immediately.
As a result, `Dir.glob` is significantly delayed because it has to regain the GVL over and over, causing a major slowdown in execution.

## Note about Execution Results

I measured the execution results under a stress condition, with 100 files in the current directory.
If there are fewer files, the slowdown may be less pronounced.

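To make the dependence on file count easier to check, here is a minimal sketch of a helper that builds its own scratch directory and times `Dir.glob` with and without the CPU-heavy thread. The name `time_glob` and its keyword arguments are hypothetical, not part of the original report. It also globs through the `base:` option instead of the current directory, on the assumption that both go through the same GVL-releasing code path.

```ruby
require "tmpdir"

# Hypothetical helper (not from the original report): time `iterations`
# calls to Dir.glob in a scratch directory holding `file_count` empty
# files. When `with_busy_thread` is true, a CPU-bound thread modelled on
# the one in hoge.rb runs for the duration of the measurement.
def time_glob(file_count:, iterations:, with_busy_thread:)
  Dir.mktmpdir("glob-bench") do |dir|
    file_count.times { |i| File.write(File.join(dir, "file#{i}.txt"), "") }

    busy = nil
    if with_busy_thread
      busy = Thread.new do
        loop do
          arr = []
          100.times { arr << rand(1...100) }
        end
      end
    end

    begin
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      # Assumption: `base:` exercises the same path as globbing the cwd.
      iterations.times { Dir.glob("*", base: dir) }
      Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    ensure
      # Stop the busy thread so it does not outlive the measurement.
      busy&.kill
      busy&.join
    end
  end
end

# Example call (timings depend on the machine and the Ruby version):
#   alone = time_glob(file_count: 100, iterations: 10, with_busy_thread: false)
#   busy  = time_glob(file_count: 100, iterations: 10, with_busy_thread: true)
#   printf("alone: %.3fs, with busy thread: %.3fs\n", alone, busy)
```

Comparing the two return values for several `file_count` settings should show whether the slowdown grows with the number of directory entries, as the note above suggests.
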
-- 
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/