From: "jeremyevans0 (Jeremy Evans) via ruby-core" <ruby-core@...>
Date: 2025-02-06T19:25:12+00:00
Subject: [ruby-core:120900] [Ruby master Bug#21119] Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.

Issue #21119 has been updated by jeremyevans0 (Jeremy Evans).

It is simple to revert the GVL-releasing, but then no other thread can run while accessing the filesystem (which may block for a long period of time for networked filesystems). GVL-releasing is a tradeoff. It mitigates damage if the filesystem access takes a long time, but it makes the common case slower.

I think this issue is much more pronounced on Mac OS and other systems where `getattrlist`/`fgetattrlist` are used in order to determine whether normalization is needed, because then the GVL is released for every directory entry.

I don't have any opinion on whether the tradeoff is worth it in this case.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111776

* Author: genya0407 (Yusuke Sangenya)
* Status: Open
* ruby -v: ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
Executing the following code in Ruby 3.4.1 takes a very long time, especially when there are many files (100 or more) in the current directory.
This delay does not occur in Ruby 3.3.6.

## Reproducible script

```ruby
# hoge.rb

# Launch a thread to execute CPU-heavy task
Thread.new do
  loop do
    arr = []
    100.times do
      arr << rand(1...100)
    end
  end
end

# Execute a program containing `Dir.glob` in the main thread.
10.times do
  Dir.glob('*')
  puts "aaaa"
end
```

## Execution Results

Executing the above code in Ruby 3.4.1 takes **119.43s**.

```shell
$ ruby -v
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb 119.43s user 0.30s system 99% cpu 1:59.89 total
```

Executing it in Ruby master also takes **118.87s**.

```shell
$ ~/opt-ruby/bin/ruby -v
ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]
$ time ~/opt-ruby/bin/ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
~/opt-ruby/bin/ruby hoge.rb 118.87s user 0.46s system 99% cpu 2:00.45 total
```

Executing it in Ruby 3.3.6 takes only **2.22s**.

```shell
$ ruby -v
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [arm64-darwin24]
$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb 2.22s user 0.03s system 98% cpu 2.286 total
```

So there is a roughly **50x** slowdown.

## Possible Cause

From Ruby 3.4.0, `Dir.glob` releases the GVL frequently.

* https://bugs.ruby-lang.org/issues/20587
* https://github.com/ruby/ruby/pull/11147

Due to this change, whenever the CPU-heavy thread gives up the GVL and `Dir.glob` acquires it, `Dir.glob` releases it again almost immediately.
As a result, `Dir.glob` has to reacquire the GVL over and over, waiting each time for the CPU-heavy thread to yield, which causes a major slowdown in execution.

## Note about Execution Results

I measured the execution results under a stress condition, with 100 files in the current directory.
If there are fewer files, the slowdown may be less pronounced.

-- 
https://bugs.ruby-lang.org/
______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/
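
For anyone who wants to check the contention described under "Possible Cause" without depending on the contents of the current directory, here is a minimal sketch. It is not part of the original report: the helper name `time_glob`, the temporary directory, and the file and run counts are all assumptions chosen for illustration. It times `Dir.glob` once with no other thread and once with a CPU-bound thread competing for the GVL. The numbers will vary with Ruby version and platform.

```ruby
require "benchmark"
require "fileutils"
require "tmpdir"

# Hypothetical helper (not from the report): time `runs` calls to Dir.glob
# inside `dir`, optionally while a CPU-bound thread competes for the GVL.
def time_glob(dir, runs: 2, busy: false)
  worker = nil
  if busy
    # Same kind of CPU-heavy loop as in the reproducible script.
    worker = Thread.new do
      loop { 100.times.map { rand(1...100) } }
    end
  end
  # Wall-clock seconds spent in the glob calls (this is the return value).
  Benchmark.realtime do
    runs.times { Dir.glob("*", base: dir) }
  end
ensure
  # Stop the background thread so it does not affect later measurements.
  worker&.kill
  worker&.join
end

Dir.mktmpdir do |dir|
  # File count is arbitrary; the report used about 100 files.
  10.times { |i| FileUtils.touch(File.join(dir, "file#{i}")) }

  idle      = time_glob(dir)
  contended = time_glob(dir, busy: true)
  puts format("idle: %.4fs  contended: %.4fs", idle, contended)
end
```

Comparing the two printed times across Ruby 3.3 and 3.4 on the same machine should indicate whether the slowdown reproduces there. Raising the file count makes any difference easier to see, at the cost of a longer run.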