From: "jeremyevans0 (Jeremy Evans) via ruby-core" <ruby-core@...>
Date: 2025-02-06T19:25:12+00:00
Subject: [ruby-core:120900] [Ruby master Bug#21119] Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.

Issue #21119 has been updated by jeremyevans0 (Jeremy Evans).


Reverting the GVL release is simple, but then no other thread can run while `Dir.glob` accesses the filesystem (which may block for a long time on networked filesystems).  Releasing the GVL is a tradeoff: it mitigates the damage when filesystem access is slow, but it makes the common case slower.  I think this issue is much more pronounced on macOS and other systems where `getattrlist`/`fgetattrlist` are used to determine whether normalization is needed, because there the GVL is released for every directory entry. I don't have an opinion on whether the tradeoff is worth it in this case.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111776

* Author: genya0407 (Yusuke Sangenya)
* Status: Open
* ruby -v: ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
Executing the following code in Ruby 3.4.1 takes a very long time, especially when there are many files (100 or more) in the current directory.
This delay does not occur in Ruby 3.3.6.

## Reproducible script

```ruby
# hoge.rb

# Launch a thread to execute CPU-heavy task
Thread.new do
  loop do
    arr = []
    100.times do
      arr << rand(1...100)
    end
  end
end

# Execute a program containing `Dir.glob` in the main thread.
10.times do
  Dir.glob('*')
  puts "aaaa"
end
```

## Execution Results

Executing the above code in Ruby 3.4.1 takes **119.43s**.

```shell
$ ruby -v
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]

$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb  119.43s user 0.30s system 99% cpu 1:59.89 total
```

Executing it in Ruby master also takes **118.87s**.

```shell
$ ~/opt-ruby/bin/ruby -v
ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]

$ time ~/opt-ruby/bin/ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
~/opt-ruby/bin/ruby hoge.rb  118.87s user 0.46s system 99% cpu 2:00.45 total
```

Executing it in Ruby 3.3.6 takes only **2.22s**.

```shell
$ ruby -v
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [arm64-darwin24]

$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb  2.22s user 0.03s system 98% cpu 2.286 total
```

So, Ruby 3.4.1 and master are roughly **50x** slower than Ruby 3.3.6.

## Possible Cause

Since Ruby 3.4.0, `Dir.glob` releases the GVL frequently.

* https://bugs.ruby-lang.org/issues/20587
* https://github.com/ruby/ruby/pull/11147

Due to this change, each time the CPU-heavy thread releases the GVL and `Dir.glob` acquires it, `Dir.glob` releases it again almost immediately.
As a result, `Dir.glob` is significantly delayed because it has to reacquire the GVL over and over, which causes a major slowdown in execution.
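
To make the contention visible without the shell's `time`, the following is a minimal sketch that times `Dir.glob` with and without a competing CPU-bound thread. The helper name `time_glob` and its keyword arguments are made up for this illustration and are not part of Ruby. The round count is kept small on purpose, since the contended case can take a long time on affected versions.

```ruby
# Hypothetical helper (not a Ruby API): time `rounds` calls to Dir.glob,
# optionally while a CPU-bound thread competes for the GVL.
def time_glob(rounds:, busy:, dir: Dir.pwd)
  # Same kind of CPU-heavy loop as in the reproducible script.
  worker = busy ? Thread.new { loop { Array.new(100) { rand(1...100) } } } : nil

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  rounds.times { Dir.glob("*", base: dir) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
ensure
  # Stop the background thread so it does not affect later measurements.
  worker&.kill
  worker&.join
end

idle      = time_glob(rounds: 3, busy: false)
contended = time_glob(rounds: 3, busy: true)
puts format("idle: %.4fs  contended: %.4fs", idle, contended)
```

If the explanation above is right, the gap between the two numbers should grow with the number of directory entries on the affected versions and stay small on Ruby 3.3.6.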

## Note about Execution Results

I measured the execution results under a stress condition, with 100 files in the current directory.  
If there are fewer files, the slowdown may be less pronounced.
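
To recreate a comparable directory, here is a small sketch that fills a temporary directory with 100 empty files and runs `Dir.glob('*')` inside it. The helper name `with_populated_dir`, the `glob-repro` prefix, and the file names are arbitrary choices for this example; the original measurement may have used different files.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: create `count` empty files in a fresh temporary
# directory, yield its path, and remove everything afterwards.
def with_populated_dir(count)
  Dir.mktmpdir("glob-repro") do |dir|
    count.times { |i| FileUtils.touch(File.join(dir, format("file%03d", i))) }
    yield dir
  end
end

with_populated_dir(100) do |dir|
  # Run the glob from inside the directory, as the reproducible script does.
  Dir.chdir(dir) do
    puts Dir.glob("*").size # prints 100
  end
end
```

Changing the count passed to `with_populated_dir` is one way to check how the slowdown scales with the number of entries.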





-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/