[ruby-core:102183] [Ruby master Misc#17565] Prefer use of access(2) in rb_file_load_ok() to check for existence of require'd files
From: lee.hambley@...
Date: 2021-01-20 20:05:56 UTC
List: ruby-core #102183
Issue #17565 has been reported by leehambley (Lee Hambley).
----------------------------------------
Misc #17565: Prefer use of access(2) in rb_file_load_ok() to check for existence of require'd files
https://bugs.ruby-lang.org/issues/17565
* Author: leehambley (Lee Hambley)
* Status: Open
* Priority: Normal
----------------------------------------
When using Ruby in Docker (2.5 in our case, but the code is unchanged in 15 years across all versions) with a large `$LOAD_PATH`, some millions of calls are made to `open(2)` with a mean cost of 130µsec per call, whereas a call to `access(2)` costs around 5× less (something around 28µsec).
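A rough way to reproduce this comparison on a given host is a micro-benchmark that times both syscalls against a path that does not exist. This is only a sketch: the function names `mean_access_us` and `mean_open_us` are made up for illustration, and the results depend heavily on the filesystem and container storage driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Microseconds elapsed between two monotonic timestamps. */
static double elapsed_us(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_nsec - a.tv_nsec) / 1e3;
}

/* Mean cost in microseconds of access(2) on `path` (hypothetical helper). */
double mean_access_us(const char *path, int iterations)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        int rc = access(path, R_OK);   /* result ignored, only timing matters */
        (void)rc;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_us(start, end) / iterations;
}

/* Mean cost in microseconds of open(2) on `path` (hypothetical helper). */
double mean_open_us(const char *path, int iterations)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        int fd = open(path, O_RDONLY | O_NONBLOCK);
        if (fd != -1) close(fd);       /* do not leak descriptors on a hit */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_us(start, end) / iterations;
}
```

Calling both with the same missing path (for example `mean_open_us("/nonexistent-dir/missing.rb", 100000)`) and comparing the two means should show whether the gap described above exists on the system at hand.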
With a Rails 5 app, without Zeitwerk, the load path is searched iteratively for a file that defines a constant; this causes something like 2,000,000 calls to `open(2)`, of which 97.5% fail with `ENOENT`.
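To illustrate why failures dominate, here is a simplified sketch of such a search. It is not Ruby's actual implementation; `find_in_load_path` and its parameters are hypothetical. Each directory is tried in order, so a file found in the last of N directories costs N-1 failed `open(2)` calls first.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try each directory in turn, as a load-path search does.
 * Returns the index of the first directory in which `feature` can be
 * opened, or -1 if none; *misses counts the failed open(2) calls.
 */
int find_in_load_path(const char *const *dirs, int ndirs,
                      const char *feature, long *misses)
{
    char candidate[4096];
    *misses = 0;
    for (int i = 0; i < ndirs; i++) {
        int n = snprintf(candidate, sizeof candidate, "%s/%s", dirs[i], feature);
        if (n < 0 || (size_t)n >= sizeof candidate)
            continue;                  /* path too long, nothing to open */
        int fd = open(candidate, O_RDONLY | O_NONBLOCK);
        if (fd == -1) {
            (*misses)++;               /* typically ENOENT */
            continue;
        }
        close(fd);
        return i;
    }
    return -1;
}
```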
I believe that the cost of two syscalls (`open(2)` only after a successful `access(2)`) would be worthwhile, at least in our case, because we would shave off something like 1,900,000 × 90µsec (2.85 minutes) from the three-minute boot time of our application.
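The estimate can be written down as a small cost model. This is a sketch only; `estimated_saving_sec` is a hypothetical helper, and the second pair of parameters accounts for the extra `access(2)` call that successful lookups would now pay.

```c
/*
 * Net boot time saved, in seconds, by probing with access(2) first.
 * Every failed lookup saves `saving_per_miss_us`; every successful
 * lookup pays `extra_per_hit_us` for the additional syscall.
 */
double estimated_saving_sec(double failed_lookups, double saving_per_miss_us,
                            double successful_lookups, double extra_per_hit_us)
{
    double saved_us = failed_lookups * saving_per_miss_us;
    double added_us = successful_lookups * extra_per_hit_us;
    return (saved_us - added_us) / 1e6;
}
```

With the rounded figures above, `estimated_saving_sec(1900000, 90, 0, 0)` gives 171 seconds, i.e. 2.85 minutes. Charging the remaining 2.5% of lookups (about 50,000) an extra 28µsec each lowers that by roughly 1.4 seconds, so the overhead on hits is small compared with the saving on misses.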
I prepared a very naïve patch with a simple early return in `rb_file_load_ok`:
```
diff --git a/file.c b/file.c
index 3bf092c05c..c7a7635125 100644
--- a/file.c
+++ b/file.c
@@ -5986,6 +5986,16 @@ rb_file_load_ok(const char *path)
                O_NDELAY |
 #endif
                0);
+    if (access(path, R_OK) == -1) return 0;
     int fd = rb_cloexec_open(path, mode, 0);
     if (fd == -1) return 0;
     rb_update_max_fd(fd);
```
This hasn't been exhaustively tested as I simply haven't had time yet, but at least it compiled and passed `make check`.
I spoke with Aaron Patterson on Twitter, who suggested that a wiser approach might be a heuristic one level higher (`rb_find_file`?) that switches strategy based on the length of the `$LOAD_PATH`.
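A minimal sketch of what such a heuristic could look like follows. Everything in it is an assumption for illustration: the names `should_probe_first` and `load_ok`, and especially the `PROBE_THRESHOLD` value, which would have to come from measurement. It uses plain `open(2)` rather than Ruby's own helpers such as `rb_cloexec_open`.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical cut-off; no value has been proposed or measured. */
#define PROBE_THRESHOLD 64

/* Returns 1 when probing with access(2) first is expected to pay off. */
int should_probe_first(size_t load_path_len)
{
    return load_path_len >= PROBE_THRESHOLD;
}

/* Returns 1 if `path` can be opened for reading, 0 otherwise. */
int load_ok(const char *path, size_t load_path_len)
{
    /* Long load path: most lookups miss, so answer them cheaply. */
    if (should_probe_first(load_path_len) && access(path, R_OK) == -1)
        return 0;

    /* Short load path, or the probe succeeded: open as before. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```

With a short load path the behaviour is unchanged from a single `open(2)`; only above the threshold does the second syscall get introduced.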
Alternatively, maybe the patch could be conditional, guarded somehow, and conditionally compiled only into the Rubies built for Docker, in a way that is portable to the common Ruby version managers.
I am opening this ticket to track my own work, as much as anything, with no expectation that someone implement this on my behalf. I am eager to contribute to Ruby for all the benefit I have seen from it in my career.
If anyone knows why this might be an unsuccessful adventure, I would gratefully receive any and all feedback.