From: Xavier Shay Date: 2011-05-31T21:08:50+09:00 Subject: [ruby-core:36626] [Ruby 1.9 - Bug #3924] Performance bug (in require?) Issue #3924 has been updated by Xavier Shay. Hello, This is a long message, but I have tried to address most of the concerns with my patch. There are two sections: one for the technical detail of the patch, and one for benchmarks. = The patch I have split my patch up into a six smaller patches which I hope are easier to review. Descriptions: https://gist.github.com/35060fbcefb25cf1a456 Patches: https://gist.github.com/58dbd6e72c1a1f47a415 GitHub: https://github.com/xaviershay/ruby/commits/require-patch In addition I have addressed the following concerns from Yusuke: - Please use 4 space for indent, with 8 space tab. (Emacs-style) - Please use C89. Please don't use // comments. - Please don't export function without "rb_" prefix, to avoid symbol conflicts. I think there are still two big concerns: 1) Significantly different from original load.c 2) Compatibility with Windows 3) LoadedFeaturesProxy Regarding #1, I don't have anything further to add at the moment. Regarding #2, it will need to be tested thoroughly like all other platforms, but I don't believe it would result in significant architectural changes. For #3, I was worried about it but on reflection I think it may not be a problem. Yusuke wrote: "It is an impossible approach because some extension library can modify $LOADED_FEATURES directly by rb_ary_push." The hooks are only used to keep a cache up to date. If we get a cache-miss, we could just revert to scanning $LOADED_FEATURES again which would pick up any items that were added with rb_ary_push. This will degrade worst-case performance, but I don't think that matters because most requires will succeed. Does this make sense? This would also mean the addition of `lib/enumerator.rb` in my patch would be unnecessary (though IMO probably still a good idea). = Benchmarks Yusuke I think I mislead you with the benchmarks. I do not expect full_load_path_benchmark.rb to be faster. I was only using it to ensure I didn't make that case any worse. See below for further details. Here are my updated timings, with descriptions of each. For the load path and require benchmark I have only included the final number in the test to allow for a more focused comparison. Run on OSX 10.7 1.9.2p180 1.9.3* 1.9.3** 1.9.3*** load path 0.667376 0.866228 0.623411 0.699315 requires 6.745875 7.921598 7.261326 0.285119 new rails 2.172 1.412 1.624 1.082 medium rails 18.372 17.59 15.855 10.488 * 1.9.3r31827 (31/5) ** 1.9.3r31827, with r30789 reverted *** 1.9.3r31827, with my patch == Benchmark descriptions ### load path - https://gist.github.com/985224 Loading a file 50 times with 2000 entries in the $LOAD_PATH. This is an academic benchmark, since 2000 entries is a large amount. For comparison, our rails app load path only has 42 entries. Performance for all versions is linear and roughly equivalent. Note that I removed the `GC.disable` that was in earlier versions of this benchmark. ### requires - https://gist.github.com/c8d0d422a9203e1fe492 Loading 2500 files. This is a more relavent benchmark. Our Rails app loads ~2200 files, large apps can load as many as 9000 [1]. All versions display an exponential increase in time as N increases. 1.8.7, though not included in this measurement, also displays such an exponential curve but it isn't as noticeable in comparison with 1.9 since the magnifier is far less [2]. [1] sent to me via private correspondence [2] see https://gist.github.com/c8d0d422a9203e1fe492 ### new rails Using rails 3.0.7: rails new test-rails-app cd test-rails-app time ruby script/rails runner "puts 1" ### medium rails This is my main Rails code base. Sorry I cannot share it, though this benchmark is not required to make a case for this patch since it shows roughly the same percentage improvement as a blank rails app. Thanks, Xavier ---------------------------------------- Bug #3924: Performance bug (in require?) http://redmine.ruby-lang.org/issues/3924 Author: Carsten Bormann Status: Open Priority: Normal Assignee: Category: core Target version: 1.9.3 ruby -v: - =begin Running irb < /dev/null in 1.9.2 causes 3016 calls to lstat64. For instance, there is a sequence of 28 repetitions each of lstat calls to all 6 non-empty path prefixes of /opt/local/lib/ruby1.9/1.9.1/irb.rb -- a total of 170 lstats apparently just to load this file; another set of lstats then occurs later for another 18 (times 6) times. Clearly, something is running amok in the calling sequence rb_require_safe -> realpath_rec -> lstat. Another example: Running a simple test with the baretest gem causes 17008 calls to lstat. According to perftools.rb, 80 % of the 1.2 seconds of CPU is used in Kernel#gem_original_require (and another 12 in GC, some of which may be caused by this). =end -- http://redmine.ruby-lang.org