From: Yehuda Katz
Date: 2012-10-16T22:26:06+09:00
Subject: [ruby-core:48028] Re: [ruby-trunk - Bug #7158] require is slow in its bookkeeping; can make Rails startup 2.2x faster

Yehuda Katz
(ph) 718.877.1325


On Tue, Oct 16, 2012 at 12:21 AM, gregprice (Greg Price) <price@mit.edu> wrote:

Issue #7158 has been updated by gregprice (Greg Price).


trans (Thomas Sawyer) wrote:
> I believe a great deal of additional speed could be gained by optimizing
> #require_relative (and making use of it, of course).

I'd like to keep this thread focused on speeding up existing code which uses #require. If you're interested in making changes to #require_relative (which is a function that is not involved in the startup time of most existing libraries or applications), a separate issue would be the best place to discuss that.

I would agree. Also, I consider require_relative an antipattern, so I hope people don't start insisting that libraries use it in order to speed things up instead of just speeding up require.


> From what I understand, #require_relative ends up calling ordinary
> #require code, which is inefficient since #require_relative already
> knows which path to find the script, so why have require search
> the $LOAD_PATH for it?

Note that most of the time #require spends is not in searching $LOAD_PATH -- it's in deciding whether the requested library needs to be loaded at all, or refers to a file that has already been loaded. That's what this patch series addresses. (Even the expanded $LOAD_PATH, which this Patch 4 caches, is used in making that decision before it is used to search to find the script.)

Greg

----------------------------------------
Bug #7158: require is slow in its bookkeeping; can make Rails startup 2.2x faster
https://bugs.ruby-lang.org/issues/7158#change-30820

Author: gregprice (Greg Price)
Status: Open
Priority: Normal
Assignee:
Category: core
Target version:
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35409) [i686-linux]


=begin
Starting a large application in Ruby is slow. Most of the startup
time is not spent in the actual work of loading files and running Ruby
code, but in bookkeeping in the 'require' implementation. I've
attached a patch series which makes that bookkeeping much faster.
These patches speed up a large Rails application's startup by 2.2x,
and a pure-'require' benchmark by 3.4x.

These patches fix two ways in which 'require' is slow. Both problems
have been discussed before, but these patches solve the problems with
less code and stricter compatibility than previous patches I've seen.

* Currently we iterate through $LOADED_FEATURES to see if anything
  matches the newly required feature. Further, each iteration
  iterates in turn through $LOAD_PATH. Xavier Shay spotted this
  problem last year and a series of patches were discussed
  (in Issue #3924) to add a Hash index alongside $LOADED_FEATURES,
  but for 1.9.3 none were merged; Masaya Tarui committed Revision r31875,
  which mitigated the problem. This series adds a Hash index,
  and keeps it up to date even if the user modifies $LOADED_FEATURES.
  This is worth a 40% speedup on one large Rails application,
  and 2.3x on a pure-'require' benchmark.

* Currently each 'require' call runs through $LOAD_PATH and calls
  rb_file_expand_path() on each element. Yura Sokolov (funny_falcon)
  proposed caching this last December in Issue #5767, but it wasn't
  merged. This series also caches $LOAD_PATH, and keeps the cache up
  to date with a different, less invasive technique. The cache takes
  34 lines of code, and is worth an additional 57% speedup in
  starting a Rails app and a 46% speedup in pure 'require'.
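
To make the first bullet concrete, here is a minimal Ruby sketch of the general idea, not the code from the patch series. It compares a linear scan over a list of loaded features with a Hash index over the same list. The class `FeatureIndex`, the helper `loaded_by_scan?`, and the sample paths are all hypothetical; the real index is written in C in load.c and has to handle cases (extensions, $LOAD_PATH-relative names) that this sketch ignores.

```ruby
# Hypothetical sketch: index each loaded feature under every trailing
# path it could be required as, so a lookup is one Hash probe.
class FeatureIndex
  def initialize(features)
    @index = Hash.new { |hash, key| hash[key] = [] }
    features.each { |feature| add(feature) }
  end

  # Register "/a/lib/foo/bar.rb" under "bar", "foo/bar", "lib/foo/bar", ...
  def add(feature)
    parts = feature.sub(/\.rb\z/, "").split("/").reject(&:empty?)
    parts.each_index do |i|
      @index[parts[i..-1].join("/")] << feature
    end
  end

  # True if some loaded feature could satisfy `require name`.
  def loaded?(name)
    @index.key?(name.sub(/\.rb\z/, ""))
  end
end

# The slow approach, for comparison: scan every loaded feature.
def loaded_by_scan?(features, name)
  wanted = name.sub(/\.rb\z/, "")
  features.any? do |feature|
    stem = feature.sub(/\.rb\z/, "")
    stem == wanted || stem.end_with?("/" + wanted)
  end
end

features = ["/opt/app/lib/foo/bar.rb", "/opt/app/lib/baz.rb"]
index = FeatureIndex.new(features)
puts index.loaded?("foo/bar")               # Hash probe
puts loaded_by_scan?(features, "foo/bar")   # linear scan
```

The scan costs time proportional to the number of loaded features on every call, while the index pays that cost once when it is built.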


== Staying Compatible

With both the $LOADED_FEATURES index and the $LOAD_PATH cache,

* we exactly preserve the semantics of the user modifying $LOAD_PATH
  or $LOADED_FEATURES;

* both $LOAD_PATH and $LOADED_FEATURES remain ordinary Arrays, with
  no singleton methods;

* we make just one semantic change: each element of $LOAD_PATH and
  $LOADED_FEATURES is made into a frozen string. This doesn't limit
  the flexibility Ruby offers to the programmer in any way; to alter
  an element of either array, one simply reassigns it to the new
  value. Further, normal path-munging code which only adds and
  removes elements shouldn't have to change at all.
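
As a small illustration of that one semantic change, the sketch below uses an ordinary Array of frozen strings as a stand-in for $LOAD_PATH. The array `paths` and its contents are made up for the example. In-place mutation of an element raises, while reassigning the element, or adding and removing entries, works as before.

```ruby
# Stand-in for $LOAD_PATH under the proposed behavior: elements are frozen.
paths = ["/opt/app/lib".freeze, "/opt/app/vendor".freeze]

# Mutating an element in place is the one thing that stops working.
mutation_failed = begin
  paths[0] << "/extra"
  false
rescue RuntimeError   # FrozenError is a RuntimeError subclass on newer Rubies
  true
end

# Reassigning the element to a new value is the replacement idiom.
paths[0] = paths[0] + "/extra"

# Ordinary path-munging that only adds and removes elements is unaffected.
paths.unshift("/opt/app/plugins")
paths.delete("/opt/app/vendor")

p mutation_failed
p paths
```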

These patches use the following technique to keep the cache and the
index up to date without modifying the methods of $LOADED_FEATURES or
$LOAD_PATH: we take advantage of the sharing mechanism in the Array
implementation to detect, in O(1) time, whether either array has been
mutated. We cause $LOADED_FEATURES to be shared with an Array we keep
privately in load.c; if anything modifies it, it will break the
sharing and we will know to rebuild the index. Similarly for
$LOAD_PATH.
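
The shared-Array check described here lives in C and is not observable from Ruby code, so the sketch below substitutes a different technique to show the same invalidation pattern: it keeps a private snapshot of the array and rebuilds the index when the live array no longer equals it. That comparison is O(n) per lookup, not the O(1) check the patches use. The class `SnapshotIndex` and its methods are hypothetical.

```ruby
# Hypothetical sketch: rebuild a derived index only when the source array
# has changed since the last rebuild. Uses a snapshot comparison (O(n)),
# NOT the O(1) Array-sharing check that the patch series relies on.
class SnapshotIndex
  attr_reader :rebuilds

  def initialize(source)
    @source = source    # the user-visible array, e.g. a features list
    @snapshot = nil     # private copy taken at the last rebuild
    @index = {}
    @rebuilds = 0
  end

  def include?(item)
    rebuild unless @snapshot == @source
    @index.key?(item)
  end

  private

  def rebuild
    @snapshot = @source.dup
    @index = {}
    @snapshot.each { |item| @index[item] = true }
    @rebuilds += 1
  end
end

features = ["a.rb".freeze, "b.rb".freeze]
index = SnapshotIndex.new(features)
index.include?("a.rb")    # first lookup builds the index
index.include?("b.rb")    # array unchanged: index is reused
features << "c.rb".freeze # the user mutates the array directly
index.include?("c.rb")    # change detected: index is rebuilt
puts index.rebuilds
```

A shallow snapshot like this cannot notice a string element being mutated in place, which is presumably one reason the elements are frozen.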


== Benchmarks

First, on my company's Rails application, where $LOAD_PATH.size is 207
and $LOADED_FEATURES.size is 2126. I measured the time taken by
'bundle exec rails runner "p 1"'.

 version                  Rails startup time, best of 5   speedup
 v1_9_3_194               12.197s
 v1_9_3_194+index          8.688s                          1.40x
 v1_9_3_194+index+cache    5.538s                          2.20x

And now isolating the performance of 'require', by requiring
16000 empty files.

 version              time, best of 5   speedup
 trunk (at r36920)    10.115s
 trunk+index           4.363s           2.32x
 trunk+index+cache     2.984s           3.39x
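
The post does not include the benchmark script, so the following is only a guess at its shape: create a number of empty .rb files and time how long it takes to require them all. The helper names `time_requires` and `best_of` are hypothetical, and requiring by absolute path from a temporary directory is an assumption. The file count is kept small here; the measurement above used 16000 files.

```ruby
require "benchmark"
require "tmpdir"

# Create `count` empty .rb files in a fresh temp dir, require each one by
# absolute path, and return the elapsed wall-clock time in seconds.
def time_requires(count)
  Dir.mktmpdir("require-bench") do |dir|
    paths = (1..count).map do |i|
      path = File.join(dir, "empty_#{i}.rb")
      File.open(path, "w") {}   # create an empty file
      path
    end
    Benchmark.realtime { paths.each { |path| require path } }
  end
end

# Take the best (smallest) time over several runs.
def best_of(runs, count)
  (1..runs).map { time_requires(count) }.min
end

elapsed = best_of(3, 200)
puts format("%.3fs", elapsed)
```

Because all runs here share one process, $LOADED_FEATURES grows from run to run; timing each run in a fresh `ruby` process would avoid that effect.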

(The timings for the Rails application are based on the latest release
rather than trunk because a number of gems failed to compile against
trunk for me.)
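
The speedup columns, and the 57% and 46% figures quoted earlier for the cache on top of the index, follow directly from the timings. A short Ruby check, with the numbers copied from the two tables (the variable and method names are just for this example):

```ruby
# Timings in seconds, best of 5, copied from the two tables.
rails_app     = { base: 12.197, index: 8.688, cache: 5.538 }
require_bench = { base: 10.115, index: 4.363, cache: 2.984 }

# Speedup is the ratio of the old time to the new time.
def speedup(before, after)
  before / after
end

{ "Rails app" => rails_app, "pure require" => require_bench }.each do |name, t|
  puts name
  puts format("  index vs. base:        %.2fx", speedup(t[:base], t[:index]))
  puts format("  index+cache vs. base:  %.2fx", speedup(t[:base], t[:cache]))
  puts format("  cache on top of index: %.2fx", speedup(t[:index], t[:cache]))
end
```

The last line of each group is the ratio behind the "additional" speedup attributed to the $LOAD_PATH cache alone.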


== The Patches

I've attached four patches:

(1) Patch 1 changes no behavior at all. It adds comments and
    simplifies a bit of code to help in understanding why patch 3 is
    correct. 42 lines, most of them comments.

(2) Patch 2 adds a function to array.c which will help us tell when
    $LOAD_PATH or $LOADED_FEATURES has been modified. 17 lines.

(3) Patch 3 adds the $LOADED_FEATURES index. 150 lines.

(4) Patch 4 adds the $LOAD_PATH cache. 34 lines.

Reviews and comments welcome -- I'm sure there's something I could do
to make these patches better. I hope we can get some form of them
into trunk before the next release. My life has been happier since I
switched to this version because commands in my Rails application all
run faster now, and I want every Ruby programmer to be happier in the
same way with 2.0 and ideally with 1.9.4.

=end



--
http://bugs.ruby-lang.org/

