From: Eric Wong
Date: 2014-10-07T07:54:04+00:00
Subject: [ruby-core:65458] Re: [ruby-trunk - Feature #10333] [PATCH 3/1] optimize: "yoda literal" == string

ko1@atdot.net wrote:
> Comments for this ticket and the following tickets:
>
> > 1) [Feature #10326] optimize: recv << "literal string"
> > 2) [Feature #10329] optimize: foo == "literal string"

> To continue this kind of hack, we need tons of instructions for each
> methods.  What do you think about it?

I am not completely happy with my current patches because of
verbosity and also icache footprint in the main VM loop.
Ruby executable sizes (even stripped) seem to get bigger with every
release :<

However, perhaps the biggest performance problem is still too many
allocations and garbage objects; so I am willing to trade some code
size to reduce dynamic allocations.

> Basically, we need to add them more carefully.  For example, persuasive
> explanation are needed, such as statistics (analysis by
> parser/compiler), benchmark results for major use cases
> (maybe "<< 'literal'" for templates. but not sure this ticket for) .

Right, we will need to find more real benchmarks.  Sadly, there are
many places where garbage grows.  So maybe this change is only 1-2%
overall.  We may need a lot of small changes to add up to noticeable
improvements.

> Another idea is to make more general approach to indicate arguments
> (and a receiver) are string literal.  It is called specialization.
> Specialized instructions (opt_plus and so on) is a kind of
> specialization by hands.

I've been thinking along these lines, too.  For example, I would like
to see String#tr! and String#gsub! able to avoid allocations for
literal strings.  Or even optimize:

	Time.now.{to_f,to_i,strftime("lit")}

As suggested by akr, users may .freeze (or use constants), but that
is verbose and requires VM internal knowledge.  My goal is to make
optimization as transparent as possible so users may write concise,
idiomatic Ruby code.
It would be great if things like ruby-trunk r47813
(unicode_norm_gen.rb: optimize concatenation) can be done
transparently, even.

> Small comments:
>
> (1) iseq_compile_each() should not use opt_* instructions because we
> should be able to make instructions without opt_* insns (on/off by
> compile options).

Right.  I'll see about making it optional and doing it more
idiomatically.  I mainly used existing opt_{aref,aset}_with
compilation as a guide.

> (2) Name of instructions should be reconsidered.

OK, I do not mind changing names.

I also did informal benchmarks with my system Perl installation
(Perl 5.14.2 on Debian stable x86-64):

> loop_whileloop2
	use strict;
	my $i = 0;
	while ($i < 6_000_000) { # benchmark loop 2
	  $i += 1;
	}

	Perl 0.228s
> trunk	0.10645449301227927
> built	0.10581812914460897

Without the string compare, we're already faster than Perl \o/

> vm2_streq1
	use strict;
	my $i = 0;
	my $foo = "literal";
	while ($i < 6_000_000) { # benchmark loop 2
	  $i += 1;
	  $foo eq "literal";
	}

	Perl 0.0349s
> trunk	0.4726782930083573
> built	0.18452610215172172

We lose to Perl without the optimization, but win with it :)

This is just a micro-benchmark, of course, but I think it's an
important data point to show gains by avoiding allocations when
possible.
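
For anyone who wants to see the allocation difference from plain Ruby, here is a rough sketch.  It assumes a Ruby that provides GC.stat(:total_allocated_objects); the helper names (allocations_during, compare_literal, compare_frozen) are made up for illustration and are not part of the patch.  The loops mirror the shape of the comparison benchmark above, with an explicit .freeze variant as the "verbose" baseline.

```ruby
# Sketch only: helper names are hypothetical, not from the patch.

# Returns how many objects were allocated while the block ran.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Compares against a plain string literal on every iteration.
# Without an optimization (or frozen string literals), each
# iteration is expected to allocate a fresh String.
def compare_literal(foo, n)
  i = 0
  hits = 0
  while i < n
    i += 1
    hits += 1 if foo == "literal"
  end
  hits
end

# Same loop, but the literal is explicitly frozen, which lets the
# VM reuse a single String object instead of allocating per call.
def compare_frozen(foo, n)
  i = 0
  hits = 0
  while i < n
    i += 1
    hits += 1 if foo == "literal".freeze
  end
  hits
end

foo = "literal"
n = 100_000
puts "plain literal:  #{allocations_during { compare_literal(foo, n) }} objects"
puts "frozen literal: #{allocations_during { compare_frozen(foo, n) }} objects"
```

The exact counts depend on the Ruby version and on whether frozen string literals are enabled, so treat the output as an indication rather than a benchmark result.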