[ruby-core:65459] [ruby-trunk - Feature #10333] [PATCH 3/1] optimize: "yoda literal" == string
Issue #10333 has been updated by Eric Wong.
ko1@atdot.net wrote:
> Comments for this ticket and the following tickets:
>
> > 1) [Feature #10326] optimize: recv << "literal string"
> > 2) [Feature #10329] optimize: foo == "literal string"
> To continue this kind of hack, we need tons of instructions for each
> method. What do you think about it?
I am not completely happy with my current patches because of verbosity
and also icache footprint in the main VM loop. Ruby executable sizes
(even stripped) seem to get bigger with every release :<
However, perhaps the biggest performance problem is still too many
allocations and garbage objects; so I am willing to trade some code
size to reduce dynamic allocations.
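
The allocation cost described above can be observed directly. The following is a minimal sketch, assuming MRI (it relies on `GC.stat(:total_allocated_objects)`); the helper name `count_allocations` is made up for illustration and is not part of the patch, and the exact counts depend on the Ruby version and on whether frozen string literals are enabled.

```ruby
# Hypothetical helper (not part of the patch): returns how many objects
# were allocated while the block ran. Uses an MRI-specific counter.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

iterations = 10_000
foo = "literal"
lit = "literal" # allocated once, outside the loop

# A string literal inside the loop is normally allocated on every pass.
inline = count_allocations do
  i = 0
  while i < iterations
    i += 1
    foo == "literal"
  end
end

# Comparing against a pre-built string avoids the per-iteration garbage.
hoisted = count_allocations do
  i = 0
  while i < iterations
    i += 1
    foo == lit
  end
end

puts "inline literal: #{inline} objects"
puts "hoisted string: #{hoisted} objects"
```

On an unpatched interpreter the first count is expected to grow with the iteration count while the second stays near zero, which is the garbage the patches try to remove without asking users to hoist the string by hand.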
> Basically, we need to add them more carefully. For example, persuasive
> explanations are needed, such as statistics (analysis by
> parser/compiler), benchmark results for major use cases (maybe "<<
> 'literal'" for templates, but I am not sure about this ticket).
Right, we will need to find more real benchmarks.
Sadly, there are many places where garbage grows. So maybe this change
is only 1-2% overall. We may need a lot of small changes to add up to
noticeable improvements.
> Another idea is a more general approach that indicates that arguments
> (and a receiver) are string literals. It is called specialization.
> Specialized instructions (opt_plus and so on) are a kind of
> specialization by hand.
I've been thinking along these lines, too. For example, I would like to
see String#tr! and String#gsub! able to avoid allocations for literal
strings. Or even optimize: Time.now.{to_f,to_i,strftime("lit")}
As suggested by akr, users may .freeze (or use constants), but that is
verbose and requires VM internal knowledge. My goal is to make
optimization as transparent as possible so users may write concise,
idiomatic Ruby code.
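
For reference, the two manual workarounds mentioned above look roughly like this. This is only a sketch of the idiom, assuming MRI 2.1 or later (where a literal followed by `.freeze` is expected to be reused rather than reallocated); the module and method names are invented for the example.

```ruby
# Names below (Greeting, frozen_literal, constant_literal) are
# illustrative only and do not come from the patch.
module Greeting
  # Built once when the module body runs, then reused.
  LITERAL = "literal".freeze
end

# On MRI 2.1+, a literal with .freeze is expected to return one shared
# frozen string instead of allocating a new one per call.
def frozen_literal
  "literal".freeze
end

def constant_literal
  Greeting::LITERAL
end

a = frozen_literal
b = frozen_literal
puts a.frozen?                                  # the string cannot be mutated
puts a.equal?(b)                                # same object on both calls (MRI 2.1+)
puts constant_literal.equal?(constant_literal)  # a constant always yields the same object
```

Both forms work, but they push an interpreter detail into application code, which is the verbosity the transparent optimization is meant to avoid.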
It would be great if things like ruby-trunk r47813
(unicode_norm_gen.rb: optimize concatenation)
can be done transparently, even.
> Small comments:
>
> (1) iseq_compile_each() should not use opt_* instructions because we
> should be able to make instructions without opt_* insns (on/off by
> compile options).
Right. I'll see about making it optional and doing it more
idiomatically. I mainly used existing opt_{aref,aset}_with compilation
as a guide.
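
As a rough illustration of "on/off by compile options", MRI exposes the existing switch through `RubyVM::InstructionSequence`. The sketch below assumes MRI (the `RubyVM` constant does not exist on other implementations) and only looks at the long-standing `opt_eq` instruction, not at the instructions proposed in this ticket; the disassembly text differs between Ruby versions.

```ruby
# MRI-only sketch: compile the same snippet with and without
# specialized instructions and look for opt_eq in the disassembly.
src = 'foo = "x"; "literal" == foo'

with_opt = RubyVM::InstructionSequence.compile(src).disasm

without_opt = RubyVM::InstructionSequence.compile(
  src, "<compiled>", "<compiled>", 1,
  specialized_instruction: false
).disasm

# With default options the comparison is compiled to opt_eq.
puts with_opt.include?("opt_eq")

# With the option off, a generic send is expected instead
# (not asserted here; depends on the Ruby version).
puts without_opt.include?("opt_eq")
```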
> (2) Name of instructions should be reconsidered.
OK, I do not mind changing names.
I also did informal benchmarks with my system Perl installation
(Perl 5.14.2 on Debian stable x86-64):
> loop_whileloop2
use strict;
my $i = 0;
while ($i < 6_000_000) { # benchmark loop 2
  $i += 1;
}
Perl 0.228s
> trunk 0.10645449301227927
> built 0.10581812914460897
Without the string compare, we're already faster than Perl \o/
> vm2_streq1
use strict;
my $i = 0;
my $foo = "literal";
while ($i < 6_000_000) { # benchmark loop 2
  $i += 1;
  $foo eq "literal";
}
Perl 0.349s
> trunk 0.4726782930083573
> built 0.18452610215172172
We lose to Perl without the optimization, but win with it :)
This is just a micro-benchmark, of course, but I think it's an important
data point to show gains by avoiding allocations when possible.
----------------------------------------
Feature #10333: [PATCH 3/1] optimize: "yoda literal" == string
https://bugs.ruby-lang.org/issues/10333#change-49241
* Author: Eric Wong
* Status: Open
* Priority: Normal
* Assignee:
* Category: core
* Target version: current: 2.2.0
----------------------------------------
This is a follow-up-to:
1) [Feature #10326] optimize: recv << "literal string"
2) [Feature #10329] optimize: foo == "literal string"
This can be slightly faster than: (string == "literal") because
we can guarantee the "yoda literal" is already a string at
compile time.
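
The language-level reason behind that guarantee can be sketched in plain Ruby. This is an illustration only, not the patch's implementation; the class name `CustomEq` is invented for the example.

```ruby
# Illustrative class (not from the patch) with its own == method.
class CustomEq
  def ==(other)
    :custom_eq_called
  end
end

foo = CustomEq.new

# Receiver is a variable: its class is only known at run time, so the
# VM must check what `foo` is before it can take a String fast path.
left = (foo == "literal")

# Receiver is a literal: it is always a String, so String#== is the
# method being called no matter what `foo` turns out to be.
right = ("literal" == foo)

p left   # result of CustomEq#==
p right  # result of String#== with a non-string argument
```

Here `left` is whatever `CustomEq#==` returns, while `right` comes from `String#==`, which returns `false` for an argument that is not a string and does not respond to `to_str`.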
Updated benchmarks from Xeon E3-1230 v3 @ 3.30GHz:
target 0: trunk (ruby 2.2.0dev (2014-10-06 trunk 47822) [x86_64-linux]) at "/home/ew/rrrr/b/i/bin/ruby"
target 1: built (ruby 2.2.0dev (2014-10-06 trunk 47822) [x86_64-linux]) at "/home/ew/ruby/b/i/bin/ruby"
-----------------------------------------------------------
loop_whileloop2
i = 0
while i< 6_000_000 # benchmark loop 2
  i += 1
end
trunk 0.10712811909615993
trunk 0.10693809622898698
trunk 0.10645449301227927
trunk 0.10646287119016051
built 0.10612367931753397
built 0.10581812914460897
built 0.10592922195792198
built 0.10595094738528132
-----------------------------------------------------------
vm2_streq1
i = 0
foo = "literal"
while i<6_000_000 # benchmark loop 2
  i += 1
  foo == "literal"
end
trunk 0.47250875690951943
trunk 0.47325073881074786
trunk 0.4726782930083573
trunk 0.4727754699997604
built 0.185972370672971
built 0.1850820742547512
built 0.18558283289894462
built 0.18452610215172172
-----------------------------------------------------------
vm2_streq2
i = 0
foo = "literal"
while i<6_000_000 # benchmark loop 2
  i += 1
  "literal" == foo
end
trunk 0.4719057851471007
trunk 0.4715963830240071
trunk 0.47177061904221773
trunk 0.4724834677763283
built 0.18247668212279677
built 0.18143231887370348
built 0.18060296680778265
built 0.17929687118157744
-----------------------------------------------------------
raw data:
[["loop_whileloop2",
[[0.10712811909615993,
0.10693809622898698,
0.10645449301227927,
0.10646287119016051],
[0.10612367931753397,
0.10581812914460897,
0.10592922195792198,
0.10595094738528132]]],
["vm2_streq1",
[[0.47250875690951943,
0.47325073881074786,
0.4726782930083573,
0.4727754699997604],
[0.185972370672971,
0.1850820742547512,
0.18558283289894462,
0.18452610215172172]]],
["vm2_streq2",
[[0.4719057851471007,
0.4715963830240071,
0.47177061904221773,
0.4724834677763283],
[0.18247668212279677,
0.18143231887370348,
0.18060296680778265,
0.17929687118157744]]]]
Elapsed time: 6.097474559 (sec)
-----------------------------------------------------------
benchmark results:
minimum results in each 4 measurements.
Execution time (sec)
name            trunk   built
loop_whileloop2 0.106   0.106
vm2_streq1*     0.366   0.079
vm2_streq2*     0.365   0.073
Speedup ratio: compare with the result of `trunk' (greater is better)
name            built
loop_whileloop2 1.006
vm2_streq1*     4.651
vm2_streq2*     4.969
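
The starred rows do not match the raw minimums directly. The figures are consistent with the minimum `loop_whileloop2` time being subtracted from each starred benchmark before the ratio is taken; that reading is inferred from the numbers, not stated in the report. A small sketch of the arithmetic under that assumption, using the raw data above (the helper `net_time` is hypothetical):

```ruby
# Raw timings copied from the report above.
raw = {
  "loop_whileloop2" => {
    trunk: [0.10712811909615993, 0.10693809622898698,
            0.10645449301227927, 0.10646287119016051],
    built: [0.10612367931753397, 0.10581812914460897,
            0.10592922195792198, 0.10595094738528132]
  },
  "vm2_streq1" => {
    trunk: [0.47250875690951943, 0.47325073881074786,
            0.4726782930083573, 0.4727754699997604],
    built: [0.185972370672971, 0.1850820742547512,
            0.18558283289894462, 0.18452610215172172]
  },
  "vm2_streq2" => {
    trunk: [0.4719057851471007, 0.4715963830240071,
            0.47177061904221773, 0.4724834677763283],
    built: [0.18247668212279677, 0.18143231887370348,
            0.18060296680778265, 0.17929687118157744]
  }
}

# Hypothetical helper: best time for a benchmark minus the best time of
# the empty loop on the same target (the assumed meaning of "*").
def net_time(raw, name, target)
  raw[name][target].min - raw["loop_whileloop2"][target].min
end

%w[vm2_streq1 vm2_streq2].each do |name|
  trunk = net_time(raw, name, :trunk)
  built = net_time(raw, name, :built)
  printf("%-12s trunk=%.3f built=%.3f speedup=%.3f\n",
         name, trunk, built, trunk / built)
end
```

Rounded to three decimals, this reproduces the reported 0.366/0.079 and 0.365/0.073 execution times and the 4.651 and 4.969 speedup ratios.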
---
 benchmark/bm_vm2_streq2.rb |  6 ++++++
 compile.c                  | 20 +++++++++++++++++++-
 insns.def                  | 20 ++++++++++++++++++++
 test/ruby/test_string.rb   | 12 ++++++++----
 4 files changed, 53 insertions(+), 5 deletions(-)
 create mode 100644 benchmark/bm_vm2_streq2.rb
---Files--------------------------------
0001-optimize-yoda-literal-string.patch (6.23 KB)
--
https://bugs.ruby-lang.org/