From: ko1@...
Date: 2014-10-07T03:14:00+00:00
Subject: [ruby-core:65451] [ruby-trunk - Feature #10333] [PATCH 3/1] optimize: "yoda literal" == string

Issue #10333 has been updated by Koichi Sasada.


Comments for this ticket and the following tickets:

> 1) [Feature #10326] optimize: recv << "literal string"
> 2) [Feature #10329] optimize: foo == "literal string"

Continuing this kind of hack would require a huge number of instructions, one for each method.
What do you think about that?

Basically, we need to add them more carefully. For example, a persuasive explanation is needed, such as statistics (analysis by the parser/compiler) or benchmark results for major use cases (maybe "<< 'literal'" for templates, but I'm not sure that is what this ticket is for).

Another idea is a more general approach that indicates that the arguments (and the receiver) are string literals. This is called specialization. The specialized instructions (opt_plus and so on) are a kind of specialization done by hand.


Small comments:

(1) iseq_compile_each() should not emit opt_* instructions directly, because we should be able to compile instruction sequences without opt_* insns (they are turned on/off by compile options).

(2) The names of the instructions should be reconsidered.
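
To illustrate comment (1), here is a minimal sketch of compiling the same source with and without specialized instructions. It assumes MRI, where RubyVM::InstructionSequence.compile accepts an options hash with a :specialized_instruction key; the helper name disasm_with and the sample source are made up for illustration, and the exact instruction names in the listings vary between Ruby versions.

```ruby
# Hypothetical helper: compile `src` with the specialized_instruction
# compile option switched on or off and return the disassembly text.
def disasm_with(src, specialized)
  opts = { specialized_instruction: specialized }
  RubyVM::InstructionSequence.compile(src, "(sketch)", "(sketch)", 1, opts).disasm
end

SRC = 'foo = "x"; foo == "literal"'

puts disasm_with(SRC, true)   # normally contains opt_* instructions
puts disasm_with(SRC, false)  # expected to fall back to generic sends
```

If a new instruction is emitted unconditionally from iseq_compile_each(), it would presumably still appear in the second listing, which is the situation comment (1) asks to avoid.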


----------------------------------------
Feature #10333: [PATCH 3/1] optimize: "yoda literal" == string
https://bugs.ruby-lang.org/issues/10333#change-49235

* Author: Eric Wong
* Status: Open
* Priority: Normal
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
This is a follow-up to:

1) [Feature #10326] optimize: recv << "literal string"
2) [Feature #10329] optimize: foo == "literal string"

This can be slightly faster than (string == "literal") because
we can guarantee that the "yoda literal" is already a string at
compile time.
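
A small sketch of why the receiver matters (the Token class is hypothetical and only for illustration): with the literal on the right, the receiver can be any object and its own == is dispatched; with the literal on the left, the receiver is always a String, so only String#== is involved. Whether String#== itself has been redefined is a separate question that still has to be answered at run time.

```ruby
# Hypothetical class that counts how often its own == is called.
class Token
  attr_reader :calls

  def initialize(text)
    @text = text
    @calls = 0
  end

  def ==(other)
    @calls += 1
    @text == other
  end
end

tok = Token.new("literal")

tok == "literal"   # receiver is a Token, so Token#== runs
"literal" == tok   # receiver is a String literal, so String#== runs

puts tok.calls     # Token#== was only reached by the first comparison
```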

Updated benchmarks from Xeon E3-1230 v3 @ 3.30GHz:

target 0: trunk (ruby 2.2.0dev (2014-10-06 trunk 47822) [x86_64-linux]) at "/home/ew/rrrr/b/i/bin/ruby"
target 1: built (ruby 2.2.0dev (2014-10-06 trunk 47822) [x86_64-linux]) at "/home/ew/ruby/b/i/bin/ruby"

-----------------------------------------------------------
loop_whileloop2

i = 0
while i< 6_000_000 # benchmark loop 2
  i += 1
end

trunk	0.10712811909615993
trunk	0.10693809622898698
trunk	0.10645449301227927
trunk	0.10646287119016051
built	0.10612367931753397
built	0.10581812914460897
built	0.10592922195792198
built	0.10595094738528132

-----------------------------------------------------------
vm2_streq1

i = 0
foo = "literal"
while i<6_000_000 # benchmark loop 2
  i += 1
  foo == "literal"
end

trunk	0.47250875690951943
trunk	0.47325073881074786
trunk	0.4726782930083573
trunk	0.4727754699997604
built	0.185972370672971
built	0.1850820742547512
built	0.18558283289894462
built	0.18452610215172172

-----------------------------------------------------------
vm2_streq2

i = 0
foo = "literal"
while i<6_000_000 # benchmark loop 2
  i += 1
  "literal" == foo
end

trunk	0.4719057851471007
trunk	0.4715963830240071
trunk	0.47177061904221773
trunk	0.4724834677763283
built	0.18247668212279677
built	0.18143231887370348
built	0.18060296680778265
built	0.17929687118157744

-----------------------------------------------------------
raw data:

[["loop_whileloop2",
  [[0.10712811909615993,
    0.10693809622898698,
    0.10645449301227927,
    0.10646287119016051],
   [0.10612367931753397,
    0.10581812914460897,
    0.10592922195792198,
    0.10595094738528132]]],
 ["vm2_streq1",
  [[0.47250875690951943,
    0.47325073881074786,
    0.4726782930083573,
    0.4727754699997604],
   [0.185972370672971,
    0.1850820742547512,
    0.18558283289894462,
    0.18452610215172172]]],
 ["vm2_streq2",
  [[0.4719057851471007,
    0.4715963830240071,
    0.47177061904221773,
    0.4724834677763283],
   [0.18247668212279677,
    0.18143231887370348,
    0.18060296680778265,
    0.17929687118157744]]]]

Elapsed time: 6.097474559 (sec)
-----------------------------------------------------------
benchmark results:
minimum results in each 4 measurements.
Execution time (sec)
name	trunk	built
loop_whileloop2	0.106	0.106
vm2_streq1*	0.366	0.079
vm2_streq2*	0.365	0.073

Speedup ratio: compare with the result of `trunk' (greater is better)
name	built
loop_whileloop2	1.006
vm2_streq1*	4.651
vm2_streq2*	4.969
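
For reference, the summary numbers above can be reproduced from the raw data with the sketch below. It assumes that the asterisk marks benchmarks whose time has the empty-loop (loop_whileloop2) time subtracted; that reading is inferred from the figures, not stated in the driver output. The constant and method names are made up for this example.

```ruby
# Minimum of the four runs for each benchmark, copied from the raw data.
MIN_TIMES = {
  "loop_whileloop2" => { trunk: 0.10645449301227927, built: 0.10581812914460897 },
  "vm2_streq1"      => { trunk: 0.47250875690951943, built: 0.18452610215172172 },
  "vm2_streq2"      => { trunk: 0.4715963830240071,  built: 0.17929687118157744 },
}

LOOP = "loop_whileloop2"

# Assumption: starred benchmarks report time minus the empty-loop time.
def net_time(name, target)
  t = MIN_TIMES.fetch(name).fetch(target)
  name == LOOP ? t : t - MIN_TIMES.fetch(LOOP).fetch(target)
end

# Speedup of `built` relative to `trunk` (greater is better).
def speedup(name)
  net_time(name, :trunk) / net_time(name, :built)
end

MIN_TIMES.each_key do |name|
  puts format("%-16s trunk=%.3f built=%.3f speedup=%.3f",
              name, net_time(name, :trunk), net_time(name, :built), speedup(name))
end
```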
---
 benchmark/bm_vm2_streq2.rb |  6 ++++++
 compile.c                  | 20 +++++++++++++++++++-
 insns.def                  | 20 ++++++++++++++++++++
 test/ruby/test_string.rb   | 12 ++++++++----
 4 files changed, 53 insertions(+), 5 deletions(-)
 create mode 100644 benchmark/bm_vm2_streq2.rb


---Files--------------------------------
0001-optimize-yoda-literal-string.patch (6.23 KB)


-- 
https://bugs.ruby-lang.org/