[#98098] [Ruby master Feature#16824] Follow RubyGems naming conventions for the stdlib — shannonskipper@...

Issue #16824 has been reported by shan (Shannon Skipper).

14 messages 2020/05/01

[#98147] [Ruby master Feature#16832] Use #name rather than #inspect to build "uninitialized constant" error messages — jean.boussier@...

Issue #16832 has been reported by byroot (Jean Boussier).

20 messages 2020/05/06

[#98174] [Ruby master Bug#16837] Can we make Ruby 3.0 as fast as Ruby 2.7 with the new assertions? — takashikkbn@...

Issue #16837 has been reported by k0kubun (Takashi Kokubun).

10 messages 2020/05/07

[#98241] [Ruby master Bug#16845] Building Ruby with old existing system Ruby results in make error with ./tool/file2lastrev.rb — erik@...

Issue #16845 has been reported by ErikSwan (Erik Swan).

7 messages 2020/05/09

[#98256] [Ruby master Feature#16847] Cache instruction sequences by default — jean.boussier@...

Issue #16847 has been reported by byroot (Jean Boussier).

16 messages 2020/05/11

[#98257] [Ruby master Feature#16848] Allow callables in $LOAD_PATH — jean.boussier@...

Issue #16848 has been reported by byroot (Jean Boussier).

27 messages 2020/05/11

[#98318] [Ruby master Bug#16853] calling bla(hash, **kw) with a string-based hash passes the strings into **kw (worked < 2.7) — sylvain.joyeux@...4x.org

Issue #16853 has been reported by sylvain.joyeux (Sylvain Joyeux).

12 messages 2020/05/13

[#98355] [Ruby master Bug#16889] TracePoint.enable { ... } also activates the TracePoint for other threads, even outside the block — eregontp@...

Issue #16889 has been reported by Eregon (Benoit Daloze).

16 messages 2020/05/14

[#98363] [Ruby master Feature#16891] Restore Positional Argument to Keyword Conversion — merch-redmine@...

Issue #16891 has been reported by jeremyevans0 (Jeremy Evans).

23 messages 2020/05/14

[#98371] [Ruby master Feature#16894] Integer division for Ruby 3 — andrew@...

Issue #16894 has been reported by ankane (Andrew Kane).

18 messages 2020/05/15

[#98391] [Ruby master Bug#16896] MakeMakefile methods should be private — eregontp@...

Issue #16896 has been reported by Eregon (Benoit Daloze).

10 messages 2020/05/15

[#98396] [Ruby master Feature#16897] Can a Ruby 3.0 compatible general purpose memoizer be written in such a way that it matches Ruby 2 performance? — sam.saffron@...

Issue #16897 has been reported by sam.saffron (Sam Saffron).

25 messages 2020/05/16

[#98453] [Ruby master Bug#16904] rubygems: psych: superclass mismatch for class Mark (TypeError) — jaruga@...

Issue #16904 has been reported by jaruga (Jun Aruga).

18 messages 2020/05/20

[#98486] [Ruby master Bug#16908] Strange behaviour of Hash#shift when used with `default_proc`. — samuel@...

Issue #16908 has been reported by ioquatix (Samuel Williams).

14 messages 2020/05/23

[#98569] [Ruby master Bug#16921] s390x: ramdom test failures for timeout or segmentation fault — jaruga@...

Issue #16921 has been reported by jaruga (Jun Aruga).

9 messages 2020/05/29

[#98599] [Ruby master Bug#16926] Kernel#require does not load a feature twice when $LOAD_PATH has been modified spec fails only on 2.7 — eregontp@...

Issue #16926 has been reported by Eregon (Benoit Daloze).

12 messages 2020/05/31

[ruby-core:98183] [Ruby master Feature#16837] Can we make Ruby 3.0 as fast as Ruby 2.7 with the new assertions?

From: shyouhei@...
Date: 2020-05-07 09:22:17 UTC
List: ruby-core #98183
Issue #16837 has been updated by shyouhei (Shyouhei Urabe).


Some analysis of the slowdown.

Looking at the generated binary and `perf` output, the slowdown is because some functions are not inlined.  Might depend on compilers, but for me `rb_array_len()` is one of such victim:

```
zsh % gdb -batch -ex 'file miniruby' -ex 'disassemble rb_array_len'
Dump of assembler code for function rb_array_len:
   0x0000000000295540 <+0>:     push   %rbx
   0x0000000000295541 <+1>:     mov    %rdi,%rbx
   0x0000000000295544 <+4>:     test   $0x7,%bl
   0x0000000000295547 <+7>:     jne    0x2955be <rb_array_len+126>
   0x0000000000295549 <+9>:     mov    %rbx,%rax
   0x000000000029554c <+12>:    and    $0xfffffffffffffff7,%rax
   0x0000000000295550 <+16>:    je     0x2955be <rb_array_len+126>
   0x0000000000295552 <+18>:    mov    (%rbx),%rax
   0x0000000000295555 <+21>:    mov    %eax,%edx
   0x0000000000295557 <+23>:    and    $0x1f,%edx
   0x000000000029555a <+26>:    mov    $0x7,%ecx
   0x000000000029555f <+31>:    cmp    $0x7,%edx
   0x0000000000295562 <+34>:    jne    0x295585 <rb_array_len+69>
   0x0000000000295564 <+36>:    test   $0x2000,%eax
   0x0000000000295569 <+41>:    jne    0x295571 <rb_array_len+49>
   0x000000000029556b <+43>:    mov    0x10(%rbx),%rax
   0x000000000029556f <+47>:    pop    %rbx
   0x0000000000295570 <+48>:    retq
   0x0000000000295571 <+49>:    cmp    $0x7,%ecx
   0x0000000000295574 <+52>:    jne    0x2955a2 <rb_array_len+98>
   0x0000000000295576 <+54>:    test   $0x2000,%eax
   0x000000000029557b <+59>:    je     0x2955ea <rb_array_len+170>
   0x000000000029557d <+61>:    shr    $0xf,%eax
   0x0000000000295580 <+64>:    and    $0x3,%eax
   0x0000000000295583 <+67>:    pop    %rbx
   0x0000000000295584 <+68>:    retq
   0x0000000000295585 <+69>:    mov    %rbx,%rdi
   0x0000000000295588 <+72>:    mov    $0x7,%esi
   0x000000000029558d <+77>:    callq  0xcaea2 <rb_check_type>
   0x0000000000295592 <+82>:    mov    (%rbx),%rax
   0x0000000000295595 <+85>:    mov    %eax,%ecx
   0x0000000000295597 <+87>:    and    $0x1f,%ecx
   0x000000000029559a <+90>:    cmp    $0x1b,%rcx
   0x000000000029559e <+94>:    jne    0x295564 <rb_array_len+36>
   0x00000000002955a0 <+96>:    jmp    0x2955cb <rb_array_len+139>
   0x00000000002955a2 <+98>:    mov    %rbx,%rdi
   0x00000000002955a5 <+101>:   mov    $0x7,%esi
   0x00000000002955aa <+106>:   callq  0xcaea2 <rb_check_type>
   0x00000000002955af <+111>:   mov    (%rbx),%rax
   0x00000000002955b2 <+114>:   mov    %eax,%ecx
   0x00000000002955b4 <+116>:   and    $0x1f,%ecx
   0x00000000002955b7 <+119>:   cmp    $0x1b,%ecx
   0x00000000002955ba <+122>:   jne    0x295576 <rb_array_len+54>
   0x00000000002955bc <+124>:   jmp    0x2955cb <rb_array_len+139>
   0x00000000002955be <+126>:   mov    %rbx,%rdi
   0x00000000002955c1 <+129>:   mov    $0x7,%esi
   0x00000000002955c6 <+134>:   callq  0xcaea2 <rb_check_type>
   0x00000000002955cb <+139>:   lea    0x142fe(%rip),%rdi        # 0x2a98d0
   0x00000000002955d2 <+146>:   lea    0x1432f(%rip),%rdx        # 0x2a9908
   0x00000000002955d9 <+153>:   lea    0x14337(%rip),%rcx        # 0x2a9917
   0x00000000002955e0 <+160>:   mov    $0xea,%esi
   0x00000000002955e5 <+165>:   callq  0xcad86 <rb_assert_failure>
   0x00000000002955ea <+170>:   lea    0x14338(%rip),%rdi        # 0x2a9929
   0x00000000002955f1 <+177>:   lea    0x1436d(%rip),%rdx        # 0x2a9965
   0x00000000002955f8 <+184>:   lea    0x14377(%rip),%rcx        # 0x2a9976
   0x00000000002955ff <+191>:   mov    $0x79,%esi
   0x0000000000295604 <+196>:   callq  0xcad86 <rb_assert_failure>
End of assembler dump.
```

Here, assertions practically never fail.  This means jumps are 100% predicted (almost no-op).  They don't slow things.  The problem is those unreachable branches.  If you can read the assembly you see almost 2/3 of the above function just never reach.  They blow the generated binary up significantly.  `rb_array_len` is thus now considered too big to be inlined, to my compiler at least.

An obvious ad-hoc remedy is to supply `__attribute__((__always_inline__))` for everything.  But I don't think that's a good idea, because what is inlined and what is not depends very much on compilers, versions, target architectures, and almost everything.

----------------------------------------
Feature #16837: Can we make Ruby 3.0 as fast as Ruby 2.7 with the new assertions?
https://bugs.ruby-lang.org/issues/16837#change-85423

* Author: k0kubun (Takashi Kokubun)
* Status: Open
* Priority: Normal
----------------------------------------
## Problem
How can we make Ruby 3.0 as fast as (or faster than) Ruby 2.7?

### Background
* Split ruby.h https://github.com/ruby/ruby/pull/2991 added some new assertions
* While it has been helpful for revealing various bugs, it also made some Ruby programs notably slow, especially Optcarrot https://benchmark-driver.github.io/benchmarks/optcarrot/commits.html

## Possible approaches
I have no strong preference yet. Here are some random ideas:

* Optimize the assertion code somehow
* Enable the new assertions only on CIs, at least ones in hot spots
  * Not sure which places have large impact on Optcarrot yet
* Make some other not-so-important assertions CI-only to offset the impact from new ones
* Provide .so for an assertion-enabled mode? (ko1's idea)

I hope people will comment more ideas in this ticket.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread