[ruby-core:102224] [Ruby master Feature#13750] Improve String#casecmp? and Symbol#casecmp? performance with ASCII string

From: naruse@...
Date: 2021-01-24 07:12:36 UTC
List: ruby-core #102224
Issue #13750 has been updated by naruse (Yui NARUSE).


To avoid that case, you have an option around coderange. Coderange is cached information about whether the string (1) contains only 7-bit ASCII characters, (2) also has 8-bit characters, (3) contains a broken byte sequence, or (4) is unknown. Some strings have already been scanned and have their coderange cached in the string object, but others have not. The question is whether this casecmp? optimization should use the cache and skip scanning when no cache exists, or scan the string when there is no cache. If you use the cache, I wonder whether strings in real applications have the cache or not. If you scan, I wonder if it still gets faster.
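
As a rough illustration of the categories above, here is a minimal Ruby sketch. `coderange_like` is a hypothetical helper name, not an existing API. It only approximates the coderange states through the public `String#ascii_only?` and `String#valid_encoding?` methods, and the "unknown" state is not observable from Ruby code, so it is left out.

```ruby
# Hypothetical helper (not part of Ruby): approximates the coderange
# categories using public String methods only.
def coderange_like(str)
  if str.ascii_only?
    :seven_bit   # (1) only 7-bit ASCII characters
  elsif str.valid_encoding?
    :valid       # (2) valid, but also has 8-bit characters
  else
    :broken      # (3) broken byte sequence
  end
end

p coderange_like("aBcDeF")
p coderange_like("\u{e4 f6 fc}")
p coderange_like("\xff".b.force_encoding(Encoding::UTF_8))
```

Note that calling these methods may itself scan the string, which is exactly the cost in question here.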

----------------------------------------
Feature #13750: Improve String#casecmp? and Symbol#casecmp? performance with ASCII string
https://bugs.ruby-lang.org/issues/13750#change-90074

* Author: watson1978 (Shizuo Fujita)
* Status: Open
* Priority: Normal
----------------------------------------
I think String#casecmp and String#casecmp? are similar methods, but they perform differently with ASCII strings.

It seems that String#casecmp handles only ASCII strings, and it is faster than String#casecmp?.

This patch reuses the code of String#casecmp in String#casecmp? for ASCII strings. However, it introduces a minor penalty for UTF-8 strings because of the ASCII/UTF-8 detection.
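
The idea can be sketched in plain Ruby as follows. This is only an illustration of the dispatch, not the patch itself (the patch is written in C), and `casecmp_p_sketch` is a hypothetical name.

```ruby
# Hypothetical sketch of the fast-path idea: when both strings are ASCII-only,
# use the cheaper ASCII comparison; otherwise fall back to Unicode case folding.
def casecmp_p_sketch(a, b)
  if a.ascii_only? && b.ascii_only?
    a.casecmp(b) == 0   # ASCII-only case-insensitive comparison
  else
    a.casecmp?(b)       # full Unicode case folding
  end
end

p casecmp_p_sketch("aBcDeF", "ABCDEF")
p casecmp_p_sketch("aBcDeF", "abcdefg")
p casecmp_p_sketch("\u{e4 f6 fc}", "\u{c4 d6 dc}")
```

The `ascii_only?` checks correspond to the detection step that causes the small penalty for UTF-8 strings.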

~~~
String#casecmp? ASCII -> 61.3 % up
String#casecmp? UTF8  ->  1.3 % down
Symbol#casecmp? ASCII -> 80.0 % up
Symbol#casecmp? UTF8  ->  4.0 % down
~~~

### Before
~~~
Calculating -------------------------------------
      String#casecmp      5.961M (± 3.8%) i/s -     29.838M in   5.017907s
String#casecmp? ASCII
                          3.530M (± 8.6%) i/s -     17.554M in   5.034848s
String#casecmp? UTF8      1.252M (± 7.4%) i/s -      6.213M in   5.012168s
      Symbol#casecmp      8.555M (± 2.4%) i/s -     42.822M in   5.009280s
Symbol#casecmp? ASCII
                          4.235M (± 9.7%) i/s -     20.824M in   5.001368s
Symbol#casecmp? UTF8      1.329M (± 0.1%) i/s -      6.704M in   5.043725s
~~~

### After
~~~
Calculating -------------------------------------
      String#casecmp      5.984M (± 6.4%) i/s -     29.829M in   5.020331s
String#casecmp? ASCII
                          5.658M (± 1.5%) i/s -     28.308M in   5.004547s
String#casecmp? UTF8      1.215M (± 4.3%) i/s -      6.132M in   5.060292s
      Symbol#casecmp      8.651M (± 0.9%) i/s -     43.313M in   5.007215s
Symbol#casecmp? ASCII
                          7.462M (± 0.5%) i/s -     37.489M in   5.023892s
Symbol#casecmp? UTF8      1.275M (± 0.2%) i/s -      6.444M in   5.052743s
~~~


### Test code
~~~ruby
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report "String#casecmp" do |loop|
    loop.times { "aBcDeF".casecmp("abcdefg") }
  end
  x.report "String#casecmp? ASCII" do |loop|
    loop.times { "aBcDeF".casecmp?("abcdefg") }
  end
  x.report "String#casecmp? UTF8" do |loop|
    loop.times { "\u{e4 f6 fc}".casecmp?("\u{c4 d6 dc}") }
  end

  x.report "Symbol#casecmp" do |loop|
    loop.times { :aBcDeF.casecmp(:abcdefg) }
  end
  x.report "Symbol#casecmp? ASCII" do |loop|
    loop.times { :aBcDeF.casecmp?(:abcdefg) }
  end
  x.report "Symbol#casecmp? UTF8" do |loop|
    loop.times { :"\u{e4 f6 fc}".casecmp?(:"\u{c4 d6 dc}") }
  end
end
~~~

### Patch
https://github.com/ruby/ruby/pull/1668



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
