From: watson1978@...
Date: 2017-07-17T06:39:26+00:00
Subject: [ruby-core:82087] [Ruby trunk Bug#13750] Improve String#casecmp? and Symbol#casecmp? performance with ASCII string

Issue #13750 has been updated by watson1978 (Shizuo Fujita).

String#casecmp? duplicates the object in `rb_str_downcase()` on every call, so String#casecmp? is slower than String#casecmp.

----------------------------------------
Bug #13750: Improve String#casecmp? and Symbol#casecmp? performance with ASCII string
https://bugs.ruby-lang.org/issues/13750#change-65817

* Author: watson1978 (Shizuo Fujita)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v:
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
String#casecmp and String#casecmp? are similar methods, but they perform differently on ASCII strings. String#casecmp handles only ASCII case folding, yet it is faster than String#casecmp?.

This patch makes String#casecmp? use the String#casecmp code for ASCII strings. The trade-off is a small penalty for UTF-8 strings, caused by the added check that detects whether a string is ASCII or UTF-8.

~~~
String#casecmp? ASCII  -> 61.3 % up
String#casecmp? UTF8   ->  1.3 % down
Symbol#casecmp? ASCII  -> 80.0 % up
Symbol#casecmp? UTF8   ->  4.0 % down
~~~

### Before

~~~
Calculating -------------------------------------
      String#casecmp      5.961M (± 3.8%) i/s -     29.838M in   5.017907s
String#casecmp? ASCII     3.530M (± 8.6%) i/s -     17.554M in   5.034848s
 String#casecmp? UTF8     1.252M (± 7.4%) i/s -      6.213M in   5.012168s
      Symbol#casecmp      8.555M (± 2.4%) i/s -     42.822M in   5.009280s
Symbol#casecmp? ASCII     4.235M (± 9.7%) i/s -     20.824M in   5.001368s
 Symbol#casecmp? UTF8     1.329M (± 0.1%) i/s -      6.704M in   5.043725s
~~~

### After

~~~
Calculating -------------------------------------
      String#casecmp      5.984M (± 6.4%) i/s -     29.829M in   5.020331s
String#casecmp? ASCII     5.658M (± 1.5%) i/s -     28.308M in   5.004547s
 String#casecmp? UTF8     1.215M (± 4.3%) i/s -      6.132M in   5.060292s
      Symbol#casecmp      8.651M (± 0.9%) i/s -     43.313M in   5.007215s
Symbol#casecmp? ASCII     7.462M (± 0.5%) i/s -     37.489M in   5.023892s
 Symbol#casecmp? UTF8     1.275M (± 0.2%) i/s -      6.444M in   5.052743s
~~~

### Test code

~~~ruby
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report "String#casecmp" do |loop|
    loop.times { "aBcDeF".casecmp("abcdefg") }
  end

  x.report "String#casecmp? ASCII" do |loop|
    loop.times { "aBcDeF".casecmp?("abcdefg") }
  end

  x.report "String#casecmp? UTF8" do |loop|
    loop.times { "\u{e4 f6 fc}".casecmp?("\u{c4 d6 dc}") }
  end

  x.report "Symbol#casecmp" do |loop|
    loop.times { :aBcDeF.casecmp(:abcdefg) }
  end

  x.report "Symbol#casecmp? ASCII" do |loop|
    loop.times { :aBcDeF.casecmp?(:abcdefg) }
  end

  x.report "Symbol#casecmp? UTF8" do |loop|
    loop.times { :"\u{e4 f6 fc}".casecmp?(:"\u{c4 d6 dc}") }
  end
end
~~~

### Patch

https://github.com/ruby/ruby/pull/1668
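
### Sketch of the idea in Ruby

The fast path described above can be illustrated in plain Ruby. This is only a rough sketch under assumptions: the actual change lives in the C implementation (see the pull request), and `casecmp_with_ascii_fast_path?` is a hypothetical helper name that does not exist in Ruby or in the patch. It assumes that for ASCII-only strings, `casecmp(other) == 0` gives the same answer as `casecmp?(other)`.

~~~ruby
# Hypothetical helper for illustration only; not part of Ruby or the patch.
def casecmp_with_ascii_fast_path?(a, b)
  if a.ascii_only? && b.ascii_only?
    # ASCII-only input: String#casecmp compares in place,
    # without building case-folded copies of the strings.
    a.casecmp(b) == 0
  else
    # Non-ASCII input: fall back to Unicode case folding.
    a.casecmp?(b)
  end
end

puts casecmp_with_ascii_fast_path?("aBcDeF", "ABCDEF")              # true
puts casecmp_with_ascii_fast_path?("aBcDeF", "abcdefg")             # false
puts casecmp_with_ascii_fast_path?("\u{e4 f6 fc}", "\u{c4 d6 dc}")  # true
~~~

The `ascii_only?` check in this sketch corresponds to the ASCII/UTF-8 detection step that the description names as the source of the small UTF-8 penalty.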