From: s.wanabe@...
Date: 2017-02-19T01:12:25+00:00
Subject: [ruby-dev:49985] [Ruby trunk Bug#13228] s[i]=c(assigning a character) for String is slower than Array on Linux

Issue #13228 has been updated by _ wanabe.

ruby -v set to ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]

`perf` shows that ruby spent most of the time in `search_nonascii()`.

```
$ perf record ruby -ve 'n=100000; s = "a" * n; t = Time.now; n.times do |i| s[i] = "z"; end; p Time.now - t'
ruby 2.5.0dev (2017-02-18 trunk 57652) [x86_64-linux]
5.271689721
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.858 MB perf.data (21558 samples) ]
$ perf report -n --stdio|head -20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 21K of event 'cycles'
# Event count (approx.): 15739606081
#
# Overhead       Samples  Command   Shared Object      Symbol
# ........  ............  ........  .................  ..........................................
#
    96.14%         20654  ruby      ruby               [.] search_nonascii
     0.18%            45  ruby      ruby               [.] ruby_yyparse
     0.18%            38  ruby      [wl]               [k] osl_readl
     0.17%            38  ruby      ruby               [.] vm_exec_core
     0.13%            27  ruby      [kernel.kallsyms]  [k] delay_tsc
     0.07%            16  ruby      ruby               [.] rb_str_splice_0
     0.07%            15  ruby      ruby               [.] gc_page_sweep
     0.07%            15  ruby      ruby               [.] rb_enc_from_index
     0.06%            13  ruby      ruby               [.] rb_str_update
```

I wonder why this happens: the script uses only ASCII characters, and we have `RUBY_ENC_CODERANGE_7BIT`. But `rb_str_splice_0()` calls `rb_str_modify()` and clears the code-range information with `ENC_CODERANGE_CLEAR()`.

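To compare the two approaches from the original report side by side, a minimal benchmark sketch might look like the following. The method names `fill_string` and `fill_via_array` and the size `n = 20_000` are assumptions chosen for illustration; they do not come from the report. The timings will depend on the Ruby version and platform, so no expected numbers are given here.

```ruby
require "benchmark"

# NOTE: fill_string and fill_via_array are hypothetical helper names
# used only for this sketch; they are not part of Ruby or of the report.

# Assign each character directly into the String, as in the slow case.
def fill_string(n)
  s = "a" * n
  n.times { |i| s[i] = "z" }
  s
end

# Workaround from the report: split into an Array of characters,
# assign into the Array, then join back into a String.
def fill_via_array(n)
  chars = ("a" * n).split("")
  n.times { |i| chars[i] = "z" }
  chars.join("")
end

# Assumed size; kept small so the run stays short even on an affected build.
n = 20_000

Benchmark.bm(12) do |x|
  x.report("String#[]=") { fill_string(n) }
  x.report("Array#[]=")  { fill_via_array(n) }
end
```

Both helpers return the same all-"z" String, so any difference in the report is in run time only. Running the script under `perf record` as shown above should indicate whether `search_nonascii()` still dominates on a given build.
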
----------------------------------------
Bug #13228: s[i]=c(assigning a character) for String is slower than Array on Linux
https://bugs.ruby-lang.org/issues/13228#change-63029

* Author: Tsuneo Yoshioka
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
s[i]=c (assigning a character) for String is slower than for Array on Linux.
If I split the String into an Array, assign the characters, and join the Array back into a String, it is much faster than assigning characters directly to the String.
Somehow, I don't see the performance difference on Mac OS X.

~$ time ruby -e 'N=100000; s="a"*N; N.times{s[Random.rand(N)]="Z"}; puts s' >/dev/null

real 0m0.879s
user 0m0.836s
sys 0m0.012s

~$ time ruby -e 'N=100000;s="a"*N;s=s.split(""); N.times{s[Random.rand(N)]="Z"}; puts s.join("")' >/dev/null

real 0m0.153s
user 0m0.108s
sys 0m0.016s

~$ uname -a
Linux aaaaaaaa 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
~$ ruby --version
ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

--
https://bugs.ruby-lang.org/