[ruby-dev:49985] [Ruby trunk Bug#13228] s[i]=c(assigning a character) for String is slower than Array on Linux
From:
s.wanabe@...
Date:
2017-02-19 01:12:25 UTC
List:
ruby-dev #49985
Issue #13228 has been updated by _ wanabe.
ruby -v set to ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
`perf` shows that ruby spent most of its time (about 96% of the samples) in `search_nonascii()`.
```
$ perf record ruby -ve 'n=100000; s = "a" * n; t = Time.now; n.times do |i| s[i] = "z"; end; p Time.now - t'
ruby 2.5.0dev (2017-02-18 trunk 57652) [x86_64-linux]
5.271689721
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.858 MB perf.data (21558 samples) ]
$ perf report -n --stdio|head -20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 21K of event 'cycles'
# Event count (approx.): 15739606081
#
# Overhead Samples Command Shared Object Symbol
# ........ ............ ........ ................. ..........................................
#
96.14% 20654 ruby ruby [.] search_nonascii
0.18% 45 ruby ruby [.] ruby_yyparse
0.18% 38 ruby [wl] [k] osl_readl
0.17% 38 ruby ruby [.] vm_exec_core
0.13% 27 ruby [kernel.kallsyms] [k] delay_tsc
0.07% 16 ruby ruby [.] rb_str_splice_0
0.07% 15 ruby ruby [.] gc_page_sweep
0.07% 15 ruby ruby [.] rb_enc_from_index
0.06% 13 ruby ruby [.] rb_str_update
```
This is surprising, because the script uses only ASCII characters and we have `RUBY_ENC_CODERANGE_7BIT` for that case.
But `rb_str_splice_0()` calls `rb_str_modify()` and clears the code-range information with `ENC_CODERANGE_CLEAR()`, so the cached 7-bit flag is lost on every assignment.
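If the code range really is cleared and recomputed on each assignment, the total time for overwriting every character should grow roughly quadratically with the string length. The following is only a rough sketch for checking that on a given build. The helper name `overwrite_each_char` and the chosen sizes are mine, not part of the report, and on a build without this behavior the timings may simply look linear.

```ruby
require "benchmark"

# Hypothetical helper: overwrite every character of an n-character ASCII
# string via s[i] = "z" and return [elapsed_seconds, resulting_string].
def overwrite_each_char(n)
  s = "a" * n
  elapsed = Benchmark.realtime do
    n.times { |i| s[i] = "z" }
  end
  [elapsed, s]
end

# Doubling n should roughly quadruple the time if each assignment
# rescans the whole string; timings depend on the machine and build.
[10_000, 20_000, 40_000].each do |n|
  elapsed, s = overwrite_each_char(n)
  raise "unexpected result" unless s == "z" * n
  printf("n=%-6d %.4fs\n", n, elapsed)
end
```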
----------------------------------------
Bug #13228: s[i]=c(assigning a character) for String is slower than Array on Linux
https://bugs.ruby-lang.org/issues/13228#change-63029
* Author: Tsuneo Yoshioka
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
s[i]=c (assigning a character) is slower for String than for Array on Linux.
If I split the String into an Array, assign the characters, and join the Array back into a String,
it is much faster than assigning characters directly to the String.
Somehow, I don't see this performance difference on Mac OS X.
~$ time ruby -e 'N=100000; s="a"*N; N.times{s[Random.rand(N)]="Z"}; puts s' >/dev/null
real 0m0.879s
user 0m0.836s
sys 0m0.012s
~$ time ruby -e 'N=100000;s="a"*N;s=s.split(""); N.times{s[Random.rand(N)]="Z"}; puts s.join("")' >/dev/null
real 0m0.153s
user 0m0.108s
sys 0m0.016s
~$ uname -a
Linux aaaaaaaa 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
~$ ruby --version
ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial
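
The two one-liners above can also be compared inside a single script, which takes interpreter start-up and output out of the measurement. This is a minimal sketch under my own assumptions: the method names and the fixed seed are hypothetical and are not from the original report, and the timings it prints will vary by platform and Ruby version.

```ruby
require "benchmark"

N = 100_000

# Direct approach from the report: assign into the String itself.
def assign_into_string(n, rng)
  s = "a" * n
  n.times { s[rng.rand(n)] = "Z" }
  s
end

# Workaround from the report: split into an Array of characters,
# assign into the Array, then join back into a String.
def assign_via_array(n, rng)
  chars = ("a" * n).split("")
  n.times { chars[rng.rand(n)] = "Z" }
  chars.join
end

# Use the same seed for both runs so they overwrite the same positions.
direct = via_array = nil
t_string = Benchmark.realtime { direct = assign_into_string(N, Random.new(42)) }
t_array  = Benchmark.realtime { via_array = assign_via_array(N, Random.new(42)) }

raise "results differ" unless direct == via_array
printf("String#[]=: %.3fs\nArray#[]=:  %.3fs\n", t_string, t_array)
```

Seeding both runs identically makes it possible to check that the two approaches produce the same string before comparing their timings.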
--
https://bugs.ruby-lang.org/