[ruby-core:94171] [Ruby master Bug#16049] optimization for frozen dynamic string literals "#{exp}".dup and +"#{exp}"
From: eregontp@...
Date: 2019-08-07 10:13:08 UTC
List: ruby-core #94171
Issue #16049 has been updated by Eregon (Benoit Daloze).
Part of the explanation:
String#freeze never allocates a new String; it just freezes the receiver in place.
String#-@ deduplicates (interns) the String. To do so, it needs to return a new String instance, unless the receiver is already an interned String or cannot be interned (e.g. because it has instance variables).
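The difference can be observed through object identity. The following is a minimal sketch, not taken from the issue: the variable names are illustrative, and the strings are built with `String.new` so the result does not depend on the `frozen_string_literal` setting.

```ruby
# String#freeze: the receiver itself is frozen and returned.
s = String.new("hello")
frozen = s.freeze
puts frozen.equal?(s)               # true: same object, nothing allocated

# String#-@: an interned, frozen String is returned instead.
a = String.new("world")
b = String.new("world")
interned_a = -a
interned_b = -b

puts a.frozen?                      # false: the receiver is left untouched
puts interned_a.frozen?             # true
puts interned_a.equal?(a)           # false: a different object is returned
puts interned_a.equal?(interned_b)  # true: equal contents share one object
```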
----------------------------------------
Bug #16049: optimization for frozen dynamic string literals "#{exp}".dup and +"#{exp}"
https://bugs.ruby-lang.org/issues/16049#change-80427
* Author: Dan0042 (Daniel DeLorme)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.7.0dev (2019-04-22 trunk 67701) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
When the decision was made that `frozen_string_literal: true` should also apply to dynamic string literals, it was mitigated with the following explanation:
> "#{exp}".dup can be optimized so it won’t allocate extra objects like "...".freeze
https://docs.google.com/document/u/1/d/1D0Eo5N7NE_unIySOKG9lVj_eyXf66BQPM4PKp7NvMyQ/pub
However, that does not appear to be the case currently.
Using this script, which generates 100k String objects:
```ruby
# frozen_string_literal: true
def allocated
  GC.stat[:total_allocated_objects]
end
GC.disable
c = ARGV.shift.to_sym
x_eq_i = ARGV.shift == "i"
x = "x"
before = allocated
100_000.times do |i|
  x = i.to_s if x_eq_i
  case c
  when :normal then v = "#{i}"
  when :freeze then v = "#{i}".freeze
  when :dup    then v = "#{i}".dup
  when :plus   then v = +"#{i}"
  when :minus  then v = -"#{i}"
  else raise
  end
end
after = allocated
printf "%d\n", after - before
```
I get the following number of allocated objects:
```
x=   frozen_string_literal  normal  freeze  dup     plus    minus
'x'  false                  200021  200021  300021  200021  300021
'x'  true                   200021  200021  300021  300021  200021
i    false                  300021  300021  400021  300021  400021
i    true                   300021  300021  400021  400021  300021
```
Given that I create 100k strings in that loop, I have no idea why the object count increases by 200k. I'm going to hope/assume there is some reason for that.
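One way to look into the 200k figure is to count allocations for the pieces separately. The sketch below is an assumption about how one might investigate, not part of the original script: `count_allocations` is a hypothetical helper, and the loop size is arbitrary. The numbers depend on the Ruby version, so no expected output is given. If the interpolation count comes out at roughly twice the `to_s` count, that would suggest the intermediate string produced by the implicit `to_s` is counted in addition to the final string.

```ruby
# Hypothetical helper (not in the original script): returns how many
# objects were allocated while the given block ran.
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

n = 1_000
to_s_only     = count_allocations { n.times { |i| i.to_s } }
interpolation = count_allocations { n.times { |i| "#{i}" } }

# Single quotes keep the label literal instead of interpolating it.
printf "%-8s %d\n", 'i.to_s', to_s_only
printf "%-8s %d\n", '"#{i}"', interpolation
```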
But we can also see that `"#{i}".dup` allocates an extra object per iteration in every case, and `+"#{i}"` does so when `frozen_string_literal` is true.
We also see that `-"#{i}"` does not get the same optimization as `"#{i}".freeze` when `frozen_string_literal` is false.
I also tested with `x = i.to_s` to see whether deduplicating 100k identical strings differs from deduplicating 100k different strings. According to the results above, it makes no difference; we only get the extra 100k strings created by `i.to_s`. But if I change the script to measure memory instead of object allocations:
```ruby
def allocated
  kb = `ps -p#{$$} -orss`[/\d+/].to_i
  kb -= GC.stat[:heap_free_slots]*40/1024
  (kb / 1024.0).round
end
...
130_000.times do |i|
...
```
I get the following memory usage in MiB:
```
x=   frozen_string_literal  normal  freeze  dup  plus  minus
'x'  false                  10      10      15   10    20*
'x'  true                   10      10      15   15    10
i    false                  15      15      20   15    25*
i    true                   15      15      20   20    15
```
This is proportional to the previous numbers, except for the values marked with an asterisk, which are another mystery to me.
Summary:
I expected `"#{v}".dup` and `+"#{v}"` to behave the same regardless of `frozen_string_literal` (and to optimize down to just one allocation).
I expected `"#{v}".freeze` and `-"#{v}"` to behave the same regardless of `frozen_string_literal` (and to optimize down to just one allocation).
But they do not. I think they should.
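For context, the observable results of each pair already agree; only the number of allocations differs. A minimal sketch of that equivalence, with illustrative variable names that are not part of the issue:

```ruby
v = 42

mutable_a = "#{v}".dup     # always a fresh, unfrozen copy
mutable_b = +"#{v}"        # unfrozen: the receiver itself, or a copy if it was frozen
frozen_a  = "#{v}".freeze  # the receiver, frozen in place
frozen_b  = -"#{v}"        # a frozen, interned String

puts mutable_a.frozen?     # false
puts mutable_b.frozen?     # false
puts frozen_a.frozen?      # true
puts frozen_b.frozen?      # true
puts mutable_a == mutable_b && frozen_a == frozen_b  # true
```

These results hold with or without the `frozen_string_literal: true` magic comment.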