[ruby-core:95002] [Ruby master Bug#16121] Stop making a redundant hash copy in Hash#dup
From:
ko1@...
Date:
2019-09-20 08:35:05 UTC
List:
ruby-core #95002
Issue #16121 has been updated by ko1 (Koichi Sasada).
I found that
> ` rb_define_method(rb_cArray, "initialize_copy", rb_ary_replace, 1);`
`Array#initialize_copy` == `Array#replace`. Can we use same technique?
----------------------------------------
Bug #16121: Stop making a redundant hash copy in Hash#dup
https://bugs.ruby-lang.org/issues/16121#change-81630
* Author: dylants (Dylan Thacker-Smith)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version:
* ruby -v: ruby 2.7.0dev (2019-08-23T16:41:09Z master b38ab0a3a9) [x86_64-darwin18]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
## Problem
I noticed while profiling object allocations that Hash#dup was allocating 2 objects instead of only 1 as expected. I looked for alternatives for comparison and found that `Hash[hash]` created a copy with only a single object allocation and seemed to be more than twice as fast. Reading the source code revealed the difference was that Hash#dup creates a copy of the Hash, then rehashes the copy. However, rehashing is done by making a copy of the hash, so the first copy before rehashing was unnecessary.
## Solution
I changed the code to just use rehashing to make the copy of the hash to improve performance while also preserving the existing behaviour.
## Benchmark
```ruby
require 'benchmark'
N = 100000
def report(x, name)
x.report(name) do
N.times do
yield
end
end
end
hashes = {
small_hash: { a: 1 },
larger_hash: 20.times.map { |i| [('a'.ord + i).chr.to_sym, i] }.to_h
}
Benchmark.bmbm do |x|
hashes.each do |name, hash|
report(x, "#{name}.dup") do
hash.dup
end
end
end
```
results on master
```
user system total real
small_hash.dup 0.401350 0.001638 0.402988 ( 0.404608)
larger_hash.dup 7.218548 0.433616 7.652164 ( 7.695990)
```
results with the attached patch
```
user system total real
small_hash.dup 0.336733 0.002425 0.339158 ( 0.341760)
larger_hash.dup 6.617343 0.398407 7.015750 ( 7.070282)
```
---Files--------------------------------
0001-Remove-redundant-Check_Type-after-to_hash.diff.txt (624 Bytes)
0002-Fix-freeing-and-clearing-destination-hash-in-Hash.diff.txt (1.57 KB)
0003-Remove-dead-code-paths-in-rb_hash_initialize_copy.diff.txt (1.12 KB)
0004-Stop-making-a-redundant-hash-copy-in-Hash-dup.diff.txt (1.35 KB)
--
https://bugs.ruby-lang.org/
Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>