[ruby-core:95221] [Ruby master Bug#16237] Set#union performance issue
From:
lionel.perrin@...
Date:
2019-10-04 12:17:39 UTC
List:
ruby-core #95221
Issue #16237 has been reported by lionelperrin (Lionel Perrin).
----------------------------------------
Bug #16237: Set#union performance issue
https://bugs.ruby-lang.org/issues/16237
* Author: lionelperrin (Lionel Perrin)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
I've discovered a very large difference in performances between Set#union and the use of a hash to compute this union.
The issue can be shown with the code below.
Here are the benchmark results:
```
user system total real
hash 0.562500 0.078125 0.640625 ( 0.639528)
uniq 54.375000 0.312500 54.687500 ( 54.772007)
set 37.312500 0.125000 37.437500 ( 37.474751)
mix1 43.718750 0.000000 43.718750 ( 43.890027)
mix2 37.109375 0.000000 37.109375 ( 37.190839)
```
The code example:
```
require 'benchmark'
require 'set'
REPEAT = 1_000
$DISTINCT_COUNT = 100_000_000
def random_array(size=1_000)
Array.new(size) { Random.rand(1..$DISTINCT_COUNT) }
end
def uniq
ret = []
REPEAT.times do
ret += random_array
ret.uniq!
end
ret
end
def set
ret = Set.new
REPEAT.times do
ret += Set.new(random_array)
end
ret
end
def mix1
ret = Set.new
REPEAT.times do
ret += random_array
end
ret
end
def mix2
ret = Set.new
REPEAT.times do
ret += random_array.uniq
end
ret
end
def hash
ret = {}
REPEAT.times do
random_array.each { |v| ret[v] = nil }
end
ret.keys
end
Benchmark.bm do |x|
%i(hash uniq set mix1 mix2).each do |i|
x.report(i, &method(i))
end
end
```
--
https://bugs.ruby-lang.org/
Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>