From: lionel.perrin@... Date: 2019-10-04T12:17:39+00:00 Subject: [ruby-core:95221] [Ruby master Bug#16237] Set#union performance issue Issue #16237 has been reported by lionelperrin (Lionel Perrin). ---------------------------------------- Bug #16237: Set#union performance issue https://bugs.ruby-lang.org/issues/16237 * Author: lionelperrin (Lionel Perrin) * Status: Open * Priority: Normal * Assignee: * Target version: * ruby -v: ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-linux] * Backport: 2.5: UNKNOWN, 2.6: UNKNOWN ---------------------------------------- I've discovered a very large difference in performances between Set#union and the use of a hash to compute this union. The issue can be shown with the code below. Here are the benchmark results: ``` user system total real hash 0.562500 0.078125 0.640625 ( 0.639528) uniq 54.375000 0.312500 54.687500 ( 54.772007) set 37.312500 0.125000 37.437500 ( 37.474751) mix1 43.718750 0.000000 43.718750 ( 43.890027) mix2 37.109375 0.000000 37.109375 ( 37.190839) ``` The code example: ``` require 'benchmark' require 'set' REPEAT = 1_000 $DISTINCT_COUNT = 100_000_000 def random_array(size=1_000) Array.new(size) { Random.rand(1..$DISTINCT_COUNT) } end def uniq ret = [] REPEAT.times do ret += random_array ret.uniq! end ret end def set ret = Set.new REPEAT.times do ret += Set.new(random_array) end ret end def mix1 ret = Set.new REPEAT.times do ret += random_array end ret end def mix2 ret = Set.new REPEAT.times do ret += random_array.uniq end ret end def hash ret = {} REPEAT.times do random_array.each { |v| ret[v] = nil } end ret.keys end Benchmark.bm do |x| %i(hash uniq set mix1 mix2).each do |i| x.report(i, &method(i)) end end ``` -- https://bugs.ruby-lang.org/ Unsubscribe: