[ruby-core:102141] [Ruby master Bug#17497] Ractor performance issue
From:
keithrbennett@...
Date:
2021-01-18 18:11:43 UTC
List:
ruby-core #102141
Issue #17497 has been updated by keithrbennett (Keith Bennett).
I too have seen strange results testing ractors. I used the code at https://github.com/keithrbennett/keithrbennett-ractor-test/blob/master/my_ractor.rb to do some arbitrary but predictable work. I have a 24-core Ryzen 9 CPU, and I compared using 1 ractor with using 24. With 24, htop reported that all the CPUs were at 100% most of the time, yet the elapsed time using 24 CPUs was only about 20% less than when using 1 CPU (33:48 versus 42:03). Also, the CPUs collectively consumed about nine times as much user CPU time with 24 ractors as with 1. Here is the program output:
```
1 CPU:
Many HTOP readings are < 100% for all CPU's
time ractor/my_ractor.rb ruby '*.rb'
Running the following command to find all filespecs to process: find -L ruby -type f -name '*.rb' -print
Processing 8218 files in 1 slices, whose sizes are:
[8218]
ractor/my_ractor.rb ruby '*.rb' 2513.90s user 6.75s system 99% cpu 42:03.01 total
24 CPU's:
% time ractor/my_ractor.rb ruby '*.rb' ; espeak finished
Running the following command to find all filespecs to process: find -L ruby -type f -name '*.rb' -print
Processing 8218 files in 24 slices, whose sizes are:
[343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 343, 329]
ractor/my_ractor.rb ruby '*.rb' 22986.42s user 14.98s system 1134% cpu 33:47.96 total
```
(In the command, `ruby` refers to the directory in which I've cloned the GitHub Ruby repo.)
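The two `time` lines above can be compared directly. The following is a minimal sketch of that arithmetic, using only the figures printed in the output; the helper name `elapsed_seconds` and the variable names are illustrative and not part of the test program.

```ruby
# Figures copied from the two `time` outputs above.
# Elapsed times are "minutes:seconds"; user times are CPU seconds.
def elapsed_seconds(mm_ss)
  minutes, seconds = mm_ss.split(':')
  minutes.to_i * 60 + seconds.to_f
end

one_ractor   = { user: 2513.90,  elapsed: elapsed_seconds('42:03.01') }
many_ractors = { user: 22986.42, elapsed: elapsed_seconds('33:47.96') }

# How much faster the 24-ractor run finished (wall clock).
speedup = one_ractor[:elapsed] / many_ractors[:elapsed]

# Percentage of wall-clock time saved by using 24 ractors.
time_saved_pct = (1 - many_ractors[:elapsed] / one_ractor[:elapsed]) * 100

# How much more user CPU time the 24-ractor run consumed.
cpu_ratio = many_ractors[:user] / one_ractor[:user]

puts format('Wall-clock speedup:   %.2fx', speedup)
puts format('Wall-clock time saved: %.1f%%', time_saved_pct)
puts format('User CPU time ratio:  %.2fx', cpu_ratio)
```

With perfect scaling, the speedup would approach 24x and the CPU time ratio would stay near 1x.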
Here is the current content of the test program:
```
#!/usr/bin/env ruby

require 'amazing_print'
require 'etc'
require 'set'
require 'shellwords'
require 'yaml'

raise "This script requires Ruby version 3 or later." unless RUBY_VERSION.split('.').first.to_i >= 3

# An instance of this parser class is created for each ractor.
class RactorParser

  attr_reader :dictionary_words

  def initialize(dictionary_words)
    @dictionary_words = dictionary_words
  end

  def parse(filespecs)
    filespecs.inject(Set.new) do |found_words, filespec|
      found_words | process_one_file(filespec)
    end
  end

  private def word?(string)
    dictionary_words.include?(string)
  end

  private def strip_punctuation(string)
    punctuation_regex = /[[:punct:]]/
    string.gsub(punctuation_regex, ' ')
  end

  private def file_lines(filespec)
    command = "strings #{Shellwords.escape(filespec)}"
    text = `#{command}`
    strip_punctuation(text).split("\n")
  end

  private def line_words(line)
    line.split.map(&:downcase).select { |text| word?(text) }
  end

  private def process_one_file(filespec)
    file_words = Set.new
    file_lines(filespec).each do |line|
      line_words(line).each { |word| file_words << word }
    end
    # puts "Found #{file_words.count} words in #{filespec}."
    file_words
  end
end

class Main

  BASEDIR = ARGV[0] || '.'
  FILEMASK = ARGV[1]
  CPU_COUNT = Etc.nprocessors

  def call
    check_arg_count
    slices = get_filespec_slices
    ractors = create_and_populate_ractors(slices)
    all_words = collate_ractor_results(ractors)
    yaml = all_words.to_a.sort.to_yaml
    File.write('ractor-words.yaml', yaml)
    puts "Words are in ractor-words.yaml."
  end

  private def check_arg_count
    if ARGV.length > 2
      puts "Syntax is ractor [base_directory] [filemask], and filemask must be quoted so that the shell does not expand it."
      exit -1
    end
  end

  private def collate_ractor_results(ractors)
    ractors.inject(Set.new) do |all_words, ractor|
      all_words | ractor.take
    end
  end

  private def get_filespec_slices
    all_filespecs = find_all_filespecs
    slice_size = (all_filespecs.size / CPU_COUNT) + 1
    # slice_size = all_filespecs.size # use this line instead of previous to test with 1 ractor
    slices = all_filespecs.each_slice(slice_size).to_a
    puts "Processing #{all_filespecs.size} files in #{slices.size} slices, whose sizes are:\n#{slices.map(&:size).inspect}"
    slices
  end

  private def create_and_populate_ractors(slices)
    words = File.readlines('/usr/share/dict/words').map(&:chomp).map(&:downcase).sort
    slices.map do |slice|
      ractor = Ractor.new do
        filespecs = Ractor.receive
        dictionary_words = Ractor.receive
        RactorParser.new(dictionary_words).parse(filespecs)
      end
      ractor.send(slice)
      ractor.send(words)
      ractor
    end
  end

  private def find_all_filespecs
    filemask = FILEMASK ? %Q{-name '#{FILEMASK}'} : ''
    command = "find -L #{BASEDIR} -type f #{filemask} -print"
    puts "Running the following command to find all filespecs to process: #{command}"
    `#{command}`.split("\n")
  end
end

Main.new.call
```
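The slice sizes printed in the output follow from the arithmetic in `get_filespec_slices`. Here is a small sketch that reproduces it in isolation, assuming the 8218 files and 24 processors reported above; the integer range is only a placeholder for the real filespecs.

```ruby
# Reproduces the slice-size arithmetic from get_filespec_slices.
file_count = 8218   # file count reported in the output above
cpu_count  = 24     # assumed value of Etc.nprocessors on the 24-core machine

# Integer division, then + 1: 8218 / 24 = 342, so each slice holds up to 343 files.
slice_size = (file_count / cpu_count) + 1

# A range of integers stands in for the list of filespecs.
slices = (1..file_count).each_slice(slice_size).to_a

puts "slice size:  #{slice_size}"
puts "slice count: #{slices.size}"
puts "sizes:       #{slices.map(&:size).tally.inspect}"
```

This gives 23 slices of 343 files and a final slice of 329, matching the output. One side effect of the `+ 1` is that when the file count is an exact multiple of the processor count, fewer slices than processors are produced (for example, 48 files on 24 processors gives 16 slices of 3).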
----------------------------------------
Bug #17497: Ractor performance issue
https://bugs.ruby-lang.org/issues/17497#change-89993
* Author: marcandre (Marc-Andre Lafortune)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin18]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
There's a strange performance issue with Ractor (at least on macOS; I didn't run this on any other OS).
I ran a benchmark doing 3 different types of work:
* "fib": method calls (naive fibonacci calculation)
* "cpu": `(0...1000).inject(:+)`
* "sleep": call `sleep`
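For readers without the gist at hand, the three workloads could look roughly like the sketch below. This is not the benchmark script itself: the `fib` argument, the sleep duration, and the `WORKLOADS` and `elapsed_ms` names are assumptions chosen for illustration, and each workload is run once, serially, without any Ractor or Thread.

```ruby
# Hypothetical stand-ins for the three workloads; the real benchmark is in the gist.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

WORKLOADS = {
  fib:   -> { fib(20) },                 # method calls (argument 20 is an assumption)
  cpu:   -> { (0...1000).inject(:+) },   # the expression quoted in the list above
  sleep: -> { sleep(0.01) }              # duration is an assumption
}.freeze

# Runs the block and returns the elapsed wall-clock time in milliseconds.
def elapsed_ms
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000).round
end

timings = WORKLOADS.transform_values { |work| elapsed_ms { work.call } }
puts timings.map { |name, ms| "#{name}: #{ms} ms" }.join(' | ')
```

The output line uses the same `name: N ms | ...` layout as the results shown in this report, so the numbers are easy to compare by eye.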
I get the kind of results I was expecting for the `fib` and `sleep` workloads, but the results for the "cpu" workload show a problem.
It is so slow that my pure Ruby backport (using Threads) is 65x faster on my Mac Pro (despite having 6 cores). The expected result would be for the backport to be 6x slower, so in that case Ractor is about 400x slower than it should be.
On my MacBook (2 cores) the results are not as bad: the `cpu` workload is only 3x faster with my pure-Ruby backport instead of ~2x slower, so Ractor is about 6x too slow.
```
$ gem install backports
Successfully installed backports-3.20.0
1 gem installed
$ ruby ractor_test.rb
<internal:ractor>:267: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.
fib: 110 ms | cpu: 22900 ms | sleep: 206 ms
$ B=t ruby ractor_test.rb
Using pure Ruby implementation
fib: 652 ms | cpu: 337 ms | sleep: 209 ms
```
Notice that the `sleep` run takes a similar time, which is good, and `fib` is ~6x faster with Ractor on my 6-core CPU (and ~2x faster on my 2-core MacBook). Again, that's good, as the pure Ruby version uses Threads and thus runs with a single GVL.
The `cpu` version is the problem.
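The ratios quoted in this report can be recomputed from the two result lines. The sketch below uses only the millisecond timings printed above and the 6-core count; the variable names are illustrative.

```ruby
# Timings in milliseconds, copied from the two benchmark runs above.
ractor    = { fib: 110, cpu: 22_900, sleep: 206 }
pure_ruby = { fib: 652, cpu: 337,    sleep: 209 }
cores     = 6   # Mac Pro core count stated in the report

# A ratio above 1 means the Ractor run was faster than the pure-Ruby backport.
ractor.each_key do |name|
  ratio = pure_ruby[name].to_f / ractor[name]
  puts format('%-5s pure-Ruby time / Ractor time = %.2f', name, ratio)
end

# For the cpu workload, Ractor is slower: how many times slower?
cpu_slowdown = ractor[:cpu].to_f / pure_ruby[:cpu]

# If Ractor had instead been `cores` times faster, the gap to that ideal is:
shortfall = cpu_slowdown * cores

puts format('cpu: Ractor is %.0fx slower than the backport', cpu_slowdown)
puts format('cpu: about %.0fx slower than ideal %d-core scaling', shortfall, cores)
```

These come out close to the rounded 65x and 400x figures given earlier, and the `fib` ratio of roughly 5.9 matches the ~6x speedup on 6 cores.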
Script is here: https://gist.github.com/marcandre/bfed626e538a3d0fc7cad38dc026cf0e
--
https://bugs.ruby-lang.org/