From: "byroot (Jean Boussier) via ruby-core" <ruby-core@...>
Date: 2023-03-04T09:44:55+00:00
Subject: [ruby-core:112687] [Ruby master Bug#19438] Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark

Issue #19438 has been updated by byroot (Jean Boussier).


> If you agree that that is what is being benchmarked, one thing we can do is change the benchmark to use tally instead of manually incrementing the count.

This benchmark was popularized by the "benchmark-game". If the goal is to look better on that benchmark (I personally don't care for it), then yes, that is a solution.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/knucleotide.html#knucleotide

However, I think the interesting part of this issue would be to figure out why that specific implementation got slower over time.

It would be interesting to break the algorithm down into smaller pieces and benchmark them independently to see exactly which operation got slower.
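
As a rough sketch of what such a breakdown could look like: the counting loop can be split into substring slicing and hash increments, with `tally` alongside for comparison. This is not the benchmark's actual code. The random input string, the frame size, and the helper name `count_manually` are all made up for illustration.

```ruby
require 'benchmark'

# Hypothetical input: a random nucleotide string, NOT the benchmark's real data.
rng = Random.new(42)
SEQ = Array.new(100_000) { "acgt"[rng.rand(4)] }.join
FRAME = 4                                   # assumed substring length
NS = SEQ.length + 1 - FRAME                 # number of substrings of that length
KEYS = Array.new(NS) { |i| SEQ[i, FRAME] }  # substrings precomputed for the hash-only cases

# Count every substring of length `frame` by incrementing a hash entry.
def count_manually(seq, frame)
  table = Hash.new(0)
  (0..seq.length - frame).each { |i| table[seq[i, frame]] += 1 }
  table
end

Benchmark.bm(22) do |x|
  # Only allocate the substrings, no hash involved.
  x.report("substring slicing") { NS.times { |i| SEQ[i, FRAME] } }
  # Only increment hash entries, using the precomputed substrings.
  x.report("hash increment") do
    table = Hash.new(0)
    KEYS.each { |k| table[k] += 1 }
  end
  # Both steps together, as in a manual counting loop.
  x.report("slicing + increment") { count_manually(SEQ, FRAME) }
  # The alternative suggested in the quoted comment.
  x.report("Array#tally") { KEYS.tally }
end
```

Running a script like this under each Ruby version could help show whether slicing, hash updates, or something else accounts for the difference.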

----------------------------------------
Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark
https://bugs.ruby-lang.org/issues/19438#change-102143

* Author: nick.schwaderer (Nicholas Schwaderer)
* Status: Open
* Priority: Normal
* ruby -v: 3.2.0
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
## Introduction

Recently I was going through some of the old benchmarks in the [Ruby Great Implementation Shootout](https://programmingzen.com/the-great-ruby-shootout-july-2010/) from around 2010.

As an experiment, one night I ran the benchmarks against Ruby 3.2.0, Ruby 3.2.0 --yjit, TruffleRuby, TruffleRuby +GraalVM, and Ruby 2.6.10.

Most results were as expected. However, there was one benchmark on which Ruby 2.6.10 _consistently_ outperformed all the newer Rubies.

## Method

After pairing with @eightbitraptor, we discovered that this old benchmark was remarkably similar to an existing benchmark in the `/benchmark` directory, [so_k_nucleotide.yml](https://github.com/ruby/ruby/blob/master/benchmark/so_k_nucleotide.yml). We decided to go with that benchmark. For brevity I have not included the full 150 lines of the benchmark here.

I ran this benchmark 100 times using `benchmark-driver` against Ruby 2.7, 3.0, 3.1, and 3.2. (I had discovered that 2.7 was even faster than 2.6.)

It appears that about half of the regression occurred from 2.7 -> 3.0, and the other half from 3.0 -> 3.2. One other interesting finding is that each minor version appears to regress from the previous one, if only slightly.

## Code

This is my benchmark-running code and harness. [The full code and data can be found here](https://gist.github.com/Schwad/16edf3d7cc5316af4baf23497f3c6a8f).

```ruby
require 'csv'

RUNS = 100

# Map each Ruby version to the list of results collected across all runs.
results = Hash.new { |h, k| h[k] = [] }
RUNS.times do |i|
  puts i # progress indicator
  run = `benchmark-driver so_k_nucleotide.yml --chruby '2.7.5;3.0.5;3.1.3;3.2.0' -o simple`
  # Pair the n-th version string in the output with the n-th d.ddd figure.
  values = run.scan(/\d\.\d\d\d/)
  run.scan(/\d\.\d\.\d/).each_with_index do |version, index|
    results[version] << values[index]
  end
end

# One column per Ruby version, one row per run.
columns = results.keys
outdata = CSV.generate do |csv|
  csv << columns
  RUNS.times do |i|
    csv << columns.map { |c| results[c][i] }
  end
end

File.write("output.csv", outdata)
```
## Data

Ruby 2.7.5 was consistently ~18-20% faster than Ruby 3.2.0 in this benchmark.
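
A comparison like this could be derived from `output.csv` roughly as follows. This is only a sketch: the helper names `column_means` and `percent_vs` are hypothetical, and it assumes the CSV layout written by the harness above (a header row of versions, one row per run). Whether a higher number is better depends on the metric `benchmark-driver` reports, so the sign of the percentage has to be read accordingly.

```ruby
require 'csv'

# Mean of each column in a CSV whose header row names the Ruby versions.
def column_means(csv_text)
  table = CSV.parse(csv_text, headers: true)
  table.headers.to_h do |version|
    values = table[version].compact.map(&:to_f)
    [version, values.sum(0.0) / values.size]
  end
end

# Each version's mean relative to the baseline version, in percent.
def percent_vs(means, baseline)
  base = means.fetch(baseline)
  means.transform_values { |mean| ((mean / base - 1.0) * 100).round(1) }
end

# Only runs when the harness output is present in the working directory.
if File.exist?("output.csv")
  means = column_means(File.read("output.csv"))
  p means
  p percent_vs(means, means.keys.first) # first column taken as the baseline
end
```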

![Screenshot 2023-02-15 at 13 16 10](https://user-images.githubusercontent.com/7865030/219038430-4a124cc6-0d23-46e2-9794-d89d1f26e227.png)

## Next Steps

I am happy to help investigate or learn more about this regression if anyone has any ideas. 


-- 
https://bugs.ruby-lang.org/