[ruby-core:94504] [Ruby master Feature#16119] Optimize Array#flatten and flatten! for already flattened arrays

From: dylan.smith@...
Date: 2019-08-23 15:32:44 UTC
List: ruby-core #94504
Issue #16119 has been updated by dylants (Dylan Thacker-Smith).


duerst (Martin Dst) wrote:
> However, I expect this to be slower on arrays that are 'almost flat', i.e.
> ```
> almost_flat_ary = 100.times.to_a + [1, 2]
> ```
> Putting the nested array at the end will make sure all elements of the big array are checked, only to discover that actual flattening work is needed. The time needed for checking will not be offset by the allocation savings on the small array at the end.

The optimization handles this case by doing a quick `ary_memcpy` of everything up to the first nested array to avoid redundantly rechecking if those preceding elements were an array.

I benchmarked it by replacing `arrays` and `Benchmark.bmbm` in my above bechmark script with 

```ruby
ary = 100.times.to_a.push([101, 102])

Benchmark.bmbm do |x|
  report(x, "flatten!") do
    ary.flatten!
  end
  report(x, "flatten") do
    ary.flatten
  end
end
```

and got the following results on master

```
               user     system      total        real
flatten!   0.577976   0.002275   0.580251 (  0.582700)
flatten    0.542454   0.001994   0.544448 (  0.546523)
```

and the following results with this patch

```
               user     system      total        real
flatten!   0.420922   0.001052   0.421974 (  0.422957)
flatten    0.430728   0.001173   0.431901 (  0.433274)
```

so I actually noticed improved performance for arrays with a large flat prefix.

----------------------------------------
Feature #16119: Optimize Array#flatten and flatten! for already flattened arrays
https://bugs.ruby-lang.org/issues/16119#change-80938

* Author: dylants (Dylan Thacker-Smith)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
## Problem

When doing an object profile from stackprof, I noticed object allocations being made from `Array#flatten!` which was unlike other in-place methods like `Array#uniq!` and `Array#compact!`.  In this case, I wanted to optimize for the array already being flattened and found that `ary.flatten! if ary.any?(Array)` was significantly faster.  The code confirmed my suspicion that `ary.flatten!` was almost equivalent to `ary.replace(ary.flatten)` in implementation. The object allocations I noticed were from a temporary result array, a stack array to handle nesting and a hash table to prevent infinite recursion, all of which can be avoided in the simple case of an already flattened array.

## Solution

Iterate over the array to find the first nested array.  If no nested array is found, then `flatten!` just returns `nil` without creating additional objects and `flatten` returns a shared copy of the original array.  If a nested array is found, then it creates and initializes the temporary objects to resume with the existing algorithm for flattening the array.

## Benchmark

```ruby
require 'benchmark'

N = 100000

def report(x, name)
  x.report(name) do
    N.times do
      yield
    end
  end
end

arrays = {
  small_flat_ary: 5.times.to_a,
  large_flat_ary: 100.times.to_a,
  small_pairs_ary: [[1, 2]] * 5,
  large_pairs_ary: [[1, 2]] * 100,
}

Benchmark.bmbm do |x|
  arrays.each do |name, ary|
    report(x, "#{name}.flatten!") do
      ary.flatten!
    end
    report(x, "#{name}.flatten") do
      ary.flatten
    end
  end
end
```

results on the latest master (`ruby 2.7.0dev (2019-08-22T14:10:55Z master fd20b32130) [x86_64-darwin18]`)

```
                               user     system      total        real
small_flat_ary.flatten!    0.082001   0.000294   0.082295 (  0.082685)
small_flat_ary.flatten     0.078655   0.000211   0.078866 (  0.079176)
large_flat_ary.flatten!    0.552551   0.001643   0.554194 (  0.556166)
large_flat_ary.flatten     0.551520   0.001327   0.552847 (  0.554091)
small_pairs_ary.flatten!   0.100073   0.000302   0.100375 (  0.100687)
small_pairs_ary.flatten    0.109440   0.000232   0.109672 (  0.109850)
large_pairs_ary.flatten!   1.021200   0.001650   1.022850 (  1.024227)
large_pairs_ary.flatten    1.049046   0.002938   1.051984 (  1.055228)
```

results with the attached patch

```
                               user     system      total        real
small_flat_ary.flatten!    0.034868   0.000150   0.035018 (  0.035180)
small_flat_ary.flatten     0.040504   0.000148   0.040652 (  0.040914)
large_flat_ary.flatten!    0.458273   0.000786   0.459059 (  0.460005)
large_flat_ary.flatten     0.453846   0.000833   0.454679 (  0.455324)
small_pairs_ary.flatten!   0.055211   0.000205   0.055416 (  0.055673)
small_pairs_ary.flatten    0.060157   0.000226   0.060383 (  0.060540)
large_pairs_ary.flatten!   0.901698   0.001650   0.903348 (  0.905246)
large_pairs_ary.flatten    0.917180   0.001370   0.918550 (  0.920008)
```

---Files--------------------------------
0001-Optimize-Array-flatten-and-flatten-for-already-fl.diff.txt (2.79 KB)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next