[ruby-core:107913] [Ruby master Feature#18634] Variable Width Allocation: Arrays

From: "Eregon (Benoit Daloze)" <noreply@...>
Date: 2022-03-15 21:47:47 UTC
List: ruby-core #107913
Issue #18634 has been updated by Eregon (Benoit Daloze).


Is Improvement computed as Branch/master?
It seems inconsistent for these 2 example lines:
```
| p100 (ms)             | 5.53   | 6.02   | 0.92x       |
| p100 (ms)             | 5.54   | 7.03   | 1.27x       |
```
i.e. the Branch takes less time in both cases, but the improvement is <1x in one and >1x in the other.
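
For reference, both ratio conventions can be checked with a few lines of Ruby. This is only a sketch: the numbers are copied from the two rows quoted above, and the variable names are illustrative rather than anything from the ticket.

```ruby
# Values from the two p100 rows quoted above: [branch_ms, master_ms, reported_ratio].
rows = {
  "glibc"    => [5.53, 6.02, 0.92],
  "jemalloc" => [5.54, 7.03, 1.27],
}

results = rows.map do |allocator, (branch, master, reported)|
  branch_over_master = (branch / master).round(2)
  master_over_branch = (master / branch).round(2)

  # Work out which convention reproduces the reported figure.
  convention =
    if reported == branch_over_master
      "Branch/master"
    elsif reported == master_over_branch
      "master/Branch"
    else
      "neither"
    end

  [allocator, branch_over_master, master_over_branch, convention]
end

results.each do |allocator, b_over_m, m_over_b, convention|
  puts "#{allocator}: Branch/master=#{b_over_m}, master/Branch=#{m_over_b}, reported matches #{convention}"
end
```

Under these inputs the glibc row matches Branch/master while the jemalloc row matches master/Branch, which is the inconsistency in question.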

----------------------------------------
Feature #18634: Variable Width Allocation: Arrays
https://bugs.ruby-lang.org/issues/18634#change-96849

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
* Priority: Normal
----------------------------------------
# GitHub PR: https://github.com/ruby/ruby/pull/5660

# Feature description

This patch changes arrays to allocate through Variable Width Allocation.

Similar to strings (implemented in ticket [#18239](https://bugs.ruby-lang.org/issues/18239)), arrays allocated through Variable Width Allocation are embedded, meaning the contents of the array directly follow the array object headers.

When an array is resized, we fall back to allocating memory through the malloc heap. If the array was initially allocated in a larger slot, this results in wasted memory. However, the benchmarks below show that this wastage does not cause memory usage to increase significantly.
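
One rough way to observe the difference between embedded and malloc-backed arrays from Ruby is `ObjectSpace.memsize_of` from the `objspace` standard library. The following is a sketch and not part of the patch: it assumes CRuby, the variable names are illustrative, and the exact byte counts depend on the Ruby version and build.

```ruby
require "objspace"

# Illustrative only: reported sizes vary between Ruby versions and builds.
small = Array.new(3)      # short arrays are typically embedded in the object slot
large = Array.new(1_000)  # too long to embed, so the contents need a separate buffer

small_size = ObjectSpace.memsize_of(small)
large_size = ObjectSpace.memsize_of(large)

# Growing an array far past its initial capacity forces a separate buffer.
grown = Array.new(3)
before = ObjectSpace.memsize_of(grown)
1_000.times { |n| grown << n }
after = ObjectSpace.memsize_of(grown)

puts "small: #{small_size} bytes, large: #{large_size} bytes"
puts "grown: #{before} -> #{after} bytes"
```

Running this on both master and the branch with arrays of around 10 elements would be one way to see which lengths end up embedded, though `memsize_of` is only a hint and not an exact accounting.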

# What's next

We're working on implementing cross size pool compaction for Variable Width Allocation. This will allow us to both downsize objects (to save memory) and upsize objects (to improve cache performance).

We're going to continue implementing more types on Variable Width Allocation, such as Objects, Hashes, and ISeqs.

# Benchmark setup

Benchmarking was done on a bare-metal Ubuntu machine on AWS. All benchmark results use glibc unless jemalloc is explicitly specified.

```
$ uname -a
Linux 5.13.0-1014-aws #15~20.04.1-Ubuntu SMP Thu Feb 10 17:55:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
```

glibc version:

```
$ ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.2) 2.31
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
```

jemalloc version:

```
$ apt list --installed | grep jemalloc

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libjemalloc-dev/focal,now 5.2.1-1ubuntu1 amd64 [installed]
libjemalloc2/focal,now 5.2.1-1ubuntu1 amd64 [installed,automatic]
```

To measure memory usage over time, the [mstat tool](https://github.com/bpowers/mstat) was used.

master was benchmarked on commit [bec492c77e](https://github.com/ruby/ruby/commit/bec492c77ed7659cafd2447cd042acde489c8d28). The branch was rebased on top of the same commit.

## railsbench

For railsbench, we ran the [railsbench benchmark](https://github.com/k0kubun/railsbench/blob/master/bin/bench). For both the performance and memory benchmarks, 25 runs were conducted for each combination (branch + glibc, master + glibc, branch + jemalloc, master + jemalloc).

For both glibc and jemalloc allocators, there is not a significant change in RPS, response times, or max memory usage. We can see in the RSS over time graph that the memory behavior of the branch and master is very similar.
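
For readers unfamiliar with the p50/p90/p99/p100 rows in the tables below, here is a minimal sketch of a nearest-rank percentile. It is an assumption for illustration: the ticket does not say which percentile definition railsbench uses, the `percentile` helper is hypothetical, and the sample times are made up.

```ruby
# Nearest-rank percentile over a list of response times (ms).
# Assumption: this is one common definition, not necessarily the one railsbench uses.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.length / 100.0).ceil # 1-based rank into the sorted list
  sorted[[rank, 1].max - 1]
end

# Made-up response times, for illustration only.
times_ms = [1.1, 1.2, 1.2, 1.3, 1.2, 1.7, 1.2, 1.3, 1.1, 5.5]

summary = [50, 90, 99, 100].to_h { |pct| ["p#{pct}", percentile(times_ms, pct)] }
puts summary
```

Under this definition p100 is simply the slowest single request, so it is likely to vary more between runs than the lower percentiles.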

### glibc

```
+-----------------------+--------+--------+-------------+
|                       | Branch | master | Improvement |
+-----------------------+--------+--------+-------------+
| RPS                   | 810.38 | 809.50 | 1.00x       |
| p50 (ms)              | 1.20   | 1.20   | 1.00x       |
| p90 (ms)              | 1.32   | 1.32   | 1.00x       |
| p99 (ms)              | 1.75   | 1.72   | 0.98x       |
| p100 (ms)             | 5.53   | 6.02   | 0.92x       |
| Max memory usage (MB) | 90.19  | 90.45  | 1.00x       |
+-----------------------+--------+--------+-------------+
```

![](https://user-images.githubusercontent.com/15860699/157101671-98568350-8960-4a33-8e55-856ab32a4bc1.png)


### jemalloc

```
+-----------------------+--------+--------+-------------+
|                       | Branch | master | Improvement |
+-----------------------+--------+--------+-------------+
| RPS                   | 834.04 | 840.81 | 0.99x       |
| p50 (ms)              | 1.18   | 1.17   | 0.99x       |
| p90 (ms)              | 1.27   | 1.26   | 0.99x       |
| p99 (ms)              | 1.69   | 1.65   | 0.98x       |
| p100 (ms)             | 5.54   | 7.03   | 1.27x       |
| Max memory usage (MB) | 88.50  | 87.48  | 0.99x       |
+-----------------------+--------+--------+-------------+
```

![](https://user-images.githubusercontent.com/15860699/157101712-27d3e02f-4611-45b6-9c8b-c5983c301817.png)

## discourse

Discourse was benchmarked through the [`script/bench.rb`](https://github.com/discourse/discourse/blob/main/script/bench.rb) benchmarking script. The response times for the `home` endpoint and RSS memory usage are shown below.

We see a slight increase in memory usage (5%) with glibc and an insignificant memory usage increase with jemalloc. We don't see big differences in response times.

### glibc

```
+-----------+--------+--------+-------------+
|           | Branch | master | Improvement |
+-----------+--------+--------+-------------+
| p50 (ms)  | 75     | 76     | 1.01x       |
| p90 (ms)  | 88     | 90     | 1.02x       |
| p99 (ms)  | 248    | 261    | 1.05x       |
| RSS (MB)  | 364.48 | 383.80 | 1.05x       |
+-----------+--------+--------+-------------+
```

### jemalloc

```
+-----------+--------+--------+-------------+
|           | Branch | master | Improvement |
+-----------+--------+--------+-------------+
| p50 (ms)  | 73     | 73     | 1.00x       |
| p90 (ms)  | 84     | 86     | 1.02x       |
| p99 (ms)  | 241    | 242    | 1.00x       |
| RSS (MB)  | 347.56 | 349.86 | 1.01x       |
+-----------+--------+--------+-------------+
```

## rdoc generation

In rdoc generation, we see a small improvement in performance with glibc and no change in performance with jemalloc. We see a small max memory usage increase for both glibc and jemalloc. However, the RSS over time graph shows that, except for the very end, the branch actually has lower memory usage than master.

### glibc

```
+-----------------------+--------+--------+-------------+
|                       | Branch | master | Improvement |
+-----------------------+--------+--------+-------------+
| Time (s)              | 17.81  | 18.11  | 1.02x       |
| Max memory usage (MB) | 287.74 | 283.24 | 0.98x       |
+-----------------------+--------+--------+-------------+
```

![](https://user-images.githubusercontent.com/15860699/157101976-805bde67-897e-473e-a2b7-16cdba7d21e4.png)

### jemalloc

```
+-----------------------+--------+--------+-------------+
|                       | Branch | master | Improvement |
+-----------------------+--------+--------+-------------+
| Time (s)              | 17.59  | 17.46  | 0.99x       |
| Max memory usage (MB) | 289.92 | 277.30 | 0.96x       |
+-----------------------+--------+--------+-------------+
```

![](https://user-images.githubusercontent.com/15860699/157102010-ad5cd8b9-91ab-4058-8e1b-35bdf2af47a4.png)

## optcarrot

We don't see a change in performance in optcarrot.

```
+------+--------+--------+-------------+
|      | Branch | master | Improvement |
+------+--------+--------+-------------+
| FPS  | 43.10  | 43.25  | 1.00x       |
+------+--------+--------+-------------+
```

## Liquid benchmarks

We don't see a big change in performance in liquid benchmarks.

```
+----------------------+--------+--------+-------------+
|                      | Branch | master | Improvement |
+----------------------+--------+--------+-------------+
| Parse (i/s)          | 39.57  | 40.43  | 0.98x       |
| Render (i/s)         | 129.78 | 130.22 | 1.00x       |
| Parse & Render (i/s) | 28.43  | 28.89  | 0.98x       |
+----------------------+--------+--------+-------------+
```

## Microbenchmarks

These microbenchmarks are very favourable for VWA since the arrays created have a length of 10: with VWA they are embedded, whereas on master their contents are allocated on the malloc heap.

```
+-------------+--------+--------+-------------+
|             | Branch | master | Improvement |
+-------------+--------+--------+-------------+
| Array#first | 2.282k | 2.014k | 1.13x       |
| Array#last  | 2.095k | 2.092k | 1.00x       |
| Array#[0]=  | 2.232k | 2.079k | 1.07x       |
| Array#[-1]= | 2.181k | 2.064k | 1.06x       |
| Array#each  | 319.92 | 314.22 | 1.02x       |
+-------------+--------+--------+-------------+
```

{{collapse(Benchmark source code)


```ruby
require "bundler/inline"
gemfile do
  source "https://rubygems.org"
  gem "benchmark-ips"
end

COUNT = 10_000

arrays = []

COUNT.times do
  arrays << Array.new(10)
end

Benchmark.ips do |x|
  x.report("Array#first") do |times|
    i = 0
    while i < times
      COUNT.times { |i| arrays[i].first }
      i += 1
    end
  end

  x.report("Array#last") do |times|
    i = 0
    while i < times
      COUNT.times { |i| arrays[i].last }
      i += 1
    end
  end

  x.report("Array#[0]=") do |times|
    i = 0
    while i < times
      COUNT.times { |i| arrays[i][0] = 0 }
      i += 1
    end
  end

  x.report("Array#[-1]=") do |times|
    i = 0
    while i < times
      COUNT.times { |i| arrays[i][-1] = 9 }
      i += 1
    end
  end

  x.report("Array#each") do |times|
    i = 0
    while i < times
      COUNT.times { |i| arrays[i].each { |x| } }
      i += 1
    end
  end
end
```
}}



