[ruby-core:108796] [Ruby master Feature#18819] Moving Strings between size pools

From: "eightbitraptor (Matthew Valentine-House)" <noreply@...>
Date: 2022-06-07 18:07:35 UTC
List: ruby-core #108796
Issue #18819 has been reported by eightbitraptor (Matthew Valentine-House).

----------------------------------------
Feature #18819: Moving Strings between size pools
https://bugs.ruby-lang.org/issues/18819

* Author: eightbitraptor (Matthew Valentine-House)
* Status: Open
* Priority: Normal
----------------------------------------
[Github PR](https://github.com/ruby/ruby/pull/5986)

## Motivation

Using GC Compaction, we can move objects around on a Ruby heap. This allows us to decrease memory fragmentation, giving more efficient memory usage and better copy-on-write performance.

Ruby currently has several heaps, each of which holds objects of a different size. Currently, compaction can only move objects within the same heap.

Some objects in Ruby can change size. For example, `String#squeeze!` and `String#<<` modify an existing String in place, shrinking or growing it respectively.

This can leave the String in a heap that is no longer the most appropriate size, which means either wasted memory (when the String has shrunk) or losing the benefits of Variable Width Allocation (when the String has grown).

In order to mitigate these problems we need a way of moving objects between heaps with different sized slots.

### Example: Growing a String

Allocating a String of 16 bytes or fewer will result in the creation of an embedded RString object in the 40 byte size pool (assuming a 64-bit architecture).

If we now use `string << "foobar"` to mutate the string, increasing its size by another 6 bytes, we can no longer embed the string in a 40 byte slot.

In this case the string gets its `NOEMBED` bit set, and new memory is allocated for the buffer in the system heap using `malloc`. This new memory is no longer adjacent to the RString object in the Ruby heap and so any locality benefits are gone.

In addition, if the original String exists in a slot with a size larger than `sizeof(RValue)` then the additional space at the end of the slot after the `ptr` to the heap buffer is unused.
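
The transition from embedded to `NOEMBED` can be observed from Ruby itself. The following is a minimal sketch, not part of the PR: it assumes the `objspace` extension is available and relies on `ObjectSpace.dump` including an `"embedded"` key only for embedded strings. The helper name `embedded?` is hypothetical, and the reported `memsize` will vary by Ruby version and build.

```ruby
require "objspace"
require "json"

# Hypothetical helper: true when ObjectSpace.dump reports the string as
# embedded, i.e. its bytes live inside the object's own heap slot.
def embedded?(str)
  JSON.parse(ObjectSpace.dump(str)).fetch("embedded", false)
end

str = "a" * 10          # String#* returns a fresh, unshared string
before = embedded?(str)

str << ("b" * 2_000)    # far larger than any slot size quoted in this issue
after = embedded?(str)

puts "embedded before growth: #{before}"
puts "embedded after growth:  #{after}"
puts "memsize after growth:   #{ObjectSpace.memsize_of(str)} bytes"
```

After the append, the buffer has to live outside the slot, so the string is no longer reported as embedded.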

### Example: Shrinking a String

Assuming a 64-bit architecture, allocating a String longer than 16 bytes will result in the creation of an embedded string in one of the larger size pools (80, 160, 320 or 640 bytes).

If we were to shrink that string so that it could fit into one of the smaller size pools (e.g. using `string.squeeze!`), then at least half of the current slot size would be unused space at the end of the string.
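
As a rough illustration (a sketch, assuming the `objspace` extension is available), the memory attributed to a string can be compared before and after shrinking it in place. The exact numbers depend on the Ruby version and build, so none are shown here; the point is only that `squeeze!` reduces the string's length without the object itself moving to a smaller slot.

```ruby
require "objspace"

str = "a" * 100
size_before = ObjectSpace.memsize_of(str)

str.squeeze!            # collapses the run of "a" into a single "a"
size_after = ObjectSpace.memsize_of(str)

puts "bytesize after squeeze!: #{str.bytesize}"
puts "memsize before: #{size_before}, after: #{size_after}"
```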

## PR Summary

This change builds on the work in [this feature to reverse the compaction cursor movement](https://bugs.ruby-lang.org/issues/18619) to allow movement of objects between size pools during GC Compaction.

The algorithm works as follows:

* During GC Compaction
    * During object movement step
        * If the object being moved is a string
            * Calculate how much space the string would require as an embedded string
            * If the embedded size fits within a size pool
                * move the object to the scan cursor position within the desired pool
            * If the embedded size is larger than any possible size pool
                * The object must be `NOEMBED`
                    * move the object to the scan cursor position within the smallest size pool
    * During reference update step
        * If the object is a string and is not embedded
            * Calculate its size as if it was embedded
            * If the object can be embedded in its current slot
                * Convert the `NOEMBED` string into an embedded string
                * Copy the string buffer into the slot
                * Free the original string buffer
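
The decision logic in the steps above can be modelled in a few lines of Ruby. This is only a toy model for illustration, not the C implementation in the PR: the slot sizes are the ones quoted earlier for a 64-bit build, `HEADER_BYTES` is an assumption inferred from the statement that 16 bytes fit in a 40 byte slot, and the method names are hypothetical.

```ruby
# Slot sizes quoted in this issue for a 64-bit build.
SIZE_POOLS = [40, 80, 160, 320, 640].freeze

# Assumption: 24 bytes of per-object overhead, inferred from
# "16 bytes fit in a 40 byte slot"; the real value is build-dependent.
HEADER_BYTES = 24

# Space the string would need if its bytes were stored in the slot.
def embedded_size(len)
  HEADER_BYTES + len
end

# Object movement step: pick the smallest pool that can embed the string,
# or fall back to the smallest pool with the buffer kept outside the slot.
def destination_pool(len)
  pool = SIZE_POOLS.find { |slot| embedded_size(len) <= slot }
  pool ? [pool, :embed] : [SIZE_POOLS.first, :noembed]
end

# Reference update step: a NOEMBED string can be re-embedded when its
# embedded size fits in the slot it now occupies.
def reembed?(len, slot_size)
  embedded_size(len) <= slot_size
end

[10, 16, 17, 600, 5_000].each do |len|
  slot, mode = destination_pool(len)
  puts "len=#{len}: slot=#{slot} mode=#{mode}"
end
```

Under these assumptions, a 17 byte string is the first that no longer fits the 40 byte pool, and anything that cannot be embedded in the largest pool is sent to the smallest pool as `NOEMBED`.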

### Notes

* Currently we only support movement of `T_STRING` objects. More objects will be added in further PRs as they're given VWA support.
* We don't attempt to re-embed shared strings, shared roots, strings that wrap C string literals (Strings with `NOFREE` set), or "fstrings" (Strings with `STR_FAKESTR` set)

## Testing movement

This PR adds some extra keys to the hash returned by `GC.compact`. These keys are `:moved_up` and `:moved_down`. Each contains a hash of object types and a count of those object types that have either been moved into a larger or a smaller size pool.

Checking this can be done by creating some fragmentation in different heaps, and forcing strings to be resized, then running compaction.

```
moveables = []
large_slots = []

n = 1500

# Ensure fragmentation in the large heap
base_slot_size = GC.stat_heap[0].fetch(:slot_size)
n.times {
  String.new(+"a" * base_slot_size).downcase
  large_slots << String.new(+"a" * base_slot_size).downcase
}

n.times {
  # strings are created as shared strings when initialized from literals
  # use downcase to force the creation of an embedded string (it calls
  # rb_str_new internally)
  moveables << String.new("a").downcase
}
moveables.each { |s| s << ("bc" * base_slot_size) }

p GC.compact.fetch(:moved_up)
```

This script outputs the following on my development machine.

```
{:T_STRING=>319}
```
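
The mirror case, `:moved_down`, could be probed in a similar way. The following is a sketch under stated assumptions rather than a script from the PR: it assumes that strings four times the base slot size land in a larger pool, it guards against Ruby builds without `GC.stat_heap` or `GC.compact`, and it falls back to an empty hash when the `:moved_down` key is absent. The method name `moved_down_counts` is hypothetical. Whether any strings actually move depends on the state of the heap, so no expected output is given.

```ruby
# Count objects that compaction moved into a smaller size pool.
# Returns an empty hash when compaction or the :moved_down key is unavailable.
def moved_down_counts(n = 1500)
  return {} unless GC.respond_to?(:stat_heap) && GC.respond_to?(:compact)

  base_slot_size = GC.stat_heap[0].fetch(:slot_size)
  shrinkables = []

  n.times do
    # Allocate and drop one large string to leave holes, and keep another.
    "a" * (base_slot_size * 4)
    shrinkables << "a" * (base_slot_size * 4)
  end

  # Shrink every kept string to a single byte, in place.
  shrinkables.each(&:squeeze!)

  stats = GC.compact
  stats.is_a?(Hash) ? stats.fetch(:moved_down, {}) : {}
end

p moved_down_counts
```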

## Future Work

This PR implements a framework for moving objects between size pools. There are two main areas to focus on following this change:

* Implementing movement for existing mutable VWA types. Currently this is just Array, as Class objects cannot be mutated and therefore cannot move between pools
* Adding support for VWA to more types, implementing object movement where appropriate



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
