[#109403] [Ruby master Feature#18951] Object#with to set and restore attributes around a block — "byroot (Jean Boussier)" <noreply@...>

Issue #18951 has been reported by byroot (Jean Boussier).

23 messages 2022/08/01

[#109423] [Ruby master Misc#18954] DevMeeting-2022-08-18 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18954 has been reported by mame (Yusuke Endoh).

10 messages 2022/08/04

[#109449] [Ruby master Feature#18959] Handle gracefully nil kwargs eg. **nil — "LevLukomskyi (Lev Lukomskyi)" <noreply@...>

Issue #18959 has been reported by LevLukomskyi (Lev Lukomskyi).

27 messages 2022/08/08

[#109456] [Ruby master Bug#18960] Module#using raises RuntimeError when called at toplevel from wrapped script — "shioyama (Chris Salzberg)" <noreply@...>

Issue #18960 has been reported by shioyama (Chris Salzberg).

15 messages 2022/08/09

[#109550] [Ruby master Feature#18965] Further Thread::Queue improvements — "byroot (Jean Boussier)" <noreply@...>

Issue #18965 has been reported by byroot (Jean Boussier).

14 messages 2022/08/18

[#109575] [Ruby master Bug#18967] Segmentation fault in stackprof with Ruby 2.7.6 — "RubyBugs (A Nonymous)" <noreply@...>

Issue #18967 has been reported by RubyBugs (A Nonymous).

10 messages 2022/08/19

[#109598] [Ruby master Bug#18970] CRuby adds an invalid header to bin/bundle (and others) which makes it unusable in Bash on Windows — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18970 has been reported by Eregon (Benoit Daloze).

17 messages 2022/08/20

[#109645] [Ruby master Bug#18973] Kernel#sprintf: %c allows codepoints above 127 for 7-bits ASCII encoding — "andrykonchin (Andrew Konchin)" <noreply@...>

Issue #18973 has been reported by andrykonchin (Andrew Konchin).

8 messages 2022/08/23

[#109689] [Ruby master Misc#18977] DevMeeting-2022-09-22 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18977 has been reported by mame (Yusuke Endoh).

16 messages 2022/08/25

[#109707] [Ruby master Feature#18980] Re-reconsider numbered parameters: `it` as a default block parameter — "k0kubun (Takashi Kokubun)" <noreply@...>

Issue #18980 has been reported by k0kubun (Takashi Kokubun).

40 messages 2022/08/26

[#109756] [Ruby master Feature#18982] Add an `exception: false` argument for Queue#push, Queue#pop, SizedQueue#push and SizedQueue#pop — "byroot (Jean Boussier)" <noreply@...>

Issue #18982 has been reported by byroot (Jean Boussier).

11 messages 2022/08/29

[#109773] [Ruby master Misc#18984] Doc for Range#size for Float/Rational does not make sense — "masasakano (Masa Sakano)" <noreply@...>

Issue #18984 has been reported by masasakano (Masa Sakano).

7 messages 2022/08/29

[ruby-core:109410] [Ruby master Feature#18885] Long lived fork advisory API (potential Copy on Write optimizations)

From: "byroot (Jean Boussier)" <noreply@...>
Date: 2022-08-02 09:03:32 UTC
List: ruby-core #109410
Issue #18885 has been updated by byroot (Jean Boussier).


@mame 

> it would be nice to be properly confirm this advantage before introduction.

https://bugs.ruby-lang.org/issues/11164 is an example of how bad things can go without nakayoshi_fork (or similar). I can get production data from some of our apps if you wish, but the effect is going to be very app dependent, so I'm not sure if it's very relevant. You can craft demo-apps for which memory usage totally blow up if you don't promote objects to the old generation before forking.

> How is it integrated with Process._fork?

Since I wrote this, I'm now convinced that it shouldn't be a `fork` argument, but a distinct API on `RubyVM`, since if you fork multiple workers you don't want to run them things again as it could invalidate CoW in previous workers. 

So IMO `RubyVM.prepare_for_long_lived_fork` is the proper API (aside from the name which I doubt is desirable).

Also please note that several of the proposed optimization can only be done from inside Ruby, so decorating like nakayoshi_fork does is not an option. Hence the main reason of this proposal.

----------------------------------------
Feature #18885: Long lived fork advisory API (potential Copy on Write optimizations)
https://bugs.ruby-lang.org/issues/18885#change-98559

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

It is rather common to deploy Ruby with forking servers. A process first load the code and data of the application, and then forks a number of workers to handle an incoming workload.
The advantage is that each child has its own GVL and its own GC, so they don't impact each others latency. The downside however is that in uses more memory than using threads or fibers.
That increased memory usage is largely mitigated by Copy on Write, but it's far from perfect. Over time various memory regions will be written into and unshared.

The classic example is the objects generation, young objects must be promoted to the old generation before forking, otherwise they'll get invalidated on the next GC run. That's what https://github.com/ko1/nakayoshi_fork addresses.

But there are other sources of CoW invalidation that could be addressed by MRI if it had a clear notification when it needs to be done.

### Proposal

MRI could assume than any `fork` may be long lived and perform all the optimizations it can then, but It may be preferable to have a dedicated API for that. e.g.

  - `Process.fork(long_lived: true)`
  - `Process.long_lived_fork`
  - `RubyVM.prepare_for_long_lived_fork`

### Potential optimizations

`nakayoshi_fork` already does the following:

  - Do a major GC run to get rid of as many dangling objects as possible.
  - Promote all surviving objects to the highest generation
  - Compact the heap.

But it would be much simpler to do this from inside the VM rather than do cryptic things such as `4.times { GC.start }` from the Ruby side.

Also after discussing with @jhawthorn, @tenderlovemaking and @alanwu, we believe this would open the door to several other CoW optimizations:

#### Precompute inline caches

Even though we don't have hard data to prove it, we are convinced that a big source of CoW invalidation are inline caches. Most ISeq are never invoked during initialization, so child processed are forked with mostly cold caches. As a result the first time a method is executed in the child, many memory pages holding ISeq are invalidated as caches get updated.

We think MRI could try to precompute these caches before forking children. Constant cache particularly should be resolvable statically (somewhat related https://github.com/ruby/ruby/pull/6049).

Method caches are harder to resolve statically, but we can probably apply some heuristics to at least reduce the cache misses.

#### Copy on Write aware GC

We could also keep some metadata about which memory pages are shared, or even introduce a "permanent" generation. [The Instagram engineering team introduced something like that in Python](https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf) ([ticket](https://bugs.python.org/issue31558), [PR](https://github.com/python/cpython/pull/3705)).

That makes the GC aware of which objects live on a shared page. With this information the GC can decide to no free dangling objects leaving on these pages, not to compact these pages, etc.

#### Scan the coderange of all strings

Strings have a lazily computed `coderange` attribute in their flags. So if a string is allocated at boot, but only used after fork, its coderange may be computed and the string mutated.

Using https://github.com/ruby/ruby/pull/6076, I noticed that 58% of the strings retained at the end of the boot sequence had an `UNKNOWN` coderange.

So eagerly scanning the coderange of all strings could also improve Copy on Write performance.




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread